[03:01:07] Data-Engineering, Data-Platform-SRE, SRE: Data Platform access streamlining for WMDE staff - https://phabricator.wikimedia.org/T381824#10392465 (Dzahn) One thing to answer here would be how you would know who actually is WMDE staff. There used to be a public page that lists them but then that stopped...
[04:00:58] (CR) Joal: [C:+2] "Merging for later deploy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460 (https://phabricator.wikimedia.org/T381746) (owner: Joal)
[04:12:19] (Merged) jenkins-bot: Fix HdfsXMLFsImageConverter block reading [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460 (https://phabricator.wikimedia.org/T381746) (owner: Joal)
[05:10:12] (PS1) Joal: Update changelog.md to version 0.2.54 for deploy [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101617
[05:12:55] (CR) Joal: [V:+2 C:+2] "Merging for deploy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101617 (owner: Joal)
[05:14:33] !log Releasing new refinery-source v0.2.54 to archiva
[05:14:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[05:15:09] Starting build #23 for job analytics-refinery-maven-release
[05:25:16] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE (2024.11.30 - 2024.12.20), Patch-For-Review: Upgrade Spark to a version with long term Iceberg support, and with fixes to support Dumps 2.0 - https://phabricator.wikimedia.org/T338057#10392519 (JAllemandou) I have managed to fi...
[05:37:17] Project analytics-refinery-maven-release build #23: SUCCESS in 22 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/23/
[07:57:56] Data-Engineering, Data-Platform-SRE, SRE: Data Platform access streamlining for WMDE staff - https://phabricator.wikimedia.org/T381824#10392591 (MoritzMuehlenhoff) >>!
In T381824#10392465, @Dzahn wrote: > One thing to answer here would be how you would know who actually is WMDE staff. There used to b...
[08:16:01] Data-Engineering, Research, Data-Platform-SRE (2024.11.30 - 2024.12.20), Discovery-Search (Current work): Low available space on Hadoop / HDFS - https://phabricator.wikimedia.org/T381707#10392601 (brouberol) {F57793896} Beautiful! We reclaimed about 8% of free space!
[08:33:35] Data-Engineering (Q2 2024 October 1st - December 31th), Wikidata, Multi-Content-Revisions (Deployment), Patch-For-Review: MCR schema migration stage 4: Migrate External Store URLs (wmf production) - https://phabricator.wikimedia.org/T183490#10392628 (Marostegui) Oh wow, nice one!
[08:33:54] Data-Engineering (Q2 2024 October 1st - December 31th), Wikidata, Multi-Content-Revisions (Deployment), Patch-For-Review: MCR schema migration stage 4: Migrate External Store URLs (wmf production) - https://phabricator.wikimedia.org/T183490#10392629 (Marostegui) Oh wow, nice one!
[08:43:33] Analytics-Canonical-Data, Movement-Insights: Update mobile domain derivation code to match new canonical version - https://phabricator.wikimedia.org/T353300#10392666 (nshahquinn-wmf) Open→Resolved a:Hghani→nshahquinn-wmf Changes merged and deployed.
[08:55:27] (PS1) Joal: Add HQL scripts to backfill actor tables [analytics/refinery] - https://gerrit.wikimedia.org/r/1101807 (https://phabricator.wikimedia.org/T378852)
[08:56:18] (CR) Joal: "Manually tested on the cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/1101807 (https://phabricator.wikimedia.org/T378852) (owner: Joal)
[09:35:53] (PS2) Joal: Add HQL scripts to backfill actor tables [analytics/refinery] - https://gerrit.wikimedia.org/r/1101807 (https://phabricator.wikimedia.org/T378852)
[09:37:55] (CR) Aqu: [C:+1] Add HQL scripts to backfill actor tables [analytics/refinery] - https://gerrit.wikimedia.org/r/1101807 (https://phabricator.wikimedia.org/T378852) (owner: Joal)
[09:38:33] (CR) Joal: [V:+2 C:+2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/1101807 (https://phabricator.wikimedia.org/T378852) (owner: Joal)
[09:40:59] !log Deploying refinery for backfill using scap
[09:41:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:46:21] !log Deploying refinery onto HDFS for backfill
[09:46:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:55:03] !log rerun latest hdfs_usage job to pick up correct data sizes
[09:55:05] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:14:52] Data-Engineering, Growth-Team, MediaWiki-Engineering, MediaWiki-extensions-WikimediaEvents, and 7 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10392903 (Sgs)
[15:24:54] (PS1) Peter Fischer: Rewrite MediawikiDumper partitioning implementation [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101892
[15:33:47] (CR) CI reject: [V:-1] Rewrite MediawikiDumper partitioning implementation [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101892 (owner: Peter Fischer)
[16:14:10] Data-Engineering,
Research, Data-Platform-SRE (2024.11.30 - 2024.12.20), Discovery-Search (Current work): Low available space on Hadoop / HDFS - https://phabricator.wikimedia.org/T381707#10394226 (Gehel) https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/941 has bee...
[17:03:54] !log restarting eventgate-analytics to pick up stream config changes for T381322
[17:03:58] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:03:58] T381322: Rename Flink application and streams to match prod conventions - https://phabricator.wikimedia.org/T381322
[17:41:39] Data-Engineering: Fix `hdfs_usage` data size columns - https://phabricator.wikimedia.org/T381746#10394649 (JAllemandou) This is fixed for snapshot '2024-12-02' onward: ` SELECT blocks_size FROM wmf.hdfs_usage WHERE snapshot = '2024-12-02' AND type = 'DIRECTORY' AND path = '/wmf/data/wmf/webrequest/webreq...
[17:42:25] Data-Engineering (Q2 2024 October 1st - December 31th): Fix `hdfs_usage` data size columns - https://phabricator.wikimedia.org/T381746#10394659 (JAllemandou)
[17:42:27] Data-Engineering, Data-Platform-SRE, SRE: Data Platform access streamlining for WMDE staff - https://phabricator.wikimedia.org/T381824#10394660 (odimitrijevic) Yes, I approve streamlining the access to WMDE staff in the same way that we do for WMF staff as proposed in https://phabricator.wikimedia.or...
[17:43:00] Data-Engineering (Q2 2024 October 1st - December 31th): Fix `hdfs_usage` data size columns - https://phabricator.wikimedia.org/T381746#10394666 (JAllemandou) a:JAllemandou
[17:59:55] Data-Engineering (Q2 2024 October 1st - December 31th), Data Products, Data-Platform, Movement-Insights, Patch-For-Review: Backfill and recalculate unique devices data from July 2024 to present - https://phabricator.wikimedia.org/T378852#10394762 (Ahoelzl) a:odimitrijevic→JAllemandou
[18:01:20] Data-Engineering (Q2 2024 October 1st - December 31th): Airflow skips canary-event tasks - https://phabricator.wikimedia.org/T380836#10394775 (Ahoelzl) a:Antoine_Quhen
[18:21:21] Data-Engineering, CampaignEvents, Data Products, Campaign-Registration, and 2 others: Add "event_is_test_event" field to "campaign_events" table - https://phabricator.wikimedia.org/T381759#10394820 (ifried) Open→In progress
[19:11:07] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE (2024.11.30 - 2024.12.20), Patch-For-Review: Deploy the HDFS synchronizer (blunderbuss) service to the dse-k8s cluster - https://phabricator.wikimedia.org/T371994#10394935 (amastilovic)
[19:15:15] Data-Engineering, EventStreams, Event-Platform: EventStreams should deserialize Kafka key - https://phabricator.wikimedia.org/T353650#10394943 (Ottomata) →Duplicate dup:T373689
[19:15:17] Data-Engineering, EventStreams, Event-Platform: EventStreams: kafka key should be serialized as a string - https://phabricator.wikimedia.org/T373689#10394945 (Ottomata)
[19:17:02] Data-Engineering, Event-Platform: [Event Platform] eventutilities-python should convert pyflink Instants to python DateTimes - https://phabricator.wikimedia.org/T349640#10394953 (Ottomata) Open→Resolved a:Ottomata
[19:44:25] (PS1) Mforns: Optimize Commons Impact Metrics calculations by using digests [analytics/refinery] - https://gerrit.wikimedia.org/r/1101930
(https://phabricator.wikimedia.org/T381799)
[19:45:44] (PS2) Mforns: Optimize Commons Impact Metrics calculations by using digests [analytics/refinery] - https://gerrit.wikimedia.org/r/1101930 (https://phabricator.wikimedia.org/T381799)
[19:55:00] (CR) Milimetric: [V:+2 C:+2] Optimize Commons Impact Metrics calculations by using digests [analytics/refinery] - https://gerrit.wikimedia.org/r/1101930 (https://phabricator.wikimedia.org/T381799) (owner: Mforns)
[20:07:57] Data-Engineering, Research, Data-Platform-SRE (2024.11.30 - 2024.12.20), Discovery-Search (Current work): Low available space on Hadoop / HDFS - https://phabricator.wikimedia.org/T381707#10395153 (Gehel) p:Triage→High
[20:10:52] Data-Engineering, Research, Data-Platform-SRE (2024.11.30 - 2024.12.20), Discovery-Search (Current work): Low available space on Hadoop / HDFS - https://phabricator.wikimedia.org/T381707#10395156 (Gehel) The current rate of capacity consumption seems to be around 10% / month since October 1. If t...
[20:31:42] !log starting deployment of refinery to fix Commons Impact Metrics job
[20:31:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:50:29] !log finished deployment of refinery to fix Commons Impact Metrics job
[20:50:31] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:56:19] !log deployed airflow-analytics to fix the commons impact metrics pipeline
[20:56:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:00:25] !log re-ran commons_impact_metrics_monthly 2024-11 and cleared dependent DAGs which had timed out
[21:00:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[22:14:32] Data-Engineering (Q2 2024 October 1st - December 31th), Data Products, Data-Platform, Movement-Insights, Patch-For-Review: Backfill and recalculate unique devices data from July 2024 to present - https://phabricator.wikimedia.org/T378852#10395498 (Ahoelzl) The 6 month retention period will be...