[02:28:18] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[06:28:18] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[07:11:33] Data-Engineering, Data Products, DBA, Patch-For-Review, Schema-change: Design and merge the new tables of file tables - https://phabricator.wikimedia.org/T368113#10389321 (Bugreporter) >>! In T368113#10375890, @gerritbot wrote: > Change #1100125 had a related patch set uploaded (by Ladsgroup;...
[07:41:31] Data-Engineering, Data Products, DBA, Patch-For-Review, Schema-change: Design and merge the new tables of file tables - https://phabricator.wikimedia.org/T368113#10389334 (Ladsgroup) I am aware. That's why the patch is WIP
[08:51:57] (PS1) Joal: Fix HdfsXMLFsImageConverter block reading [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460
[09:04:00] (CR) Gmodena: [C:+1] "LGTM. Do you have a phab task to pin it to?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460 (owner: Joal)
[09:09:45] Data-Engineering: Fix `hdfs_usage` data size columns - https://phabricator.wikimedia.org/T381746 (JAllemandou) NEW
[09:10:21] (PS2) Joal: Fix HdfsXMLFsImageConverter block reading [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460 (https://phabricator.wikimedia.org/T381746)
[09:11:04] (CR) Joal: "Done!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460 (https://phabricator.wikimedia.org/T381746) (owner: Joal)
[09:11:55] (CR) Aqu: [C:+1] "Nice catch." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101460 (https://phabricator.wikimedia.org/T381746) (owner: Joal)
[09:18:14] (PS3) Joal: Update Spark to version 3.5.3 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[09:18:46] (CR) CI reject: [V:-1] Update Spark to version 3.5.3 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[09:49:36] Data-Engineering, Data-Platform, Dumps-Generation, Data-Platform-SRE (2024.11.30 - 2024.12.20): Hide autoblocks from the globalblocks table database dump - https://phabricator.wikimedia.org/T376726#10389585 (BTullis) Open→Resolved I found the stray files that were being copied. There...
[10:24:27] Data-Engineering, CampaignEvents, Data Products, Campaign-Registration, and 2 others: Add "event_is_test_event" field to "campaign_events" table - https://phabricator.wikimedia.org/T381759 (MHorsey-WMF) NEW
[10:28:18] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[10:53:15] Data-Engineering (Q2 2024 October 1st - December 31th), Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#10389835 (Marostegui) The change went throught the sanitarium master already, so it is being executed on wikireplicas...
[13:33:23] (PS4) Joal: Update Spark to version 3.5.3 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[13:33:57] (CR) CI reject: [V:-1] Update Spark to version 3.5.3 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[13:36:32] (CR) Joal: "I have fixed almost everything. 2 things left:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[13:40:09] (PS5) Joal: Update Spark to version 3.5.3 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[13:51:24] (CR) CI reject: [V:-1] Update Spark to version 3.5.3 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[14:28:19] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[15:21:20] (CR) Xcollazo: "Left some minor comments." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1093393 (https://phabricator.wikimedia.org/T338057) (owner: Btullis)
[15:46:18] !log kubectl cordon dse-k8s-worker1005.eqiad.wmnet
[15:46:20] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:07:17] !log kubectl uncordon dse-k8s-worker1005.eqiad.wmnet
[16:07:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:10:02] Data-Engineering (Q2 2024 October 1st - December 31th): [SPIKE] Learn and document how to use Flink-CDC from MediaWiki MariaDB locally - https://phabricator.wikimedia.org/T373144#10391046 (Ottomata) > It seems to be reliable. That is good news! > Iceberg Compatibility Mode, this feature does not work. What...
[18:28:19] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[19:45:16] (PS1) Milimetric: Fix bad string replace that causes slower pull [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101585
[20:06:55] (PS2) Xcollazo: Fix bad string replace that causes slower pull [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101585 (https://phabricator.wikimedia.org/T381615) (owner: Milimetric)
[20:07:54] (CR) Xcollazo: [C:+2] "LGTM!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101585 (https://phabricator.wikimedia.org/T381615) (owner: Milimetric)
[20:15:43] (PS5) Mforns: Update MediaWiki History to support Temp Accounts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230)
[20:20:04] (Merged) jenkins-bot: Fix bad string replace that causes slower pull [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1101585 (https://phabricator.wikimedia.org/T381615) (owner: Milimetric)
[20:24:03] (CR) CI reject: [V:-1] Update MediaWiki History to support Temp Accounts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[20:48:19] RESOLVED: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[20:55:32] Data-Engineering, Research, Data-Platform-SRE (2024.11.30 - 2024.12.20), Discovery-Search (Current work): Low available space on Hadoop / HDFS - https://phabricator.wikimedia.org/T381707#10391588 (xcollazo) The dumps 2.0 intermediate table is currently hoarding these many bytes: ` xcollazo@stat10...
[22:04:08] Data-Engineering, Data-Platform, Data-Platform-SRE, DC-Ops: Detect hardware failures/automatically create tickets for DC Ops - https://phabricator.wikimedia.org/T367790#10391960 (Ottomata)
[22:05:56] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10391965 (Scott_French) a:Scott_French
[22:07:14] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10391970 (Scott_French)
[22:11:08] Data-Engineering, Data-Platform, DPE Temporary Accounts, Product-Analytics, and 3 others: Ensure performer attributes in schemas clarify if the user is a temporary account - https://phabricator.wikimedia.org/T374940#10391984 (VirginiaPoundstone)
[22:11:30] Data-Engineering (Q2 2024 October 1st - December 31th), Data Products, Data-Platform, Movement-Insights: Modify the automated traffic detection to be applied at the project family level - https://phabricator.wikimedia.org/T377257#10391986 (VirginiaPoundstone)
[22:11:53] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10391996 (Scott_French) @odimitrijevic @Milimetric @Ahoelzl @Ottomata - Could one of you please approve access to `ana...
[22:14:04] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10392008 (Ottomata) Approved!
[22:15:14] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10392011 (Ottomata) > Also, if WMDE staff are similarly covered by the recent streamlining in T370424, it would be gre...
[22:17:38] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10392013 (Scott_French)
[22:21:28] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10392028 (Scott_French) Great, thank you very much @Ottomata.
[22:38:05] Data-Engineering, MediaWiki-Engineering, MediaWiki-extensions-WikimediaEvents, MediaWiki-Platform-Team, and 6 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10392071 (lmata)
[22:40:52] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for Suzanne Wood (WMDE) - https://phabricator.wikimedia.org/T380994#10392073 (Scott_French) Stalled→Resolved Alright, this should now be complete, though the underlying chang...
[22:53:58] Data-Engineering, Data-Platform-SRE, SRE: Data Platform access streamlining for WMDE staff - https://phabricator.wikimedia.org/T381824 (Scott_French) NEW
[22:54:48] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, SRE: Streamline Data Platform access approvals for WMF staff - https://phabricator.wikimedia.org/T370424#10392108 (Scott_French) See T381824 for potentially extending the same streamlining to WMDE staff.