[02:28:15] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[06:28:15] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[07:29:23] (CR) Joal: [C:+2] Remove dead code, doc and artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1092879 (owner: Joal)
[07:29:27] (CR) Joal: [V:+2] Remove dead code, doc and artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1092879 (owner: Joal)
[07:29:31] (CR) Joal: [V:+2 C:+2] Remove dead code, doc and artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1092879 (owner: Joal)
[07:32:52] !log Deploying refinery using scap
[07:33:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:33:58] With a lot less artifacts to download, deployment time has dropped significantly :)
[07:37:18] !log deploy refinery onto HDFS
[07:38:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:14:09] Data-Engineering, SRE, SRE-Access-Requests, Data-Platform-SRE (2024.11.09 - 2024.11.29): Requesting access to analytics-privatedata-users group, sql_lab role, Kerberos Principal for Khantstop - https://phabricator.wikimedia.org/T379303#10356459 (Gehel) p:Triage→High
[09:23:50] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE (2024.11.09 - 2024.11.29), Patch-For-Review: Upgrade Spark to a version with long term Iceberg support, and with fixes to support Dumps 2.0 - https://phabricator.wikimedia.org/T338057#10356492 (JAllemandou) In taking over the w...
[09:33:52] Data-Engineering: Airflow has skipped some canary-event tasks when the Scheduler was failing - https://phabricator.wikimedia.org/T380836 (JAllemandou) NEW
[10:28:15] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[11:16:35] Data-Engineering: Reduce `refine_to_hive_hourly` airflow task number - https://phabricator.wikimedia.org/T380856 (JAllemandou) NEW
[12:04:38] Data-Engineering, Data-Platform-SRE, Epic: Upgrade Hadoop to version 3.3.x and Hive to version 3.1.x - https://phabricator.wikimedia.org/T379385#10357179 (BTullis) A point of note is that Hive 3.x has now been classified as EOL as from 2024/10/08 - https://hive.apache.org/general/downloads/ {F5774957...
[13:25:25] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform: [Event Platform] Instrument EventBus with prometheus MW Statslib - https://phabricator.wikimedia.org/T363587#10357493 (hashar) Resolved→Open >>! In T363587#10356835, @gerritbot wrote: > Change #106...
[13:43:50] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform: [Event Platform] Instrument EventBus with prometheus MW Statslib - https://phabricator.wikimedia.org/T363587#10357613 (gmodena) >>! In T363587#10357493, @hashar wrote: > I have reverted that patch (https:/...
[13:49:21] Data-Engineering, Data-Platform-SRE (2024.11.09 - 2024.11.29): Requesting access to analytics-privatedata-users group, sql_lab role, Kerberos Principal for Khantstop - https://phabricator.wikimedia.org/T379303#10357640 (elukey) I am removing the SRE tag on this, Data Platform SREs are the right target fo...
[13:58:43] Data-Engineering, Data-Platform-SRE (2024.11.09 - 2024.11.29): Requesting access to analytics-privatedata-users group, sql_lab role, Kerberos Principal for Khantstop - https://phabricator.wikimedia.org/T379303#10357718 (BTullis) @Khantstop - could you possibly paste some command output or a screenshot, p...
[14:05:34] Data-Engineering (Q2 2024 October 1st - December 31th), Discovery-Search, Dumps 2.0, Data-Platform-SRE (2024.11.09 - 2024.11.29), Patch-For-Review: Add relevant kafka clusters to defined airflow connections in puppet - https://phabricator.wikimedia.org/T379676#10357744 (xcollazo)
[14:28:16] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[16:28:54] (PS1) Jennifer Ebe: Edit Geoeditors Daily Monthly to support Temp Account Changes [analytics/refinery] - https://gerrit.wikimedia.org/r/1098083 (https://phabricator.wikimedia.org/T379728)
[16:31:00] (PS2) Jennifer Ebe: Edit Geoeditors Daily Monthly to support Temp Account Changes [analytics/refinery] - https://gerrit.wikimedia.org/r/1098083 (https://phabricator.wikimedia.org/T379728)
[16:46:05] Data-Engineering, Research: Incremental HTML dataset to support "Who are moderators" SDS 1.2.3 - https://phabricator.wikimedia.org/T380874#10358356 (fkaelin)
[17:06:28] Data-Engineering, Data-Platform-SRE (2024.11.09 - 2024.11.29): Design a suitable DAG deployment method - https://phabricator.wikimedia.org/T368033#10358512 (brouberol) After multiple slack discussions (https://wikimedia.slack.com/archives/C02291Z9YQY/p1732524489017109 and https://wikimedia.slack.com/...
[17:48:12] Data-Engineering, Research: Incremental HTML dataset to support "Who are moderators" SDS 1.2.3 - https://phabricator.wikimedia.org/T380874#10358714 (XiaoXiao-WMF)
[17:48:39] Data-Engineering, Research: Incremental HTML dataset to support "Who are moderators" SDS 1.2.3 - https://phabricator.wikimedia.org/T380874#10358721 (XiaoXiao-WMF)
[17:48:40] Data-Engineering, Event-Platform: Implement stream of HTML content on mw.page_change event - https://phabricator.wikimedia.org/T360794#10358720 (XiaoXiao-WMF)
[18:28:16] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent
[18:41:47] Data-Engineering, MediaWiki-Engineering, MediaWiki-extensions-WikimediaEvents, MediaWiki-Platform-Team, and 5 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10358949 (Michael) In the Growth team, we added the first sub-tasks about migrating...
[18:51:58] Data-Engineering: Airflow has skipped some canary-event tasks - https://phabricator.wikimedia.org/T380836#10358984 (mforns)
[18:52:27] Data-Engineering: Airflow skips canary-event tasks - https://phabricator.wikimedia.org/T380836#10358985 (mforns)
[18:55:08] Data-Engineering: Airflow skips canary-event tasks - https://phabricator.wikimedia.org/T380836#10358995 (mforns) This happened again from 2024-11-23, 07:35:00 UTC to 2024-11-23, 09:02:00 UTC. Many mapped events had multiple retries all the way to 7 retries: https://airflow-analytics.wikimedia.org/dags/canary...
[19:13:06] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform, Movement-Insights: Modify the automated traffic detection to be applied at the project family level - https://phabricator.wikimedia.org/T377257#10359066 (Hghani) Uploaded results of this analysis to [[ https://gitlab.wikimedia.org...
[19:47:19] Data-Engineering, MediaWiki-Platform-Team, MediaWiki-ResourceLoader, Schema-change: Drop unused module_deps table from MediaWiki schema - https://phabricator.wikimedia.org/T379661#10359273 (Krinkle) p:Triage→Medium
[19:52:18] Data-Engineering, MediaWiki-Engineering, MediaWiki-extensions-WikimediaEvents, MediaWiki-Platform-Team, and 5 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10359297 (Krinkle)
[19:54:10] Data-Engineering, MediaWiki-Engineering, MediaWiki-extensions-WikimediaEvents, MediaWiki-Platform-Team, and 5 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10359299 (Krinkle) @michael The replacement is not yet ready. We aim to have this re...
[20:04:59] Data-Engineering: Warning of mismatch in declarations of Webrequest schema - https://phabricator.wikimedia.org/T380916 (nshahquinn-wmf) NEW
[21:23:37] Data-Engineering, Data-Platform-SRE, Epic: Upgrade Hadoop to version 3.3.x and Hive to version 3.1.x - https://phabricator.wikimedia.org/T379385#10359632 (JAllemandou) >>! In T379385#10357179, @BTullis wrote: > As I understand it, we no longer wish to support the Hive/Mapreduce query engine in produc...
[21:50:51] (PS3) Jennifer Ebe: Edit Geoeditors Daily Monthly to support Temp Account Changes [analytics/refinery] - https://gerrit.wikimedia.org/r/1098083 (https://phabricator.wikimedia.org/T379728)
[22:28:16] FIRING: HdfsCapacityRemainingPercent: Alarmingly low free space on the analytics-hadoop HDFS cluster. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Capacity_Remaining - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=106&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsCapacityRemainingPercent