[00:01:12] Analytics-Canonical-Data, Movement-Insights: Add CI checking that the data protection information in the canonical country dataset matches the source - https://phabricator.wikimedia.org/T415817#11564385 (nshahquinn-wmf) p:High→Triage
[00:09:46] Analytics-Canonical-Data, Movement-Insights: Add CI checking that the data protection information in the canonical country dataset matches the source - https://phabricator.wikimedia.org/T415817#11564389 (nshahquinn-wmf) We discussed this at a team meeting today and decided there might be better ways to a...
[01:11:58] Data-Engineering, AQS2.0: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202#11564588 (Doc_James) The question is do our readers watch videos on our platform? This is a question we do not know the answer to as we have never had this data. Getting this data is k...
[01:25:20] FIRING: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job webrequest_sampled ingested an unexpected number of records for a Kafka topic partition. ...
[01:25:20] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=webrequest_sampled&var-kafka_topic=webrequest_sampled&viewPanel=24 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[05:25:20] FIRING: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job webrequest_sampled ingested an unexpected number of records for a Kafka topic partition. ...
[05:25:20] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=webrequest_sampled&var-kafka_topic=webrequest_sampled&viewPanel=24 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[06:35:59] Data-Engineering, DBA, Schema-change-in-production: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164#11564830 (ops-monitoring-bot) Starting pool of db2212 by marostegui@cumin1003: After schema change
[06:36:33] Data-Engineering, DBA, Schema-change-in-production: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164#11564831 (Marostegui)
[06:36:38] Data-Engineering, DBA, Schema-change-in-production: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163#11564832 (Marostegui)
[06:38:29] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11564835 (Marostegui)
[06:53:28] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11564887 (Marostegui)
[07:01:49] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11564928 (Marostegui)
[07:21:20] Data-Engineering, DBA, Schema-change-in-production: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164#11564957 (ops-monitoring-bot) Completed pooling of db2212 by marostegui@cumin1003: After schema change
[08:36:10] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Extract bot classification into new repo - https://phabricator.wikimedia.org/T415874 (Antoine_Quhen) NEW
[09:21:48] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11565186 (Marostegui)
[09:25:20] FIRING: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job webrequest_sampled ingested an unexpected number of records for a Kafka topic partition. ...
[09:25:20] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=webrequest_sampled&var-kafka_topic=webrequest_sampled&viewPanel=24 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[09:41:28] Data-Engineering, DBA, Schema-change-in-production: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163#11565239 (Marostegui)
[09:41:35] Data-Engineering, DBA, Schema-change-in-production: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164#11565243 (Marostegui)
[11:00:07] Data-Engineering, DBA, Schema-change-in-production: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163#11565351 (Marostegui)
[11:00:11] Data-Engineering, DBA, Schema-change-in-production: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164#11565352 (Marostegui)
[11:09:08] Data-Engineering, Data-Engineering-Radar, CheckUser, Product Safety and Integrity: Changes to the cuc_agent column in the cu_changes table - https://phabricator.wikimedia.org/T361210#11565365 (GGoncalves-WMF)
[11:13:08] Data-Engineering, CheckUser, Product Safety and Integrity: Changes to the cuc_agent column in the cu_changes table - https://phabricator.wikimedia.org/T361210#11565374 (GGoncalves-WMF) Update (from Slack): `cuc_agent` and `cu_changes` are no longer being read in MediaWiki, and writing will tentativel...
[11:13:19] Data-Engineering, CheckUser, Product Safety and Integrity: Changes to the cuc_agent column in the cu_changes table - https://phabricator.wikimedia.org/T361210#11565375 (GGoncalves-WMF)
[12:05:05] RESOLVED: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job webrequest_sampled ingested an unexpected number of records for a Kafka topic partition. ...
[12:05:05] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=webrequest_sampled&var-kafka_topic=webrequest_sampled&viewPanel=24 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[12:55:49] Data-Engineering (Q3 FY25/26 January 1st - March 31th), DPE-Mediawiki-Content: Missing/inconsistent page_redirect_target field for redirects in Mediawiki content current v1 dumps - https://phabricator.wikimedia.org/T400632#11565596 (Antoine_Quhen) a:Antoine_Quhen→None
[13:03:12] Data-Engineering, DBA, Schema-change-in-production: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164#11565618 (Marostegui)
[13:04:00] Data-Engineering, CheckUser, Product Safety and Integrity: Changes to the cuc_agent column in the cu_changes table - https://phabricator.wikimedia.org/T361210#11565622 (Dreamy_Jazz)
[13:04:07] Data-Engineering, DBA, Schema-change-in-production: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163#11565623 (Marostegui)
[13:40:28] Data-Engineering, Machine-Learning-Team, Event-Platform: Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892 (achou) NEW
[13:46:47] Data-Engineering (Q3 FY25/26 January 1st - March 31th): [Refine Simplification] Remove Schema Merging in Refine Process by Enforcing Backward Compatibility - https://phabricator.wikimedia.org/T381072#11565764 (Antoine_Quhen) a:Antoine_Quhen→None
[13:47:15] Data-Engineering (Q3 FY25/26 January 1st - March 31th): refine_to_hive dag optimizations - https://phabricator.wikimedia.org/T392668#11565766 (Antoine_Quhen) a:Antoine_Quhen→None
[13:48:23] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11565784 (Marostegui)
[14:18:56] Data-Engineering, Sustainability: Move some analytics jobs to day time in Virginia - https://phabricator.wikimedia.org/T384166#11565913 (xcollazo) Since the vast majority of analytic jobs are scheduled via Airflow, we could certainly change the `schedule` cron definitions of each DAG to achieve this, but...
[14:26:24] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11565935 (Marostegui)
[14:34:34] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11565998 (Marostegui)
[14:38:06] Data-Engineering: [Iceberg Migration] Extend Iceberg table maintenance mechanism to support multiple Airflow instances - https://phabricator.wikimedia.org/T373693#11566013 (Snwachukwu) a:Snwachukwu
[15:06:13] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566131 (Marostegui)
[15:09:01] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566137 (Marostegui)
[15:10:03] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566145 (ops-monitoring-bot) Starting pool of db1210 by marostegui@cumin1003: After schema change
[15:10:58] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566146 (ops-monitoring-bot) Starting pool of db1201 by marostegui@cumin1003: After schema change
[15:18:15] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566184 (Marostegui)
[15:40:01] Data-Engineering (Q3 FY25/26 January 1st - March 31th): [Iceberg Migration] Extend Iceberg table maintenance mechanism to support multiple Airflow instances - https://phabricator.wikimedia.org/T373693#11566301 (Snwachukwu)
[15:55:31] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566350 (ops-monitoring-bot) Completed pooling of db1210 by marostegui@cumin1003: After schema change
[15:56:25] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566351 (ops-monitoring-bot) Completed pooling of db1201 by marostegui@cumin1003: After schema change
[16:32:33] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Troubleshoot duplicates issue in mw_content_merge_events_to_mw_content_history_daily - https://phabricator.wikimedia.org/T410431#11566608 (APizzata-WMF) We have changed the `pushdown_strategy` to `earliest_revision_dt` and this should avoid the duplicat...
[16:35:44] Data-Engineering, Machine-Learning-Team, Event-Platform: Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11566639 (Ottomata) Sounds easy enough (as long as the model can scale for scoring every revision! :) ) Should...
[16:59:30] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566887 (Marostegui)
[16:59:54] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Clean up artifacts.yaml - https://phabricator.wikimedia.org/T405379#11566888 (Ottomata) Sweet! FYI I just added to task description. > [] Manually delete artifacts from HDFS that are no longer referenced in artifacts.yaml files
[17:02:34] Data-Engineering, MediaWiki-extensions-EventLogging, Essential-Work, Technical-Debt, Test Kitchen (Experiment Platform Sprint 19): Migrate 1 instrument using mw.eventLog.newInstrument() to mw.xLab.getInstrument() - https://phabricator.wikimedia.org/T408096#11566917 (KReid-WMF)
[17:05:44] Data-Engineering, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11566971 (Marostegui)
[17:11:43] Data-Engineering, MediaWiki-extensions-EventLogging, MediaWiki-extensions-WikimediaEvents, Essential-Work, and 3 others: Update name in EventLogging Extension and WikimediaEvents Extension - https://phabricator.wikimedia.org/T407904#11567024 (A_smart_kitten)
[17:37:52] Data-Engineering, Reader Growth Team, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog, and 3 others: Add page_id and namespace to X-Analytics header in Mobile App requests (2025 remake) - https://phabricator.wikimedia.org/T409358#11567136 (Ottomata) Great okay! I think I see some. It l...
[17:50:40] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Content-Transform-Team, MW-Interfaces-Team, Event-Platform: Common event data model for data derived from parsed page revision content - https://phabricator.wikimedia.org/T415158#11567183 (Ottomata)
[17:51:09] Data-Engineering, Research: Make Enterprise HTML Dumps available in hadoop - https://phabricator.wikimedia.org/T305688#11567187 (Ottomata)
[18:06:10] Data-Engineering, Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list January 2026 - https://phabricator.wikimedia.org/T415927 (GFontenelle_WMF) NEW
[19:26:36] Data-Engineering, Epic: [EPIC] Move SystemD timer based jobs to Airflow - https://phabricator.wikimedia.org/T415941 (AKhatun_WMF) NEW
[19:28:24] Data-Engineering, Epic: [EPIC] Move SystemD timer based jobs to Airflow - https://phabricator.wikimedia.org/T415941#11567606 (AKhatun_WMF)
[19:28:26] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Inventory of SystemD timer based jobs and pipelines - https://phabricator.wikimedia.org/T414107#11567607 (AKhatun_WMF)
[19:28:28] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Migrate cleanup jobs for snapshot datasets from systemd timers to Airflow - https://phabricator.wikimedia.org/T411999#11567608 (AKhatun_WMF)
[19:28:37] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Wikimedia Enterprise, Wikimedia Enterprise - Content Integrity, Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work: Implement an Airflow operator for moving data from point A to B - https://phabricator.wikimedia.org/T405360#11567609...
[19:30:51] Data-Engineering, Epic: [EPIC] Move SystemD timer based jobs to Airflow - https://phabricator.wikimedia.org/T415941#11567613 (AKhatun_WMF)
[19:36:32] Data-Engineering: Move HDFSCleaner timers to Airflow - https://phabricator.wikimedia.org/T415944 (AKhatun_WMF) NEW
[19:46:46] Data-Engineering: Move import jobs from SystemD timers to Airflow - https://phabricator.wikimedia.org/T415945 (AKhatun_WMF) NEW
[19:47:28] Data-Engineering: Move import jobs from SystemD timers to Airflow - https://phabricator.wikimedia.org/T415945#11567703 (AKhatun_WMF)
[19:47:46] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Wikimedia Enterprise, Wikimedia Enterprise - Content Integrity, Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work: Implement an Airflow operator for moving data from point A to B - https://phabricator.wikimedia.org/T405360#11567704...
[19:48:00] Data-Engineering, Epic: [EPIC] Move SystemD timer based jobs to Airflow - https://phabricator.wikimedia.org/T415941#11567706 (AKhatun_WMF)
[19:48:09] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Wikimedia Enterprise, Wikimedia Enterprise - Content Integrity, Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work: Implement an Airflow operator for moving data from point A to B - https://phabricator.wikimedia.org/T405360#11567707...
[19:57:31] Data-Engineering: Move leftover Refine related timers to Airflow - https://phabricator.wikimedia.org/T415947 (AKhatun_WMF) NEW
[20:01:26] Data-Engineering, Epic: [EPIC] Move SystemD timer based jobs to Airflow - https://phabricator.wikimedia.org/T415941#11567769 (AKhatun_WMF)
[20:01:29] Data-Engineering, Data-Engineering-Icebox, Data Pipelines, Patch-Needs-Improvement: [Iceberg] Update Refine Sanitize to insert into Iceberg tables - https://phabricator.wikimedia.org/T311739#11567770 (AKhatun_WMF)
[20:07:25] Data-Engineering: Move Refinery timers to Airflow - https://phabricator.wikimedia.org/T415949 (AKhatun_WMF) NEW
[20:25:57] Data-Engineering: Move Sqoop timers to Airflow - https://phabricator.wikimedia.org/T415951 (AKhatun_WMF) NEW
[20:29:56] Data-Engineering (Q3 FY25/26 January 1st - March 31th), DPE-Mediawiki-Content: Inconsistent page title styles in Mediawiki content current v1 dumps - https://phabricator.wikimedia.org/T410405#11567841 (Isaac) Thanks @xcollazo for picking this up and tracking the issue down! Your explanation makes sense t...
[21:24:27] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list January 2026 - https://phabricator.wikimedia.org/T415927#11567953 (xcollazo) a:mforns→xcollazo
[21:30:29] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Content-Transform-Team, MW-Interfaces-Team, Event-Platform: Common event data model for data derived from parsed page revision content - https://phabricator.wikimedia.org/T415158#11567998 (Ottomata) Re repeating data: Within any use of the...
[21:30:43] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: Instance-level EventGate configuration to enable/disable functionality - https://phabricator.wikimedia.org/T415549#11568000 (tchin) I chatted with @Ottomata about this a little bit, here's what I'm going to attempt: 1. Refactor `set...
[23:43:13] Data-Engineering (Q3 FY25/26 January 1st - March 31th), CheckUser, Product Safety and Integrity: Changes to the cuc_agent column in the cu_changes table - https://phabricator.wikimedia.org/T361210#11568323 (Ahoelzl)