[06:02:15] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11661690 (Marostegui)
[06:10:18] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11661700 (Marostegui)
[06:20:23] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Update imagelinks primary key on wmf production - https://phabricator.wikimedia.org/T415786#11661719 (Marostegui)
[08:26:09] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Dumps-Generation: Data missing from en.wiktionary.org February 2026 "MediaWiki Content File Exports" compared to "XML Database dump" - https://phabricator.wikimedia.org/T417596#11661901 (APizzata-WMF) a:APizzata-WMF
[08:32:03] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE, SRE, SRE-Access-Requests: Requesting access to analytics-platform-eng-admins for milimetric - https://phabricator.wikimedia.org/T417906#11661910 (Jelto) @Milimetric can you ping your manager to sign this access request too?
[10:41:23] Analytics, Data-Engineering, Data-Engineering-Wikistats: Wikistats pageview annotations are not shown when splitting by a dimension - https://phabricator.wikimedia.org/T418725 (GGoncalves-WMF) NEW
[11:13:46] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11662372 (Marostegui)
[11:29:57] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Traffic, MediaWiki-Platform-Team (Radar), and 2 others: haproxy: capture x-wmf-* headers in webrequest data set - https://phabricator.wikimedia.org/T417864#11662416 (Fabfur) Thanks to the @JAllemandou summary, I've wrote...
[12:10:04] Data-Engineering, Data-Engineering-Radar, Essential-Work, Event-Platform, Test Kitchen (Experiment Platform Sprint 20): X-Experiment-Enrollments EventGate handling reinforcement for MalformedHeaderError cases - https://phabricator.wikimedia.org/T409106#11662555 (phuedx)
[12:10:20] Data-Engineering, Data-Engineering-Radar, Essential-Work, Event-Platform, Test Kitchen (Experiment Platform Sprint 20): X-Experiment-Enrollments EventGate handling reinforcement for MalformedHeaderError cases - https://phabricator.wikimedia.org/T409106#11662557 (phuedx) a:phuedx
[13:00:25] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11662735 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views run by ladsgroup: Started updating wiki re...
[13:07:16] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11662771 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views started by ladsgroup executed with errors:...
[13:13:29] Data-Engineering, Machine-Learning-Team, Event-Platform: Emit article quality predictions as a stream and expose in EventStreams API. - https://phabricator.wikimedia.org/T417794#11662794 (achou) We likely need a new event schema for this use case. The schema that Lift Wing has been using assumes [[ h...
[13:29:10] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11662842 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views run by ladsgroup: Started updating wiki re...
[13:35:33] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11662867 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views started by ladsgroup executed with errors:...
[13:57:22] Data-Engineering, cloud-services-team, Data-Persistence, Data-Services, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11662960 (BTullis) a:BTullis
[14:14:08] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11663009 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views run by ladsgroup: Started updating wiki re...
[14:23:22] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11663053 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views started by ladsgroup executed with errors:...
[14:26:18] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data, Patch-For-Review: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11663059 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views run by ladsgroup: Started updating wiki re...
[14:31:22] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11663092 (ops-monitoring-bot) Cookbook cookbooks.sre.wikireplicas.update-views started by ladsgroup executed with errors: - an-redacteddb1001.e...
[14:38:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Traffic, MediaWiki-Platform-Team (Radar), and 2 others: haproxy: capture x-wmf-* headers in webrequest data set - https://phabricator.wikimedia.org/T417864#11663137 (Tgr) > Can be set in X-Analytics directly in the backen...
[15:14:51] Data-Engineering, Machine-Learning-Team, Event-Platform: Emit article quality predictions as a stream and expose in EventStreams API. - https://phabricator.wikimedia.org/T417794#11663262 (Ottomata) > The schema that Lift Wing has been using assumes classification outputs and no more. The article qual...
[15:42:26] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights: Investigate and repair pageviews and unique devices spike starting in Nov 2025 - https://phabricator.wikimedia.org/T416933#11663417 (Ahoelzl) a:Ahoelzl→mforns
[15:46:33] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11663447 (Marostegui)
[16:06:08] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754 (xcollazo) NEW
[16:33:49] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11663909 (xcollazo)
[16:49:30] Data-Engineering (Q3 FY25/26 January 1st - March 31th): haproxykafka and varnishkafka sent different uri_paths - https://phabricator.wikimedia.org/T418767 (Milimetric) NEW
[16:49:47] Data-Engineering (Q3 FY25/26 January 1st - March 31th): haproxykafka and varnishkafka sent different uri_paths - https://phabricator.wikimedia.org/T418767#11664068 (Milimetric) p:Triage→Medium a:Milimetric
[16:51:47] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Epic, MW-1.46-notes (1.46.0-wmf.18; 2026-03-03): Roll instrument out to 100% of enwiki - https://phabricator.wikimedia.org/T418385#11664088 (Milimetric) a:tchin→Milimetric
[16:52:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Epic, MW-1.46-notes (1.46.0-wmf.18; 2026-03-03): Roll instrument out to 100% of enwiki - https://phabricator.wikimedia.org/T418385#11664095 (Milimetric) p:Triage→High
[16:52:13] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Attribution Research First Experiment - https://phabricator.wikimedia.org/T416200#11664096 (Milimetric) p:Triage→High
[17:01:50] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE, SRE, SRE-Access-Requests: Requesting access to analytics-platform-eng-admins for milimetric - https://phabricator.wikimedia.org/T417906#11664154 (calbon) I approve of this.
[17:04:52] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Create a data product of IP range to owner/provenance label - https://phabricator.wikimedia.org/T418466#11664182 (Ahoelzl) a:JAllemandou
[17:10:29] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11664216 (Ottomata)
[17:11:36] FIRING: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[17:11:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[17:11:45] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11664228 (Ottomata) I've edited the description with a table that will explain what sh...
[17:12:07] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11664229 (Ottomata)
[17:12:55] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics-private-users for maxbinderWMF - https://phabricator.wikimedia.org/T417655#11664238 (MBinder_WMF) Hmm, weird. I could have sworn we already disabled that old account because we've discovered this confusion before. I just went...
[17:16:23] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics-private-users for maxbinderWMF - https://phabricator.wikimedia.org/T417655#11664288 (MoritzMuehlenhoff) Ok! I've just enabled "wmf" for your "mbinder" account. Let me know if all works fine, then I'll go ahead and disable the...
[17:31:36] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[17:31:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[17:37:19] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11664477 (Ottomata) The [[ https://gitlab.wikimedia.org/repos/data-engineering/mediawi...
[17:38:31] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11664491 (Ottomata)
[17:42:33] Data-Engineering (Q3 FY25/26 January 1st - March 31th), DPE-Mediawiki-Content, Patch-For-Review: Inconsistent page title styles in Mediawiki content current v1 dumps - https://phabricator.wikimedia.org/T410405#11664524 (xcollazo) Open→Resolved
[17:58:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11664610 (xcollazo) I wonder whether the calculation for the dashboards are off, as an `hdfs dfs -count` says that we have 136TB? ` !hdfs dfs -count -h /...
[18:05:29] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11664647 (xcollazo) Indeed the definition of the Superset virtual dataset seems to be double (or triple?) counting: Dataset definition: https://superset....
[18:14:27] Data-Engineering: HDFS usage dashboard is quadruple counting file counts and file sizes - https://phabricator.wikimedia.org/T418780 (xcollazo) NEW
[18:15:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11664711 (xcollazo) Opened {T418780} to tackle the dashboard issue separately.
[18:49:14] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics-private-users for maxbinderWMF - https://phabricator.wikimedia.org/T417655#11664963 (MBinder_WMF) I still can't access https://superset.wikimedia.org/superset/dashboard/686/?native_filters_key=z1JEuUqeuYwWhAoOsDa2e4Fz57XRH5kXU...
[18:50:15] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics-private-users for maxbinderWMF - https://phabricator.wikimedia.org/T417655#11664967 (MBinder_WMF) Actually, I take that back, it just worked! Guess it needed a second, or perhaps a credential refresh. All good now, please disa...
[19:00:42] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665009 (xcollazo) Confirmed that these paths are the old copies of these datasets that were replaced on {T405944}: ` /wmf/data/wmf_content/mediawiki_con...
[19:07:08] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665096 (mforns) LGTM @xcollazo
[19:13:18] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665121 (xcollazo) In prod: ` $ hostname -f an-launcher1003.eqiad.wmnet $ sudo -u analytics bash $ kerberos-run-command analytics hive hive (default)>...
[19:41:51] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[19:41:51] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[19:45:02] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665229 (xcollazo) Next thing that is weird is the size of `mediawiki_revision_history_v1`: ` $ kerberos-run-command analytics hdfs dfs -count -h /wmf/d...
[19:46:36] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[19:46:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[19:47:09] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11665241 (Izno) In the future, please add "it's finally going away" to tech news at least a week in advance.
[19:52:12] Data-Engineering, Data-Engineering-Radar, DBA, MediaWiki-Page-derived-data: Normalize categorylinks table - https://phabricator.wikimedia.org/T299951#11665253 (Zabe) >>! In T299951#11665241, @Izno wrote: > In the future, please add "it's finally going away" to tech news at least a week in adv...
[19:52:50] Data-Engineering, MW-Interfaces-Team, Event-Platform: Expose MediaWiki Parser render_id as a response header in relevant MW REST API endpoints - https://phabricator.wikimedia.org/T418792 (Ottomata) NEW
[20:29:26] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665393 (xcollazo) Just realized we were inadvertently using `set_of_page_ids` on DAG [[ https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags...
[20:29:56] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665396 (xcollazo)
[20:36:57] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665429 (xcollazo) The following snapshot investigation shows a better picture of the churn: ` spark.sql(""" SELECT committed_at, snapshot_id,...
[20:42:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11665441 (xcollazo) In prod: ` $ spark3-sql spark-sql (default)> ALTER TABLE wmf_content.mediawiki_revision_history_v1 > SET TBLPROP...
[21:03:30] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Content-Transform-Team, MW-Interfaces-Team, Event-Platform, Patch-For-Review: Common event data model for data derived from parsed page revision html (and more!) - https://phabricator.wikimedia.org/T415158#11665527 (Ottomata) Alright, [[...
[21:42:30] Data-Engineering: table_maintenance_iceberg_monthly permission issue fails task due to permission on Ivy cache artifact - https://phabricator.wikimedia.org/T418804 (dr0ptp4kt) NEW
[22:25:16] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights: Investigate and repair pageviews and unique devices spike starting in Nov 2025 - https://phabricator.wikimedia.org/T416933#11665920 (mforns) Hi all! To discard that this data spike could have been caused or affected by an infrastr...
[23:48:53] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[23:48:54] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag