[03:17:42] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Epic: Attribution Research: Instrument pageviews - https://phabricator.wikimedia.org/T417050#11671652 (Milimetric) Typing out loud the rationalization for the instrumentation here: * `page_load` - sent only if logged out ** attributes: `page_id, pa...
[03:49:13] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[03:49:13] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[06:18:06] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11671755 (Marostegui)
[06:18:34] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11671756 (Marostegui)
[06:18:44] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11671757 (Marostegui) This is all done
[06:18:52] Data-Engineering, DBA, Schema-change-in-production: Drop cuc_agent & cuc_ip from cu_changes, cule_agent & cule_ip from cu_log_event, and cupe_agent & cupe_ip from cu_private_event on WMF wikis - https://phabricator.wikimedia.org/T418465#11671758 (Marostegui) Open→Resolved
[07:49:13] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[07:49:13] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[08:14:50] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[08:14:56] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[08:18:57] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[08:19:03] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[08:22:45] Data-Engineering (Q3 FY25/26 January 1st - March 31th), User-notice-archive: Publish Dumps 2 to dumps.wikimedia.org and provide only monthly dumps - https://phabricator.wikimedia.org/T414389#11671963 (Blahma) > May I offer our mediawiki_history dataset as an alternative though? This was put together...
[10:03:34] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Create a data product of IP range to owner/provenance label - https://phabricator.wikimedia.org/T418466#11672327 (GGoncalves-WMF) Thanks KC, this looks really good! We'll discuss with @Ahoelzl about getting this prioritized, I think we do need to get it...
[10:13:06] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Engineering-Radar, Data-Platform-SRE (2026-02-13 - 2026-03-06), Patch-For-Review: Reduce noise from HdfsRpcQueueLength alert - https://phabricator.wikimedia.org/T418152#11672372 (JAllemandou) Open→Resolved
[10:13:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Platform-SRE (2026-02-13 - 2026-03-06), Patch-For-Review: HdfsTotalFilesHeap warning - https://phabricator.wikimedia.org/T418551#11672373 (JAllemandou) Open→Resolved
[10:16:59] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Traffic, MediaWiki-Platform-Team (Radar), and 2 others: haproxy: capture x-wmf-* headers in webrequest data set - https://phabricator.wikimedia.org/T417864#11672387 (JAllemandou) a:JAllemandou
[10:37:28] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Traffic, MediaWiki-Platform-Team (Radar), OKR-Work: haproxy: capture x-wmf-* headers in webrequest data set - https://phabricator.wikimedia.org/T417864#11672485 (Fabfur) The [[ https://gerrit.wikimedia.org/r/1247034 |...
[10:58:04] Data-Engineering, Data-Engineering-Radar, Test Kitchen, Wikidata, Wikidata Analytics (Kanban): Add rcshowwikidata property to the existing PrefUpdate instrumentation for wmf_raw.mediawiki_user_properties - https://phabricator.wikimedia.org/T418246#11672587 (AndrewTavis_WMDE)
[11:01:34] Data-Engineering, Data-Platform-SRE, Wikidata, Wikidata Analytics: Bug: Published datasets/reports directory sync fails when incompatible files are saved - https://phabricator.wikimedia.org/T412108#11672593 (AndrewTavis_WMDE) Checking through our surveillance tasks and am realizing I didn't respo...
[11:02:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Platform-SRE (2026-02-13 - 2026-03-06), Patch-For-Review: Deploy turnilo to dse-k8s-eqiad - https://phabricator.wikimedia.org/T416113#11672598 (brouberol) As of right now, https://turnilo-next.wikimedia.org is hosted in Kubernetes, is CAS-a...
[11:02:30] Data-Engineering, Data-Engineering-Radar, Test Kitchen, Wikidata, Wikidata Analytics (Kanban): Add rcshowwikidata property to the existing PrefUpdate instrumentation for wmf_raw.mediawiki_user_properties - https://phabricator.wikimedia.org/T418246#11672599 (AndrewTavis_WMDE) a:AndrewTavis_...
[11:03:36] Data-Engineering, Data-Engineering-Radar, Test Kitchen, Wikidata, Wikidata Analytics (Kanban): Add rcshowwikidata property to the existing PrefUpdate instrumentation for wmf_raw.mediawiki_user_properties - https://phabricator.wikimedia.org/T418246#11672602 (AndrewTavis_WMDE) Thanks @Ottomata...
[11:05:30] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Create a data product of IP range to owner/provenance label - https://phabricator.wikimedia.org/T418466#11672612 (KCVelaga_WMF) > My understanding is that this is not currently a high priority in WE5, so we have a little bit of freedom to schedule it in...
[11:14:39] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: Logs and Monitoring for the HTML pipeline - https://phabricator.wikimedia.org/T418996 (JMonton-WMF) NEW
[12:19:13] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[12:19:19] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[12:54:22] Data-Engineering, Data-Platform-SRE (2026-02-13 - 2026-03-06): Transfer ownership of Watchlist CTR dashboard to Mikhail - https://phabricator.wikimedia.org/T418485#11672991 (brouberol) a:brouberol
[13:01:47] Data-Engineering, Data-Platform-SRE (2026-02-13 - 2026-03-06): Transfer ownership of Watchlist CTR dashboard to Mikhail - https://phabricator.wikimedia.org/T418485#11673014 (brouberol) @mpopov I've set you as the owner of the chart in superset, all of the individual charts and the data sources. Can you c...
[13:01:56] Data-Engineering, Data-Platform-SRE (2026-02-13 - 2026-03-06): Transfer ownership of Watchlist CTR dashboard to Mikhail - https://phabricator.wikimedia.org/T418485#11673016 (brouberol) Open→In progress
[13:09:56] (CR) Hasan Akgün (WMDE): [C:+1] Add BSD 3-Clause License to the repo backdated to first commit year [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1243749 (owner: Andrew McAllister (WMDE))
[14:01:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Refactor our existing Airflow dags to use EasyDAG & DagProperties - https://phabricator.wikimedia.org/T336738#11673305 (xcollazo) We took this longstanding technical debt as an opportunity to leverage an AI assisted refactor. Here...
[14:12:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11673383 (xcollazo)
[14:19:54] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11673427 (xcollazo) On T418754#11665121, we deleted the old content backup tables. But for some very strange reason, the old tables...
[14:28:46] Data-Engineering, OKR-Work (WE1 FY2025-26): [Spike] Adding access_method metadata to moderator action event streams - https://phabricator.wikimedia.org/T419019 (ldelench_wmf) NEW
[14:40:29] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Sqlufluff on Stat hosts - https://phabricator.wikimedia.org/T417171#11673539 (Ottomata) Btw, just so we don't use it, I expounded a little more on why we chose conda + setuptools instead of poetry for managing and deploying python job repos: https:...
[14:47:28] Data-Engineering, OKR-Work (WE1 FY2025-26): [Spike] Adding access_method metadata to moderator action event streams - https://phabricator.wikimedia.org/T419019#11673567 (Ottomata) > Option A: Add access_method to the MediaWiki core logging table (MW core schema change; likely high-effort; flagged as pote...
[15:17:23] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights: Investigate and repair pageviews and unique devices spike starting in Nov 2025 - https://phabricator.wikimedia.org/T416933#11673744 (GGoncalves-WMF)
[15:20:13] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: Logs and Monitoring for the HTML pipeline - https://phabricator.wikimedia.org/T418996#11673760 (Ottomata) > Ensure there is an Index Pattern that can understand our logs and extract valuable variables. This should be ECS. {T234565}
[15:39:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11673847 (xcollazo)
[15:41:27] Data-Engineering, OKR-Work (WE1 FY2025-26): [Spike] Adding access_method metadata to moderator action event streams - https://phabricator.wikimedia.org/T419019#11673866 (Ottomata) >> Option B (preferred starting point): Add access_method to existing event streams: > Let's find out if this is technically...
[16:19:13] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[16:19:13] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[16:28:09] Data-Engineering, Data-Engineering-Radar, FR-Tech-Analytics, Data-Platform-SRE (2026-02-13 - 2026-03-06): Create FR Tech Airflow instance - https://phabricator.wikimedia.org/T417213#11674091 (AStein-WMF) a:brouberol
[16:40:34] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Datasets-General-or-Unknown: Get dump mirrors to use new dumps-rsync service name - https://phabricator.wikimedia.org/T415193#11674152 (xcollazo)
[17:05:42] Data-Engineering, Data-Engineering-Radar, Privacy Engineering, Security-Team, and 2 others: Privacy review of x1 tables in preparation of adding them to wikireplicas - https://phabricator.wikimedia.org/T415219#11674229 (Ottomata)
[17:07:17] Data-Engineering, Dumps-Generation: wikidatawiki fails dumps of the wbt_* tables, also lagging on XML Dumps - https://phabricator.wikimedia.org/T396125#11674247 (Ottomata) Open→Declined Declining, no plan to fix. Feel free to reopen if there is need.
[17:12:42] Data-Engineering, MW-Interfaces-Team, Event-Platform: EventBus: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#11674330 (Ottomata) p:Triage→Medium
[17:12:50] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Event-Platform: EventBus: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#11674334 (Ottomata)
[17:13:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Event-Platform: EventBus: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#11674335 (Ottomata) a:Ottomata
[17:13:27] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MW-Interfaces-Team, Event-Platform: EventBus: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#11674336 (Ottomata) If MW team doesn't get to it before me, I'll try to find some time to investi...
[17:13:35] Data-Engineering: Requesting Kerberos access for SCardenas (WMF) - https://phabricator.wikimedia.org/T418664#11674338 (Ahoelzl) Approved.
[17:13:52] Data-Engineering, Data-Platform-SRE: Requesting Kerberos access for SCardenas (WMF) - https://phabricator.wikimedia.org/T418664#11674341 (Ottomata)
[17:14:51] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE: Requesting Kerberos access for SCardenas (WMF) - https://phabricator.wikimedia.org/T418664#11674348 (Ahoelzl)
[17:16:28] Data-Engineering (Q3 FY25/26 January 1st - March 31th): HDFS usage dashboard is quadruple counting file counts and file sizes - https://phabricator.wikimedia.org/T418780#11674355 (Ahoelzl)
[17:17:31] Analytics, Data-Engineering, Data-Engineering-Wikistats: Wikistats pageview annotations are not shown when splitting by a dimension - https://phabricator.wikimedia.org/T418725#11674357 (Ottomata) p:Triage→Medium
[17:20:03] Data-Engineering, Data-Engineering-Radar, Content-Transform-Team, MW-Interfaces-Team, Event-Platform: Expose MediaWiki Parser render_id as a response header in relevant MW REST API endpoints - https://phabricator.wikimedia.org/T418792#11674367 (Ottomata)
[17:21:31] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2026-02-13 - 2026-03-06): Transfer ownership of Watchlist CTR dashboard to Mikhail - https://phabricator.wikimedia.org/T418485#11674374 (Ottomata)
[17:28:21] Data-Engineering (Q3 FY25/26 January 1st - March 31th), OKR-Work (WE1 FY2025-26): [Spike] Adding access_method metadata to moderator action event streams - https://phabricator.wikimedia.org/T419019#11674406 (Ahoelzl)
[18:19:41] Data-Engineering: Optimize enqueueing of refine_webrequest_hourly pipeline - https://phabricator.wikimedia.org/T419050 (Antoine_Quhen) NEW
[19:13:40] Data-Engineering: Optimize enqueueing of refine_webrequest_hourly pipeline - https://phabricator.wikimedia.org/T419050#11674988 (mforns) In theory, the default sensor timeout is 7 days. I haven't found anywhere in the code where we override this value. Do you know why our sensors timeout so early?
[19:21:46] Data-Engineering (Q3 FY25/26 January 1st - March 31th), OKR-Work (WE1 FY2025-26): [Spike] Adding access_method metadata to moderator action event streams - https://phabricator.wikimedia.org/T419019#11674996 (CMyrick-WMF) It looks like the change tagging system already supports tags on log actions, and mo...
[19:31:19] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Monthly reconcile continues to emit a really large amount of events after user_id changes - https://phabricator.wikimedia.org/T419055#11675035 (xcollazo)
[19:45:48] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Monthly reconcile continues to emit a really large amount of events after user_id changes - https://phabricator.wikimedia.org/T419055#11675081 (xcollazo) Are we fixing over and over the same set of revisions? ` # notice how dewiki consistency has 3.3M i...
[19:47:16] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Do multiple code and data clean ups for content tables - https://phabricator.wikimedia.org/T418754#11675084 (xcollazo)
[20:19:13] FIRING: [2x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[20:19:13] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[20:27:41] Data-Engineering, Data-Engineering-Radar, Commons, Data-Persistence, and 5 others: Migrate file tables to a modern layout (image/oldimage; file/filerevision; add primary keys) - https://phabricator.wikimedia.org/T28741#11675232 (Zabe)
[20:45:41] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Content-Transform-Team, MW-Interfaces-Team, Event-Platform, Patch-For-Review: Common event data model for data derived from parsed page revision html (and more!) - https://phabricator.wikimedia.org/T415158#11675282 (Ottomata) I tried to...
[21:42:15] !log Test Kitchen edge-unique experiments (poll 211325) - adds: minerva-experiment-aaa; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[21:42:17] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:59:23] Data-Engineering, Infrastructure-Foundations: Package thirdparty/opensearch1 for bookworm - https://phabricator.wikimedia.org/T418809#11675708 (Raine) attempting to route this to the correct team :-)