[05:40:02] Data-Engineering, Beta-Cluster-Infrastructure, Event-Platform, Wikimedia-production-error: cirrusSearchCheckerJob JobQueueErrors (Could not enqueue jobs) on Beta Cluster - https://phabricator.wikimedia.org/T322491#10660414 (A_smart_kitten) Adding #wikimedia-production-error, as per that tag’s des...
[08:10:40] Data-Engineering (Q3 2025 January 1st - March 31th), Data-Platform-SRE, Movement-Insights: Fail Spark job or airflow task if unexpected number of output files - https://phabricator.wikimedia.org/T377006#10660721 (Antoine_Quhen)
[09:02:32] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660812 (Fabfur) We tried (many thanks to @brouberol ) to explicitly set the DLQ topic to `LogAppend` and seems to work (meaning...
[09:02:51] Data-Engineering, SRE, Traffic-Icebox, MobileFrontend (Tracking): RFC: Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998#10660810 (Krinkle)
[09:07:16] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660833 (Fabfur)
[09:07:19] Data-Engineering, Data-Engineering-Radar, Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review: New software: haproxykafka - https://phabricator.wikimedia.org/T370668#10660834 (Fabfur)
[09:08:13] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660835 (brouberol) > Problem here is that producer overrides are ignored Ish. When the producer set a timestamp type override t...
[09:13:10] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660855 (brouberol) ` brouberol@kafka-jumbo1014:~$ kafka topics --topic webrequest_text --alter --config message.timestamp.type=L...
[09:18:27] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660861 (brouberol) Traffic on `webrequest_text` is stable. I'm going to apply the config change on `webrequest_upload` now.
[09:19:30] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660865 (brouberol) ` brouberol@kafka-jumbo1014:~$ kafka topics --topic webrequest_upload --alter --config message.timestamp.type...
[09:19:38] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660866 (brouberol)
[09:32:21] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660881 (Fabfur) I'd say that this is now done, will wait confirmation from @JAllemandou to check that on their side all is fine...
[09:35:59] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660889 (JAllemandou) Actually we need to change `webrequest_frontend_text` and `webrequest_frontend_upload` topics, as those are...
[09:39:05] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660893 (brouberol) Sure, I can do that. Should I also remove the topic config override on the `webrequest_errors`, `webrequest_t...
[09:39:06] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601 (matthiasmullie) NEW
[09:39:36] Data-Engineering (Q3 2025 January 1st - March 31th), Growth-Structured-Tasks, Growth-Team, Image-Suggestions, and 6 others: wmf.wikidata_item_page_link and wmf.wikidata_entity snapshots stuck at 2025-01-20 - https://phabricator.wikimedia.org/T386255#10660909 (matthiasmullie) >>! In T386255#10...
[09:43:03] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660921 (Fabfur) >>! In T389521#10660893, @brouberol wrote: > Sure, I can do that. Should I also remove the topic config override...
[09:44:17] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660937 (brouberol) `
[09:44:54] Data-Engineering (Q3 2025 January 1st - March 31th), Growth-Structured-Tasks, Growth-Team, Image-Suggestions, and 7 others: wmf.wikidata_item_page_link and wmf.wikidata_entity snapshots stuck at 2025-01-20 - https://phabricator.wikimedia.org/T386255#10660942 (Gehel)
[09:46:25] Data-Engineering, DPE-Mediawiki-Content, Data-Platform-SRE (2025.03.01 - 2025.03.21): Consider writing Spark files to Ceph (S3) instead of Hadoop - https://phabricator.wikimedia.org/T384500#10660945 (Gehel) Resolved→Declined
[09:48:54] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660964 (Fabfur) Can confirm that both topics now have messages with `tstype: logappend` Thanks for all the work (again)!
[09:50:12] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10660966 (JAllemandou) Thanks folks!
[09:56:53] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11), Epic: HDFS capacity needs FY24/25 - https://phabricator.wikimedia.org/T384098#10661013 (Gehel)
[09:58:32] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Draft a project plan for the Hadoop version 3 upgrade - https://phabricator.wikimedia.org/T379748#10661055 (Gehel)
[09:59:41] Data-Engineering, Data-Engineering-Icebox, Data-Platform-SRE (2025.03.22 - 2025.04.11): Some wikibase tables not available in commonswiki_p - https://phabricator.wikimedia.org/T298452#10661091 (Gehel)
[10:00:05] Data-Engineering, Data-Engineering-Radar, DPE-Mediawiki-Content, Data-Platform-SRE (2025.03.22 - 2025.04.11): Test if an existing conda environment with Spark 3.1.2 clients works fine with Spark 3.5.3 - https://phabricator.wikimedia.org/T380417#10661101 (Gehel)
[10:00:23] Data-Engineering, Data-Engineering-Radar, CirrusSearch, Structured Data Engineering, and 3 others: Migrate image recommendation to use page_weighted_tags_changed stream - https://phabricator.wikimedia.org/T372912#10661103 (Gehel)
[10:00:39] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Airflow UI sometimes shows no response for a DAG run task with many mapped tasks - https://phabricator.wikimedia.org/T381479#10661111 (Gehel)
[10:02:06] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2025.03.22 - 2025.04.11): Unable to trigger dag with config - https://phabricator.wikimedia.org/T384805#10661138 (Gehel)
[10:50:38] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2025.03.22 - 2025.04.11): Unable to trigger dag with config - https://phabricator.wikimedia.org/T384805#10661313 (brouberol) In progress→Declined David mentioned on slack: > I found a workaround with running a command so it's no...
[11:32:58] Data-Engineering (Q3 2025 January 1st - March 31th), Traffic, DPE HAProxy Migration: Update HAProxyKafka kafka-timestamp type - https://phabricator.wikimedia.org/T389521#10661402 (Fabfur) Open→Resolved I think this can be closed now and leave it as reference for this Kafka peculiarity, i...
[13:32:23] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10661873 (matthiasmullie) Note: once this is resolved, the workarounds in https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/commit/e1cac659b2dca17e4...
[13:46:03] Data-Engineering, Experimentation Lab: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation - https://phabricator.wikimedia.org/T389542#10661911 (Ahoelzl) @Cpetrillo can the changes be made ahead of time?
[13:46:28] Data-Engineering (Q3 2025 January 1st - March 31th), Experimentation Lab: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation - https://phabricator.wikimedia.org/T389542#10661912 (Ahoelzl)
[14:18:13] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662062 (xcollazo) Ah, I think I figured this out, it was (my) user error. Both of the DAGs ([[ https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob...
[14:22:37] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662077 (matthiasmullie) We have a DAG running now that is processing the existing data. I expect it to complete (hopefully successful) within about an hour. As long as...
[14:35:20] Data-Engineering, Data-Engineering-Radar, ConfirmEdit (CAPTCHA extension), MediaWiki-extensions-EventLogging, and 2 others: Send hCaptcha API response data to event platform - https://phabricator.wikimedia.org/T379179#10662148 (acooper) Open→In progress p:Triage→Medium
[15:38:15] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662491 (xcollazo) DAG is https://airflow-platform-eng.wikimedia.org/dags/SLIS/grid?dag_run_id=scheduled__2025-03-13T00%3A00%3A00%2B00%3A00, and it is now done. Will p...
[15:39:11] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662497 (xcollazo) Open→In progress a:xcollazo
[15:46:42] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662562 (xcollazo) Now running wikidata_dump_to_hive_weekly for `2025-03-03` (which pulls data from `2025-03-10` dump): https://airflow.wikimedia.org/dags/wikidata_dump...
[16:51:22] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662961 (xcollazo) >>! In T389601#10662562, @xcollazo wrote: > Now running wikidata_dump_to_hive_weekly for `2025-03-03` (which pulls data from `2025-03-10` dump): http...
[16:55:06] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10662993 (xcollazo) ` spark-sql (default)> show partitions wmf.wikidata_entity; partition snapshot=2024-12-16 snapshot=2024-12-23 snapshot=2024-12-30 snapshot=2025-01-06...
[17:19:06] Data-Engineering (Q3 2025 January 1st - March 31th), Experimentation Lab: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation - https://phabricator.wikimedia.org/T389542#10663158 (Cpetrillo) @Ahoelzl changes to the service or the site?
[17:19:44] Data-Engineering (Q3 2025 January 1st - March 31th), DPE-Mediawiki-Content: Investigate and fix duplicate data on wmf_content.mediawiki_content_history_v1 for muswiki - https://phabricator.wikimedia.org/T388715#10663164 (xcollazo) For completeness, I've rerun the bad data detection SQL below just now, an...
[17:28:26] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10663222 (xcollazo) > 3. Perhaps then delete wmf.wikidata_entity's and wmf.wikidata_item_page_link's snapshot=2025-03-10 and clear the DAGs so that we properly fill that...
[17:43:25] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10663364 (xcollazo) Ran the following: ` sudo -u analytics kerberos-run-command analytics hdfs dfs -mkdir /wmf/data/raw/commons/dumps/mediainfo-json/20250317 sudo -u a...
[17:46:04] Data-Engineering (Q3 2025 January 1st - March 31th): Some search entries in wmf.webrequest have their query appended to their uri_path - https://phabricator.wikimedia.org/T383135#10663384 (Antoine_Quhen) `sql SELECT hour, normalized_host.project_family as project_family, count(1) as count FROM wm...
[17:50:42] Data-Engineering: NEW/CHANGE FEATURE REQUEST: Make Event Registration Tool's data available in Data Lake - https://phabricator.wikimedia.org/T389662 (Arinaigu) NEW
[17:53:25] Data-Engineering: NEW/CHANGE FEATURE REQUEST: Make Event Registration Tool's data available in Data Lake - https://phabricator.wikimedia.org/T389662#10663439 (Arinaigu)
[18:18:33] Data-Engineering, Experimentation Lab: NEW/CHANGE FEATURE REQUEST: make available the centralauth.globaluser table in Data Lake - https://phabricator.wikimedia.org/T389666 (Arinaigu) NEW
[18:20:30] Data-Engineering, Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10663549 (xcollazo) Run for `commons_structured_data_dump_to_hive_weekly` finished: ` spark-sql (default)> show partitions structured_data.commons_entity; partition snap...
[18:20:59] Data-Engineering (Q3 2025 January 1st - March 31th), Structured-Data-Backlog: Fix structured_data.commons_entity snapshot date - https://phabricator.wikimedia.org/T389601#10663550 (xcollazo) In progress→Resolved
[18:23:28] Data-Engineering, Experimentation Lab: NEW/CHANGE FEATURE REQUEST: make available the centralauth.globaluser table in Data Lake - https://phabricator.wikimedia.org/T389666#10663569 (Arinaigu) This ticket could override the need for T365648 .
[18:50:21] Data-Engineering: NEW/CHANGE FEATURE REQUEST: Make Event Registration Tool's data available in Data Lake - https://phabricator.wikimedia.org/T389662#10663654 (mpopov) Since both Event Registration and Content Translation store their application data in x1, I wonder if one solution could solve this and {T3827...
[19:13:26] Data-Engineering, Data Pipelines: Add user Central ID to mediawiki_history table in Hive - https://phabricator.wikimedia.org/T365648#10663757 (Arinaigu) If the request in T389666 is implemented, it would solve the need for this ticket, at least on my end. It would be a broader solution that can be used w...
[20:39:53] Data-Engineering, Data-Engineering-Radar, DBA, Charts (Sprint 18), Schema-change-in-production: Deploy patch-gjlw_namespace_text.sql on x1.commonswiki for JsonConfig - https://phabricator.wikimedia.org/T385917#10664028 (CCiufo-WMF) I don't know exactly how to test + sign off on this. @bvibber...
[21:30:07] Data-Engineering-Roadmap, MediaWiki-DomainEvents, MW-Interfaces-Team, Epic: DomainEvents - Broadcasting and receiving cross-process events - https://phabricator.wikimedia.org/T379935#10664196 (Ahoelzl)
[21:48:43] Data-Engineering (Q3 2025 January 1st - March 31th), Experimentation Lab: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation - https://phabricator.wikimedia.org/T389542#10664233 (creynolds) enterprise dumps html page change review => https://gerrit.wikimedia.org/r/c/operatio...
[21:58:11] Data-Engineering (Q3 2025 January 1st - March 31th), Essential-Work: Merge and deploy WME dumps webpage reference update - https://phabricator.wikimedia.org/T389690 (Ahoelzl) NEW
[21:59:21] Data-Engineering (Q3 2025 January 1st - March 31th), Essential-Work: Merge and deploy WME dumps webpage reference update - https://phabricator.wikimedia.org/T389690#10664248 (Ahoelzl) Open→Invalid
[21:59:46] Data-Engineering (Q3 2025 January 1st - March 31th), Experimentation Lab: NEW/CHANGE FEATURE REQUEST: Documentation for v1 Enterprise endpoint deprecation - https://phabricator.wikimedia.org/T389542#10664250 (Ahoelzl) a:Ahoelzl
[22:27:25] Data-Engineering (Q3 2025 January 1st - March 31th): Assess data platform implications for RFC domain unification - https://phabricator.wikimedia.org/T389696 (Ahoelzl) NEW
[22:28:59] Data-Engineering (Q3 2025 January 1st - March 31th): Assess data platform implications for RFC domain unification - https://phabricator.wikimedia.org/T389696#10664375 (Ahoelzl)
[22:53:19] Data-Engineering, Traffic: GeoDNS: Pipeline from event.development_network_probe to operations/dns.git - https://phabricator.wikimedia.org/T380626#10664431 (CDobbins) A concern that's been brought up is that some of the results from the Probenet data are unexpected (e.g., Paraguay [[ https://gerrit.wikim...
[23:12:45] Data-Engineering, Data-Engineering-Radar, DBA, Charts (Sprint 18), Schema-change-in-production: Deploy patch-gjlw_namespace_text.sql on x1.commonswiki for JsonConfig - https://phabricator.wikimedia.org/T385917#10664453 (bvibber) Open→Resolved >>! In T385917#10664028, @CCiufo-WMF w...