[00:00:14] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE: Add terms of use to https://dumps.wikimedia.org/index.html - https://phabricator.wikimedia.org/T408881#11330125 (Ahoelzl) It looks like this is the file that needs to be updated: https://gerrit.wikimedia.org/r/plugins/gitiles/o...
[00:20:39] (PS22) Ottomata: Add HQL for edit_per_editor_per_page_daily and pageview_per_editor_per_page_daily [analytics/refinery] - https://gerrit.wikimedia.org/r/1196892 (https://phabricator.wikimedia.org/T407559)
[03:50:06] (PS5) Snwachukwu: Add Data quality check for Pageview Human-Bot ratio anomaly [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1199485 (https://phabricator.wikimedia.org/T407239)
[06:00:10] Data-Engineering (Q2 FY25/26 October 1st - December 31th), AbuseFilter, DBA, Schema-change-in-production: Drop the afl_ip column and the afl_ip_timestamp index from the abuse_filter_log table - https://phabricator.wikimedia.org/T407997#11330393 (Marostegui)
[06:00:21] Data-Engineering (Q2 FY25/26 October 1st - December 31th), AbuseFilter, DBA, Schema-change-in-production: Drop the afl_ip column and the afl_ip_timestamp index from the abuse_filter_log table - https://phabricator.wikimedia.org/T407997#11330396 (Marostegui)
[09:07:00] Data-Engineering, Data-Engineering-Radar, CirrusSearch, Discovery-Search (2025.10.20 - 2025.11.07), and 2 others: Eventutilities Flink: port SerDe tests from SUP - https://phabricator.wikimedia.org/T404597#11330731 (pfischer) a:pfischer→None
[09:34:42] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment - update default params and tests to use mediawiki/page_change 1.3.0 (latest) schema - https://phabricator.wikimedia.org/T407779#11330800 (JMonton-WMF)
[09:36:03] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment - update default params and tests to use mediawiki/page_change 1.3.0 (latest) schema - https://phabricator.wikimedia.org/T407779#11330804 (JMonton-WMF) a:Ottomata→JMonton-WMF
[10:22:19] Data-Engineering, MediaWiki-Core-Hooks, MW-Interfaces-Team, MediaWiki-Platform-Team (Radar): Spike: investigate incorrect page_id values in pageview_hourly - https://phabricator.wikimedia.org/T408798#11330938 (JAllemandou) Thank you so much for the very clear summary @Milimetric ! Some notes:...
[10:30:21] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Movement-Insights, Data-Platform-SRE (2025.10.17 - 2025.11.07), Epic, Essential-Work: Create example dbt models using Iceberg - https://phabricator.wikimedia.org/T408687#11330948 (JAllemandou) I don't think the micro_batch strategy ch...
[10:34:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=codfw%2Bprometheus/k8s&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[10:49:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=codfw%2Bprometheus/k8s&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[11:32:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=codfw%2Bprometheus/k8s&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[11:52:17] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=codfw%2Bprometheus/k8s&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[12:02:47] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11331071 (achou) >>! In T401021#11329384,...
[12:11:50] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11331090 (JMonton-WMF) a:JMonton-WMF
[12:15:25] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment - update default params and tests to use mediawiki/page_change 1.3.0 (latest) schema - https://phabricator.wikimedia.org/T407779#11331097 (JMonton-WMF) I've created a [[ https://gitlab.wikimedia.org/repo...
[12:25:03] Data-Engineering, MediaWiki-Core-Hooks, MW-Interfaces-Team, MediaWiki-Platform-Team (Radar): Spike: investigate incorrect page_id values in pageview_hourly - https://phabricator.wikimedia.org/T408798#11331106 (Milimetric) > * The join with `mediawiki_history` can be done on all needed time, b...
[13:15:18] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Essential-Work, Event-Platform: Upgrade mediawiki-event-enrichment jobs to Flink 1.20.2 and Java 17 - https://phabricator.wikimedia.org/T408918 (tchin) NEW
[14:51:06] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11331450 (xcollazo)
[15:04:17] (PS6) Snwachukwu: Add Data quality check for Pageview Human-Bot ratio anomaly [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1199485 (https://phabricator.wikimedia.org/T407235)
[15:04:46] (PS7) Snwachukwu: Add Data quality check for Pageview Human-Bot ratio anomaly [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1199485 (https://phabricator.wikimedia.org/T407239)
[15:18:18] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11331535 (JMonton-WMF) Just to confirm if this is right. The `content_history.py` that processes...
[15:38:50] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11331583 (achou) @Eevans I wanted to follo...
[15:42:49] Data-Engineering (Q2 FY25/26 October 1st - December 31th): Update MediaWiki Content History SLO draft for SRE review - https://phabricator.wikimedia.org/T401892#11331595 (APizzata-WMF) I have updated the query: ` from datetime import datetime, timezone,timedelta wiki_id_name= 'dewiki' utc_now= datetime.no...
[16:31:53] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11331700 (Eevans) >>! In T401021#11331583,...
[16:53:28] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11331733 (Ahoelzl)
[16:54:08] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11331735 (Ahoelzl) Updated description to clarify th...
[17:04:10] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11331803 (Ahoelzl) The `https://dumps.wikimedia.org/...
[17:04:32] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11331804 (Ahoelzl) a:Ahoelzl
[17:10:21] Data-Engineering (Q2 FY25/26 October 1st - December 31th), OKR-Work, Patch-For-Review: SDS 1.3.2 Implementation - https://phabricator.wikimedia.org/T407239#11331828 (xcollazo) >Implement whatever approach we decide in the analysis ticket Can we link to the analysis ticket?
[17:20:01] (CR) Mforns: Add HQL for edit_per_editor_per_page_daily and pageview_per_editor_per_page_daily (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1196892 (https://phabricator.wikimedia.org/T407559) (owner: Ottomata)
[17:25:36] Data-Engineering (Q2 FY25/26 October 1st - December 31th), OKR-Work, Patch-For-Review: SDS 1.3.2 Implementation - https://phabricator.wikimedia.org/T407239#11331862 (Snwachukwu)
[17:29:35] Data-Engineering (Q2 FY25/26 October 1st - December 31th), OKR-Work, Patch-For-Review: SDS 1.3.2 Implementation - https://phabricator.wikimedia.org/T407239#11331863 (Snwachukwu) @xcollazo . I have linked it.
[17:41:50] (CR) Xcollazo: [C:+1] "Great stuff @snwachukwu@wikimedia.org!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1199485 (https://phabricator.wikimedia.org/T407239) (owner: Snwachukwu)
[18:56:17] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939 (fkaelin) NEW
[18:58:17] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11332082 (fkaelin) List of tables with a non-fully qualified uri location (leaving out the tables in users personal databases) ` wmde.campaign_banner_impressions_quarter_hourly wmde.tmp_mwh_wiki_editor_...
[19:56:10] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11332252 (Ottomata) When fixing this, we should use a fully fully qualified URL, not just "hdfs:///..." but "hdfs://analytics-hadoop/..." , specifying the specific Hadoop cluster where the location is....
[19:59:57] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11332257 (Ottomata) I also wonder if there is a real need to always use a specific (external) location for hive managed iceberg tables. We did this with Hive tables in the past especially because not al...
[20:03:09] Data-Engineering (Q2 FY25/26 October 1st - December 31th): SDS 1.3.6 SPUR bot detection analysis - https://phabricator.wikimedia.org/T407103#11332278 (Hghani) Updated my last [[ https://gitlab.wikimedia.org/hghani/movement-insights-requests/-/blob/main/SDS%201.3/spur_bot_traffic_analysis_part2_hap_proxy.ipyn...
[20:08:28] Data-Engineering, MediaWiki-Core-Hooks, MW-Interfaces-Team, MediaWiki-Platform-Team (Radar): Spike: investigate incorrect page_id values in pageview_hourly - https://phabricator.wikimedia.org/T408798#11332305 (Ottomata) For diffs: could we not just modify the pageview algorithm and add an `is...
[20:25:45] (CR) Ottomata: Add HQL for edit_per_editor_per_page_daily and pageview_per_editor_per_page_daily (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1196892 (https://phabricator.wikimedia.org/T407559) (owner: Ottomata)
[20:26:17] (PS23) Ottomata: Add HQL for edit_per_editor_per_page_daily and pageview_per_editor_per_page_daily [analytics/refinery] - https://gerrit.wikimedia.org/r/1196892 (https://phabricator.wikimedia.org/T407559)
[20:32:27] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11332406 (xcollazo) >I don't expect Iceberg tables created with SQL and managed by HMS to ever need to vary their location, we just do so by historical convention. Maybe we should just let Hive stick th...
[20:47:51] Data-Engineering (Q2 FY25/26 October 1st - December 31th): Add robots.txt to dumps.wikimedia.org - https://phabricator.wikimedia.org/T408954 (Ahoelzl) NEW
[20:48:31] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE, Essential-Work, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11332442 (Ahoelzl)
[20:48:44] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Essential-Work: Add robots.txt to dumps.wikimedia.org - https://phabricator.wikimedia.org/T408954#11332444 (Ahoelzl)
[21:39:23] Data-Engineering, Essential-Work: Add code styles rules to analytics-refinery-source - https://phabricator.wikimedia.org/T408942#11332559 (amastilovic) We definitely already have the `maven-checkstyle-plugin` set up in the main `pom.xml` - I know because it's very annoying since the codebase doesn't seem...
[22:03:47] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Essential-Work: Revive data engineering alert metrics dashboard - https://phabricator.wikimedia.org/T399518#11332589 (Ahoelzl)
[22:03:51] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Essential-Work: Revive data engineering alert metrics dashboard - https://phabricator.wikimedia.org/T399518#11332590 (Ahoelzl)