[06:25:16] Data-Engineering (Q2 FY25/26 October 1st - December 31th), AbuseFilter, DBA, Schema-change-in-production: Drop the afl_ip column and the afl_ip_timestamp index from the abuse_filter_log table - https://phabricator.wikimedia.org/T407997#11334212 (Marostegui)
[06:25:54] Data-Engineering (Q2 FY25/26 October 1st - December 31th), AbuseFilter, DBA, Schema-change-in-production: Drop the afl_ip column and the afl_ip_timestamp index from the abuse_filter_log table - https://phabricator.wikimedia.org/T407997#11334213 (Marostegui)
[08:40:53] Data-Engineering, MediaWiki-Core-Hooks, MW-Interfaces-Team, MediaWiki-Platform-Team (Radar): Spike: investigate incorrect page_id values in pageview_hourly - https://phabricator.wikimedia.org/T408798#11334383 (JAllemandou) >>! In T408798#11332305, @Ottomata wrote: > For diffs: could we not ju...
[08:42:16] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11334385 (JAllemandou) Another reason for which we used the external-table mechanism was to prevent data-dropping errors. I think it's worth keeping it as is :)
[09:34:55] Data-Engineering, Data-Engineering-Radar: Requesting Kerberos access for slyngshede - https://phabricator.wikimedia.org/T408696#11334551 (SLyngshede-WMF) Open→Resolved p:Triage→Low a:SLyngshede-WMF Turns out I can just handle this myself :-)
[09:41:00] (PS8) Joal: Update referer classification patterns [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1198313 (https://phabricator.wikimedia.org/T406531)
[10:04:06] Data-Engineering (Q2 FY25/26 October 1st - December 31th): Create dbt folder structure - https://phabricator.wikimedia.org/T407322#11334763 (JMonton-WMF) Open→Resolved
[10:05:59] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE (2025.10.17 - 2025.11.07), OKR-Work: Set up a working, usable dbt installation on stat boxes - https://phabricator.wikimedia.org/T406634#11334783 (JMonton-WMF)
[10:25:24] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11334869 (achou) > I think so, yes. If you...
[10:46:20] Data-Engineering (Q2 FY25/26 October 1st - December 31th): Explore a local dbt environment setup (independent from Conda) - https://phabricator.wikimedia.org/T409054 (JMonton-WMF) NEW
[11:43:31] (PS9) Joal: Update referer classification patterns [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1198313 (https://phabricator.wikimedia.org/T406531)
[13:09:16] (PS10) Joal: Update referer classification patterns [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1198313 (https://phabricator.wikimedia.org/T406531)
[13:28:28] (CR) Joal: [C:+1] "Ok for me too, let me know when you wish this applied" [analytics/refinery] - https://gerrit.wikimedia.org/r/1200087 (owner: CDanis)
[13:38:10] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11335508 (dcausse) >>! In T401021#11334869...
[13:40:05] (CR) Joal: [C:+1] "LGTM! Let me know when you wish this merged" [analytics/refinery] - https://gerrit.wikimedia.org/r/1199521 (https://phabricator.wikimedia.org/T309738) (owner: Zabe)
[13:49:53] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform, Essential-Work, Movement-Insights (FY25-26 H1), Patch-For-Review: NEWFEATURE REQUEST: Add new referral sources to pageview data - https://phabricator.wikimedia.org/T406531#11335567 (JAllemandou) I have vetted the data w...
[14:03:11] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE (2025.10.17 - 2025.11.07), Essential-Work, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11335623 (...
[14:26:58] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment - update default params and tests to use mediawiki/page_change 1.3.0 (latest) schema - https://phabricator.wikimedia.org/T407779#11335761 (Ottomata) > I have a concern about changing the job name as I d...
[14:29:36] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE (2025.10.17 - 2025.11.07), Essential-Work, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11335775 (...
[14:39:27] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11335864 (Ottomata) > prevent data-dropping errors Does Iceberg delete external table location data on `DELETE` statements? Or does it just do its metadata file switching stuff?
[14:40:01] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11335866 (Ottomata) > location would have changed as part of the ALTER TABLE RENAME, and it would have broken the Iceberg table because Iceberg keeps track of fully qualified file names. Ah, okay! Soun...
[14:45:41] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11335907 (Ottomata) @xcollazo do you want content also on `page_change_kind == visibility_change` a...
[14:49:09] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11335930 (Ottomata) >> Could we go with pa...
[14:58:54] Data-Engineering (Q2 FY25/26 October 1st - December 31th), AbuseFilter, DBA, Schema-change-in-production: Drop the afl_ip column and the afl_ip_timestamp index from the abuse_filter_log table - https://phabricator.wikimedia.org/T407997#11335962 (Marostegui)
[14:59:45] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11335965 (JMonton-WMF) @gmodena also added some [[ https://gitlab.wikimedia.org/repos/data-engineer...
[15:42:50] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11336134 (JAllemandou) >>! In T408939#11335864, @Ottomata wrote: >> prevent data-dropping errors > > Does Iceberg delete external table location data on `DELETE` statements? Or does it just do its met...
[15:51:46] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11336177 (xcollazo) Sorry, I split the conversation. Let me get it back here. From MR: @xcollazo:...
[15:54:47] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11336194 (xcollazo) >>! In T408850#11335965, @JMonton-WMF wrote: > @gmodena also added some [[ http...
[15:56:22] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Essential-Work, Event-Platform: Upgrade mediawiki-event-enrichment jobs to Flink 1.20.2 and Java 17 - https://phabricator.wikimedia.org/T408918#11336199 (tchin) a:tchin
[15:59:28] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE (2025.10.17 - 2025.11.07), Essential-Work, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11336220 (...
[16:00:14] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE (2025.10.17 - 2025.11.07), Essential-Work, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11336222 (...
[16:00:47] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform-SRE (2025.10.17 - 2025.11.07), Essential-Work, Patch-For-Review: Add terms of use to https://dumps.wikimedia.org/index.html and https://dumps.wikimedia.org/backup-index.html - https://phabricator.wikimedia.org/T408881#11336228 (...
[16:13:50] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Product-Analytics, Patch-For-Review: Propagate field descriptions from event schemas to Hive event tables and into DataHub - https://phabricator.wikimedia.org/T307040#11336370 (Antoine_Quhen)
[16:20:25] Data-Engineering (Q2 FY25/26 October 1st - December 31th): Vet JA3N data in webrequest and pageview_actor - https://phabricator.wikimedia.org/T408404#11336439 (mforns) Summary of JA3N data vetting: - Data is present in webrequest and pageview_actor and looks good overall. - All webrequests marked as is_page...
[17:00:54] Data-Engineering, CheckUser-SuggestedInvestigations, ConfirmEdit (CAPTCHA extension), DBA, and 3 others: Add columns to store associated log ID or revision ID that caused a signal to match - https://phabricator.wikimedia.org/T409093 (Dreamy_Jazz) NEW
[17:02:40] Data-Engineering (Q2 FY25/26 October 1st - December 31th), MediaWiki-Core-Revision-backend, MediaWiki-DomainEvents, MW-Interfaces-Team, and 4 others: MediaWiki\Revision\RevisionAccessException: Unable to load fresh row for rev_id: {rev_id} - https://phabricator.wikimedia.org/T400380#11336703 (Otto...
[17:09:31] Data-Engineering: Fix iceberg table location in hive metastore - https://phabricator.wikimedia.org/T408939#11336725 (Ottomata) > Both data and metadata get deleted when dropping an Iceberg managed table. Hm, okay just asking for my education. This is different than regular Hive external tables then, yes?...
[17:12:44] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11336746 (Ottomata) Ya move is good. The only question is if we want `visibility_change` or not....
[17:14:12] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11336748 (Ottomata) ...This conversation makes me think it would be useful to have a property in th...
[17:17:04] Data-Engineering (Q2 FY25/26 October 1st - December 31th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - backfill pageview metric data - https://phabricator.wikimedia.org/T405040#11336774 (Ottomata)
[17:30:25] Data-Engineering, CheckUser-SuggestedInvestigations, ConfirmEdit (CAPTCHA extension), DBA, and 3 others: Add columns to store associated log ID or revision ID that caused a signal to match - https://phabricator.wikimedia.org/T409093#11336842 (mszwarc) Some of the signals that we support are merge...
[17:40:20] Data-Engineering, CheckUser-SuggestedInvestigations, ConfirmEdit (CAPTCHA extension), DBA, and 3 others: Add columns to store associated log ID or revision ID that caused a signal to match - https://phabricator.wikimedia.org/T409093#11336904 (Dreamy_Jazz) >>! In T409093#11336842, @mszwarc wrote:...
[17:42:52] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11336910 (JMonton-WMF) Then, it would be like: create: Enrich ✅ undelete: Enrich ✅ edit: Enrich ✅...
[18:02:39] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Event-Platform: mediawiki_event_enrichment should enrich all events for the page_content_change stream - https://phabricator.wikimedia.org/T408850#11336975 (xcollazo) +1 from me to do enrich if `page_change_kind != delete`.
[18:07:44] Data-Engineering (Q2 FY25/26 October 1st - December 31th): Iceberg Merge strategies with dbt - https://phabricator.wikimedia.org/T409099 (JMonton-WMF) NEW
[18:10:44] (CR) Neil Shah-Quinn (WMF): Update referer classification patterns (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1198313 (https://phabricator.wikimedia.org/T406531) (owner: Joal)
[18:11:15] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Platform, Essential-Work, Movement-Insights (FY25-26 H1), Patch-For-Review: NEWFEATURE REQUEST: Add new referral sources to pageview data - https://phabricator.wikimedia.org/T406531#11337020 (nshahquinn-wmf) >>! In T406531#113355...
[18:11:51] Data-Engineering, CheckUser-SuggestedInvestigations, ConfirmEdit (CAPTCHA extension), DBA, and 3 others: Add columns to store associated log ID or revision ID that caused a signal to match - https://phabricator.wikimedia.org/T409093#11337025 (Dreamy_Jazz) Having discussed this with @mszwarc a bit...
[18:27:17] Data-Engineering, CampaignEvents, DBA, Connection-Team (Connection-Current-Sprint), Schema-change-in-production: Apply ce_address cleanup schema changes in production (x1) - https://phabricator.wikimedia.org/T409101 (Daimona) NEW
[18:27:35] Data-Engineering, CampaignEvents, DBA, Connection-Team (Connection-Current-Sprint), Schema-change-in-production: Apply ce_address cleanup schema changes in production (x1) - https://phabricator.wikimedia.org/T409101#11337097 (Daimona)
[18:27:53] Data-Engineering, CampaignEvents, DBA, Connection-Team (Connection-Current-Sprint), Schema-change-in-production: Apply ce_address cleanup schema changes in production (x1) - https://phabricator.wikimedia.org/T409101#11337101 (Marostegui) a:Marostegui
[18:28:29] Data-Engineering, CampaignEvents, DBA, Connection-Team (Connection-Current-Sprint), Schema-change-in-production: Apply ce_address cleanup schema changes in production (x1) - https://phabricator.wikimedia.org/T409101#11337103 (Daimona) Open→Stalled Marking as blocked until wmf.1 is eve...
[18:59:13] (CR) CDanis: [C:+1] "please go ahead!" [analytics/refinery] - https://gerrit.wikimedia.org/r/1200087 (owner: CDanis)
[19:03:37] Data-Engineering (Q2 FY25/26 October 1st - December 31th), Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11337255 (Eevans) >>! In T401021#11335930,...
[19:31:28] Data-Engineering, MW-Interfaces-Team, Event-Platform: mediawiki.page_change.v1 event stream - Investigate mistmatched meta.dt and dt (and rev_dt) fields - https://phabricator.wikimedia.org/T409105 (Ottomata) NEW
[19:32:15] Data-Engineering, Event-Platform: X-Experiment-Enrollments EventGate handling reinforcement for MalformedHeaderError cases - https://phabricator.wikimedia.org/T409106 (dr0ptp4kt) NEW
[19:45:15] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Add dpogorzelski to ML and Data Platform posix groups - https://phabricator.wikimedia.org/T408579#11337419 (Dzahn) a:calbon Hello @calbon can we have one more approval over here for the ml-team-admins and analytics-privatedata part?
[19:45:39] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Add dpogorzelski to ML and Data Platform posix groups - https://phabricator.wikimedia.org/T408579#11337422 (Dzahn) Open→In progress