[07:05:14] Data-Engineering, Event-Platform: [Event Platform] eventutilites-python: improve consistency guarantees of async process functions - https://phabricator.wikimedia.org/T347282#10858139 (dcausse) Here's my understanding of the possible solutions: * use the keyed state: will probably have a huge impact on t...
[09:11:20] (PS5) Giuseppe Lavagetto: Allow loading an external regex.yaml file for ua-parser [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1149334 (https://phabricator.wikimedia.org/T394794)
[13:27:35] Data-Engineering (Q4 2025 April 1st - June 30th), Patch-For-Review: Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10859370 (brouberol) @mforns I didn't report back, but you should be gtg by the way!
[13:55:28] Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform: mediawiki.page_change.v1 should not contain events for undelete into existing pages. - https://phabricator.wikimedia.org/T395327 (gmodena) NEW
[14:00:36] Data-Engineering, Event-Platform: [Event Platform] eventutilites-python: improve consistency guarantees of async process functions - https://phabricator.wikimedia.org/T347282#10859528 (gmodena) > use the keyed state: will probably have a huge impact on throughput & latency, a new batch will be created pe...
[14:07:53] Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform: [Event Platform] eventutilites-python: improve consistency guarantees of async process functions - https://phabricator.wikimedia.org/T347282#10859554 (gmodena)
[14:16:57] Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform: mediawiki.page_change.v1 should not contain events for undelete into existing pages. - https://phabricator.wikimedia.org/T395327#10859570 (gmodena) a:gmodena
[14:23:38] Data-Engineering (Q4 2025 April 1st - June 30th), DPE-Mediawiki-Content: MediaWiki Content History alerts too much for minor reconcile issues - https://phabricator.wikimedia.org/T395139#10859586 (xcollazo) a:xcollazo
[14:26:51] Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform: [Event Platform] eventutilites-python: improve consistency guarantees of async process functions - https://phabricator.wikimedia.org/T347282#10859602 (dcausse) >>! In T347282#10859528, @gmodena wrote: > In this case, would you suggest mak...
[14:32:10] Data-Engineering, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Update gb_id to be unsigned in the globalblocks table on WMF production - https://phabricator.wikimedia.org/T395333 (Dreamy_Jazz) NEW
[14:32:44] Data-Engineering, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Update gb_id to be unsigned in the globalblocks table on WMF production - https://phabricator.wikimedia.org/T395333#10859629 (Marostegui) a:Dreamy_Jazz→Marostegui
[14:35:41] Data-Engineering, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Update gb_id to be unsigned in the globalblocks table on WMF production - https://phabricator.wikimedia.org/T395333#10859645 (Marostegui) Open→Resolved Done on the master with replication ` c...
[14:36:49] Data-Engineering, DBA, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Update gb_id to be unsigned in the globalblocks table on WMF production - https://phabricator.wikimedia.org/T395333#10859651 (Marostegui)
[14:37:50] Data-Engineering, DBA, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Update gb_id to be unsigned in the globalblocks table on WMF production - https://phabricator.wikimedia.org/T395333#10859654 (Dreamy_Jazz) Thanks for the quick deployment!
[14:42:15] Data-Engineering, DBA, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Make gbw_id in global_block_whitelist table unsigned on WMF wikis - https://phabricator.wikimedia.org/T395335 (Dreamy_Jazz) NEW
[15:09:05] Data-Engineering, DBA, GlobalBlocking, Trust and Safety Product Team, Schema-change-in-production: Make gbw_id in global_block_whitelist table unsigned on WMF wikis - https://phabricator.wikimedia.org/T395335#10859801 (Marostegui) p:Triage→Medium a:Marostegui
[15:26:24] Data-Engineering, Data-Platform-SRE, Java-Scala-Standardization, Discovery-Search (2025.05.24 - 2025.06.13), Patch-For-Review: Migrate existing Java packages to deploying to Gitlab, including new version of parent pom, validation that all depen... - https://phabricator.wikimedia.org/T367405#10859853
[16:41:25] Data-Engineering, LDAP-Access-Requests, SRE, Patch-For-Review: Grant Access to Product's Superset & Turnilo for SKivlehan - https://phabricator.wikimedia.org/T393626#10860346 (spatton) Approved! Thanks for the reminder, @MoritzMuehlenhoff :)
[16:42:20] Analytics-Data-Problem, Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform, MediaWiki-Platform-Team (Radar), SUL3: Some events in mediawiki.page_change.v1 refers to auth.wikimedia.org in meta.uri and meta.domain - https://phabricator.wikimedia.org/T388825#10860347 (nshahquinn-wmf)
[16:50:44] Data-Engineering: Airflow job to load Knowledge Gap metrics into Cassandra - https://phabricator.wikimedia.org/T337060#10860420 (Aklapper) @Milimetric: Is this supposed to stay open and should get unassigned, or is this resolved? Asking as this has been assigned for two years to a now inactive assignee.
[17:04:59] Data-Engineering (Q4 2025 April 1st - June 30th), Dumps-Generation, Essential-Work: clouddumps1001 (and hence /public/dumps) missing latest dumps - https://phabricator.wikimedia.org/T395174#10860495 (xcollazo)
[17:28:08] Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform: [Event Platform] eventutilites-python: improve consistency guarantees of async process functions - https://phabricator.wikimedia.org/T347282#10860676 (gmodena) >>! In T347282#10859602, @dcausse wrote: >>>! In T347282#10859528, @gmodena wro...
[17:29:19] Data-Engineering: Airflow job to load Knowledge Gap metrics into Cassandra - https://phabricator.wikimedia.org/T337060#10860688 (Milimetric) a:nickifeajika→None This has not been resolved, the work was completed but is now irrelevant due to system migration. What the task was hoping to accomplish ha...
[17:42:20] Data-Engineering (Q4 2025 April 1st - June 30th): Determine how many admins are there in English Wikipedia and French Wikipedia - https://phabricator.wikimedia.org/T395279#10860769 (Ahoelzl)
[17:43:24] Data-Engineering (Q4 2025 April 1st - June 30th), FR-Tech-Analytics, FR-tech-data-integrity, fundraising-tech-ops: Low volume in new webrequest feed - https://phabricator.wikimedia.org/T395089#10860771 (Ahoelzl)
[17:43:32] Data-Engineering (Q4 2025 April 1st - June 30th), FR-Tech-Analytics, FR-tech-data-integrity, fundraising-tech-ops: Low volume in new webrequest feed - https://phabricator.wikimedia.org/T395089#10860772 (Ahoelzl) p:Triage→High
[17:49:40] Data-Engineering, ChangeProp, Observability-Tracing, Patch-For-Review: Implement tracing across changeprop-jobqueue - https://phabricator.wikimedia.org/T395038#10860811 (Ahoelzl) @gmodena can you help assess the effort and impact?
[17:50:48] Data-Engineering, ChangeProp, Observability-Tracing, Patch-For-Review: Implement tracing across changeprop-jobqueue - https://phabricator.wikimedia.org/T395038#10860827 (Milimetric) @mszabo subscribing to stay in the loop as well, this work is very relevant to our attempts at SLOs for the new exp...
[17:52:52] Data-Engineering, Data-Services: Create a view for existencelinks table - https://phabricator.wikimedia.org/T394898#10860849 (Milimetric) @fnegri just to help with prioritization, when do you need us to sign off on this?
[17:53:30] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2025.05.24 - 2025.06.13), Patch-For-Review: Fix the hard dependency between the Airflow scheduler and the DataHub GMS service - https://phabricator.wikimedia.org/T395106#10860853 (Milimetric)
[17:57:14] Data-Engineering, Movement-Insights: event.editattemptstep is not logging some revisions that appear in mediawiki_history - https://phabricator.wikimedia.org/T394961#10860879 (Ahoelzl) @gmodena can you investigate?
[17:57:22] Data-Engineering (Q4 2025 April 1st - June 30th), Movement-Insights: event.editattemptstep is not logging some revisions that appear in mediawiki_history - https://phabricator.wikimedia.org/T394961#10860881 (Ahoelzl)
[17:58:22] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users for Neslihan_Turan_WMDE - https://phabricator.wikimedia.org/T394395#10860887 (Milimetric) approved from our side (data engineering as stewards of the data)
[17:58:38] Data-Engineering, Data-Engineering-Radar, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users for Neslihan_Turan_WMDE - https://phabricator.wikimedia.org/T394395#10860889 (Milimetric)
[17:59:53] Data-Engineering, LDAP-Access-Requests, SRE, Patch-For-Review: Grant Access to Product's Superset & Turnilo for SKivlehan - https://phabricator.wikimedia.org/T393626#10860892 (Milimetric) (I don't think we need to action this further, but I may be forgetting some steps, do ping us if so)
[18:39:34] Data-Engineering (Q4 2025 April 1st - June 30th): Determine how many admins are there in English Wikipedia and French Wikipedia from sub-Saharan Africa - https://phabricator.wikimedia.org/T395279#10861033 (Asaf)
[19:08:20] Data-Engineering, Documentation: [Documentation] Update and synchronize Data Platform/Engineering Contact Us and Intake Process docs - https://phabricator.wikimedia.org/T364572#10861155 (Aklapper) a:odimitrijevic→None @ahoelzl: Removing inactive assignee. Please do so as part of team offboarding...
[19:47:35] Data-Engineering (Q4 2025 April 1st - June 30th), FR-Tech-Analytics, FR-tech-data-integrity, fundraising-tech-ops: Low volume in new webrequest feed - https://phabricator.wikimedia.org/T395089#10861217 (Jgreen) Rerunning this using the full broker list, so we get all the partitions.
[20:57:59] Data-Engineering (Q4 2025 April 1st - June 30th), Data-Platform-SRE (2025.05.24 - 2025.06.13), Patch-For-Review: Remove `analytics` instance folder in airflow repo - https://phabricator.wikimedia.org/T394015#10861404 (amastilovic) I think this ticket should be closed, as the correlated MR has been me...
[21:31:57] Analytics-Data-Problem, Data-Engineering (Q4 2025 April 1st - June 30th), Event-Platform, MediaWiki-Platform-Team (Radar), SUL3: Some events in mediawiki.page_change.v1 refers to auth.wikimedia.org in meta.uri and meta.domain - https://phabricator.wikimedia.org/T388825#10861480 (mpopov) I'll...
[22:14:33] Data-Engineering (Q4 2025 April 1st - June 30th), Dumps-Generation, Essential-Work: clouddumps1001 (and hence /public/dumps) missing latest dumps - https://phabricator.wikimedia.org/T395174#10861560 (Ahoelzl) p:Triage→High a:xcollazo
[22:16:47] Data-Engineering (Q4 2025 April 1st - June 30th): 2025-04-01 run of mediawiki_wikitext_history is stuck (20d running) - https://phabricator.wikimedia.org/T394954#10861571 (Ahoelzl) The pipeline meanwhile failed: https://airflow.wikimedia.org/dags/mediawiki_wikitext_history/grid?search=mediawiki_wikitext_hist...
[22:16:58] Data-Engineering (Q4 2025 April 1st - June 30th): 2025-04-01 run of mediawiki_wikitext_history is stuck (20d running) - https://phabricator.wikimedia.org/T394954#10861572 (Ahoelzl) p:Triage→Medium a:JAllemandou
[22:21:05] Data-Engineering (Q4 2025 April 1st - June 30th), Essential-Work: 8 new wikis missing from mediawiki_history - https://phabricator.wikimedia.org/T368788#10861592 (Ahoelzl) a:mforns→JAllemandou
[22:21:37] Data-Engineering (Q4 2025 April 1st - June 30th), Essential-Work: 8 new wikis missing from mediawiki_history - https://phabricator.wikimedia.org/T368788#10861595 (Ahoelzl) Open→Resolved Resolved.
[22:46:59] Data-Engineering, Essential-Work: Support for 4.3.11 - webrequest based scraping detection - https://phabricator.wikimedia.org/T388721#10861646 (Ahoelzl)
[23:08:47] Data-Engineering (Q4 2025 April 1st - June 30th): Analytics Cluster Dataset Usage Discovery Task - https://phabricator.wikimedia.org/T389903#10861682 (Ahoelzl) Initial extraction results based on analytics dag instances: https://docs.google.com/spreadsheets/d/16jOOv1niXO4x2vxiG1GmfG-tFsfZfMUXnNB64-GPCfM/edi...
[23:35:06] Data-Engineering (Q4 2025 April 1st - June 30th), Essential-Work, Patch-For-Review: [Data Quality] Implement wiki completeness check for MediaWiki History - https://phabricator.wikimedia.org/T365203#10861718 (Ahoelzl) a:Snwachukwu→mforns
[23:59:32] Data-Engineering (Q4 2025 April 1st - June 30th), Experimentation Lab: NEW/CHANGE FEATURE REQUEST: make available the centralauth.globaluser table in Data Lake - https://phabricator.wikimedia.org/T389666#10861743 (Ahoelzl) p:Triage→Medium