[04:16:49] Data-Engineering, Data-Engineering-Icebox, Data-Engineering-Wikistats, PageViewInfo, and 2 others: Pageviews Analysis 3.0 (Vue + Codex) - https://phabricator.wikimedia.org/T378549#12052453 (MusikAnimal)
[08:04:11] Data-Engineering: Setup and populate initial version of user_agents_info table - https://phabricator.wikimedia.org/T430020#12052912 (Pablo) Attached is a JSON file containing the metadata (including `category`) for all bots listed in the [[ https://radar.cloudflare.com/bots/directory | Cloudflare Radar Bots...
[08:06:08] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-06-05 - 2026-06-26), Patch-For-Review: Presto cluster improvements for concurrency and workload - https://phabricator.wikimedia.org/T424112#12052917 (BTullis) OK, this is now working on the test cluster. We can see that re...
[08:14:25] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: Right-size Spark resource config using History Server data - https://phabricator.wikimedia.org/T428966#12052956 (APizzata-WMF) Before this discussion >By discussing with @JAllemandou we decided to test: I had already...
[08:17:00] Data-Engineering: Setup and populate initial version of user_agents_info table - https://phabricator.wikimedia.org/T430020#12052957 (mforns) Exciting work! My thoughts: - Are we going to store only bot-like User-Agents or all of them? Maybe, since we can not extract contact/agent/policy information from them...
[08:17:02] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-06-05 - 2026-06-26), Patch-For-Review: Presto cluster improvements for concurrency and workload - https://phabricator.wikimedia.org/T424112#12052958 (BTullis) It's not a great start. ` btullis@an-test-client1002:~$ presto --...
[08:38:42] Data-Engineering: Setup and populate initial version of user_agents_info table - https://phabricator.wikimedia.org/T430020#12053033 (GGoncalves-WMF) > Are we going to store only bot-like User-Agents or all of them? Maybe, since we can not extract contact/agent/policy information from them and they are also n...
[09:05:50] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-06-05 - 2026-06-26), Patch-For-Review: Presto cluster improvements for concurrency and workload - https://phabricator.wikimedia.org/T424112#12053134 (BTullis) Much better. ` presto> SHOW SCHEMAS; Schema...
[09:30:26] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-06-05 - 2026-06-26), Patch-For-Review: Presto cluster improvements for concurrency and workload - https://phabricator.wikimedia.org/T424112#12053199 (BTullis) The resource groups also look fine, when I run tests with the `pr...
[10:27:33] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Relative Trending - Milestone 3 - Page Trending Flink app - https://phabricator.wikimedia.org/T430134 (JMonton-WMF) NEW
[10:28:00] Analytics-Radar, Data-Engineering, Data-Platform-SRE, serviceops-radar, and 2 others: Configuration Management for Kafka settings - https://phabricator.wikimedia.org/T276088#12053445 (elukey) @RKemper I added you to the `kafka-infrastructure` cloud project, you should see it in Horizon! At this...
[10:30:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-06-05 - 2026-06-26), Patch-For-Review: Presto cluster improvements for concurrency and workload - https://phabricator.wikimedia.org/T424112#12053461 (BTullis) This is how we currently configure the growthbook connection to P...
[10:33:40] (PS1) Joal: Update MWH - fail if duplicated revisions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305625 (https://phabricator.wikimedia.org/T425734)
[10:33:59] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Relative Trending - Milestone 3 - Page Trending Flink app - https://phabricator.wikimedia.org/T430134#12053482 (JMonton-WMF)
[10:40:41] (PS2) Joal: Update MWH - fail if duplicated revisions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305625 (https://phabricator.wikimedia.org/T425734)
[10:41:04] (CR) A-pizzata: [C:+2] "LGTM!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305625 (https://phabricator.wikimedia.org/T425734) (owner: Joal)
[10:48:03] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Product-Analytics: dbt-jobs backfill: PP3 API hourly and known clients aggregate jobs - https://phabricator.wikimedia.org/T429341#12053524 (amastilovic) Status update: mrt_api_requests_hourly: [x] 2026-02-01 to to 2026-03-31 (Feb and Mar can be done...
[10:48:08] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Relative Trending - Milestone 3 - K8s recoures - https://phabricator.wikimedia.org/T430136 (JMonton-WMF) NEW
[10:54:40] (Merged) jenkins-bot: Update MWH - fail if duplicated revisions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305625 (https://phabricator.wikimedia.org/T425734) (owner: Joal)
[11:09:02] (PS1) Joal: Update Inc-MWH splitting SQL into smaller files [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305633 (https://phabricator.wikimedia.org/T425734)
[11:09:26] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Relative Trending - Milestone 3 - K8s recoures - https://phabricator.wikimedia.org/T430136#12053594 (JMonton-WMF)
[11:17:40] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Relative Trending - Milestone 3 - K8s recoures - https://phabricator.wikimedia.org/T430136#12053664 (JMonton-WMF)
[11:21:01] (CR) CI reject: [V:-1] Update Inc-MWH splitting SQL into smaller files [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305633 (https://phabricator.wikimedia.org/T425734) (owner: Joal)
[12:29:29] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2026-06-05 - 2026-06-26), Event-Platform: Delete some unused development topics on Kafka Jumbo - https://phabricator.wikimedia.org/T427951#12053939 (Ottomata) > JobManager auto-recovers it on startup regardless of upgradeMode Huh!...
[12:33:33] (PS2) Joal: Update Inc-MWH splitting SQL into smaller files [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305633 (https://phabricator.wikimedia.org/T425734)
[12:37:05] (PS3) Joal: Update Inc-MWH splitting SQL into smaller files [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305633 (https://phabricator.wikimedia.org/T425734)
[12:44:29] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Machine-Learning-Team (Q4 FY2025-26), Patch-For-Review: `mediawiki.page_change.v1`: negative namespace_id schema validation errors - https://phabricator.wikimedia.org/T421237#12054002 (Ottomata)
[12:47:15] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Machine-Learning-Team (Q4 FY2025-26), Patch-For-Review: `mediawiki.page_change.v1`: negative namespace_id schema validation errors - https://phabricator.wikimedia.org/T421237#12054015 (Ottomata)
[12:52:36] Data-Engineering, Event-Platform: eventgate-analytics - allow refetching of stream config - https://phabricator.wikimedia.org/T430154 (Ottomata) NEW
[12:52:48] Data-Engineering, Event-Platform: eventgate-analytics - allow refetching of stream config - https://phabricator.wikimedia.org/T430154#12054035 (Ottomata) p:Triage→Medium
[13:07:19] (PS1) A-pizzata: Distribute and sort snapshot rows on write [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305674 (https://phabricator.wikimedia.org/T428966)
[13:18:39] Data-Engineering, Data-Engineering-Icebox, DBA: Move Mostcategories computation to Hadoop - https://phabricator.wikimedia.org/T413362#12054164 (Zabe) Open→Resolved
[13:20:31] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: mediawiki.page_html_content_change.v1 stream content_uri field uses localhost instead of wiki hostname - https://phabricator.wikimedia.org/T427598#12054169 (Ottomata) Deployed and verified: ` kafkacat -u -C -b ka...
[13:26:50] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Test Kitchen, Essential-Work: Implement/enforce 90 day data retention policy in derived Iceberg tables - https://phabricator.wikimedia.org/T429548#12054193 (AKhatun_WMF)
[13:28:29] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Test Kitchen, Essential-Work: Implement/enforce 90 day data retention policy in derived Iceberg tables - https://phabricator.wikimedia.org/T429548#12054199 (AKhatun_WMF) The data older than 90 days are now gone ` akhatun@stat1008:~$ hdfs dfs -ls /w...
[13:29:21] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Ingest wmf_mediawiki tables to datahub - https://phabricator.wikimedia.org/T429931#12054206 (Ottomata)
[13:30:16] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Relative Trending - Milestone 3 - K8s recoures - https://phabricator.wikimedia.org/T430136#12054219 (JMonton-WMF)
[14:34:27] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform, Patch-For-Review: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176#12054484 (Ottomata) Hm. For correctness: what if we only set `prior_st...
[14:40:43] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Right-size Spark resource config using History Server data - https://phabricator.wikimedia.org/T428966#12054513 (APizzata-WMF) Tried the configuration here described: >>! In T428966#12051132, @...
[14:42:54] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Quality verification for mediawiki_history_incremental_v1 using Iceberg time travel - https://phabricator.wikimedia.org/T425734#12054528 (APizzata-WMF) With @JAllemandou today we discussed that:...
[14:43:30] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Product-Analytics: dbt-jobs backfill: PP3 API hourly and known clients aggregate jobs - https://phabricator.wikimedia.org/T429341#12054533 (Ahoelzl) @amastilovic this should complete by ~Monday?
[14:46:36] (CR) Joal: "2 nits :)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305674 (https://phabricator.wikimedia.org/T428966) (owner: A-pizzata)
[15:06:49] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform, Patch-For-Review: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176#12054712 (Ottomata) Nah. prior_state has a similar awkwardness. The v...
[15:15:44] Data-Engineering, MediaWiki-extensions-EventLogging, Epic, Essential-Work, and 2 others: [EPIC] Deprecate and remove mw.eventLog.newInstrument() - https://phabricator.wikimedia.org/T408091#12054776 (KReid-WMF)
[15:29:39] (CR) A-pizzata: [C:+1] "LGTM!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1305633 (https://phabricator.wikimedia.org/T425734) (owner: Joal)
[15:44:06] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Relative Trending - Milestone 3 - K8s recoures - https://phabricator.wikimedia.org/T430136#12054982 (JMonton-WMF)
[16:20:34] Data-Engineering, DPE-MediaWiki-Incremental-History, MW-Interfaces-Team: MediaWiki DomainEvents - Include LogEntry - https://phabricator.wikimedia.org/T427815#12055286 (JMoore-WMF) @Ottomata would you like me to set up some discussions with the mw team so we can kick off collaboration?
[16:20:41] Data-Engineering, DPE-MediaWiki-Incremental-History, MW-Interfaces-Team, Event-Platform: MediaWiki DomainEvents - Create new User related DomainEvents - https://phabricator.wikimedia.org/T427817#12055290 (JMoore-WMF) @Ottomata would you like me to set up some discussions with the mw team so we ca...
[16:32:26] Data-Engineering, DPE-MediaWiki-Incremental-History, MW-Interfaces-Team: MediaWiki DomainEvents - Include LogEntry - https://phabricator.wikimedia.org/T427815#12055354 (Ottomata) > @Ottomata did you intent to move this one to Phase II?! Yes! And I did.
[16:32:44] Data-Engineering, DPE-MediaWiki-Incremental-History, MW-Interfaces-Team: MediaWiki DomainEvents - Include LogEntry - https://phabricator.wikimedia.org/T427815#12055356 (Ottomata) > @Ottomata would you like me to set up some discussions with the mw team so we can kick off collaboration? Yes please th...
[16:34:29] Data-Engineering-Roadmap, Epic: [Epic] KAPOW: The next generation of bot detection in the Data Platform. - https://phabricator.wikimedia.org/T425661#12055365 (JMoore-WMF) which user group do i need access to in order to see the restriced tasks?
[16:35:36] Data-Engineering, MediaWiki-extensions-EventLogging, Epic, Essential-Work, and 2 others: [EPIC] Deprecate and remove mw.eventLog.newInstrument() - https://phabricator.wikimedia.org/T408091#12055378 (KReid-WMF) p:Triage→Medium
[17:32:55] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MW-Interfaces-Team, Event-Platform: EventBus JobQueue: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#12055798 (Ottomata)
[17:39:12] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MW-Interfaces-Team, Event-Platform: EventBus JobQueue: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#12055838 (Ottomata) This does not break WMF production because WMF does not use EventBus `R...
[17:39:50] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MW-Interfaces-Team, Event-Platform: EventBus JobQueue: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#12055842 (Ottomata) CC @tchin since we have been considering merging base EventGate eventga...
[17:50:52] (CR) Snwachukwu: [C:+2] Add globalimagelinks table to sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1295064 (https://phabricator.wikimedia.org/T427532) (owner: Dr0ptp4kt)
[17:58:22] (CR) Snwachukwu: [C:+2] Add filerevision table to sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1295069 (https://phabricator.wikimedia.org/T427532) (owner: Dr0ptp4kt)
[17:58:32] (CR) Snwachukwu: [V:+2 C:+2] Add filerevision table to sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1295069 (https://phabricator.wikimedia.org/T427532) (owner: Dr0ptp4kt)
[17:58:45] (CR) Snwachukwu: [V:+2 C:+2] Add globalimagelinks table to sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1295064 (https://phabricator.wikimedia.org/T427532) (owner: Dr0ptp4kt)
[18:10:17] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MW-Interfaces-Team, Event-Platform: EventBus JobQueue: Invalid mediawiki signature error caused by meta.dt field - https://phabricator.wikimedia.org/T418573#12056020 (Ottomata) I see. Lots of context in {T175146}. TIL that the `mediawiki_signature...
[20:33:25] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Operationalize dbt jobs: all moderator models - https://phabricator.wikimedia.org/T429997#12056500 (CMyrick-WMF)
[21:59:33] Data-Engineering, Java-Scala-Standardization, Essential-Work: Ignore MacOS .DS_Store in parent pom - https://phabricator.wikimedia.org/T407514#12056707 (TheDJ) Open→Resolved assuming yes.