[00:07:30] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11235410 (Ahoelzl) @mforns please pause the Druid backfilling for the moment.
[02:47:10] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11235551 (Ottomata) === Druid query load testing I wanted to just use `ab`, but IIRC, Druid has...
[02:54:22] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11235559 (Ottomata) === Druid query load testing results |query|concurrency|# reqs| # fails|Avg...
[03:14:18] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11235565 (Ottomata) I wanted to see if we [[ https://wikimedia.slack.com/archives/C08RA9DJS14/p17...
[03:35:45] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11235581 (Ottomata) Oh but, what if I reduce the time range we are looking at? Restricting the...
[03:41:12] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11235584 (Ottomata) Okay, ^ T406069#11235581 is good news. To me this means: - We can use a sin...
[03:59:17] Analytics-Data-Problem, Data-Engineering (Q1 FY25/26 July 1st - September 30th): Unique devices data uses non-standard domains for Wikidata, Wikifunctions, and MediaWiki.org - https://phabricator.wikimedia.org/T405533#11235615 (nshahquinn-wmf) >>! In T405533#11213571, @JAllemandou wrote: > We have change...
[05:50:09] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11235719 (mforns) @Ahoelzl OK!
[05:50:42] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11235720 (mforns)
[05:53:28] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11235721 (mforns)
[06:27:15] (CR) Joal: [C:+1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/1192577 (https://phabricator.wikimedia.org/T405783) (owner: CDanis)
[06:35:00] Data-Engineering, SRE, Traffic-Icebox, MobileFrontend (Tracking): RFC: Serve mobile and desktop variants through the same URL (unified mobile routing) - https://phabricator.wikimedia.org/T214998#11235768 (Krinkle)
[06:41:33] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11235769 (JAllemandou) I have been thinking about the 1 versus 2 solutions above, and I have more...
[07:08:33] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Event-Platform, MW-1.45-notes (1.45.0-wmf.20; 2025-09-23), Patch-For-Review: Update event-producing tools to overwrite `meta.dt` - https://phabricator.wikimedia.org/T376026#11235805 (pfischer)
[08:24:36] Analytics-Data-Problem, Data-Engineering (Q1 FY25/26 July 1st - September 30th): Unique devices data uses non-standard domains for Wikidata, Wikifunctions, and MediaWiki.org - https://phabricator.wikimedia.org/T405533#11235980 (JAllemandou) >>! In T405533#11235615, @nshahquinn-wmf wrote: >>>! In T405533#...
[08:25:40] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Mitigate consequences of Gobblin hiccups generating late events and alerts - https://phabricator.wikimedia.org/T402324#11235983 (JAllemandou) a:JAllemandou
[08:29:07] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Mitigate consequences of Gobblin hiccups generating late events and alerts - https://phabricator.wikimedia.org/T402324#11235987 (JAllemandou) I think we should go for option 1, it allows the delay to be based on data rather than calendar time, making i...
[09:52:39] Data-Engineering, Java-Scala-Standardization, Discovery-Search (2025.09.26 - 2025.10.17), Essential-Work: Create Gitlab CI templates for JVM packages - https://phabricator.wikimedia.org/T386406#11236168 (pfischer)
[09:53:56] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Discovery-Search (2025.09.26 - 2025.10.17), Event-Platform, MW-1.45-notes (1.45.0-wmf.20; 2025-09-23), Patch-For-Review: Update event-producing tools to overwrite `meta.dt` - https://phabricator.wikimedia.org/T376026#11236184 (pfischer)
[09:54:51] Data-Engineering, Data-Platform-SRE, Java-Scala-Standardization, Discovery-Search (2025.09.26 - 2025.10.17), and 2 others: Migrate existing Java packages to deploying to Gitlab, including new version of parent pom, validation that all dependencies ... - https://phabricator.wikimedia.org/T367405#11236198
[09:55:20] Data-Engineering, CirrusSearch, Data-Platform-SRE, DPE-Mediawiki-Content, and 3 others: Source the CirrusSearch index dumps from hadoop instead of a MW maintenance script - https://phabricator.wikimedia.org/T366248#11236205 (pfischer)
[09:55:45] Data-Engineering, Data-Engineering-Radar, CirrusSearch, Discovery-Search (2025.09.26 - 2025.10.17), and 2 others: Eventutilities Flink: port SerDe tests from SUP - https://phabricator.wikimedia.org/T404597#11236217 (pfischer)
[10:09:45] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Drop rc_new from recentchanges table in wmf production - https://phabricator.wikimedia.org/T402763#11236290 (Ladsgroup) The index in those hosts has a different name: ` KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`...
[10:29:14] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Drop rc_new from recentchanges table in wmf production - https://phabricator.wikimedia.org/T402763#11236401 (Ladsgroup)
[10:29:21] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Drop rc_new from recentchanges table in wmf production - https://phabricator.wikimedia.org/T402763#11236402 (Ladsgroup) Open→Resolved
[14:30:24] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237136 (Ottomata) Discussed with Druid a bit. I didn't like Option 1 because I don't want to d...
[14:36:44] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237165 (Ottomata) @JAllemandou @Milimetric should we consider Option 3? === 3. Lambda-ize only...
[14:46:05] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237174 (Milimetric) hm, I'm not sure I see why this is easier than Option 2 But it does make m...
[14:55:01] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237227 (Ottomata) > hm, I'm not sure I see why this is easier than Option 2 Option 3 is the sa...
[15:02:13] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Mitigate consequences of Gobblin hiccups generating late events and alerts - https://phabricator.wikimedia.org/T402324#11237257 (Mayakp.wiki) speaking for our needs, I think that's totally fine! :)
[15:09:13] joal: dumb q, when/how often do we do a refinery release?
[15:09:58] cdanis: we used to do it weekly with a train. Lately it's less regular, but happens at least once a month I'd say
[15:10:32] ah okay, as-is my Varnish patch will 'break' the existing wmfuniq x-analytics field, not sure if that's a concern for alerts/pipeline health on your side or anything
[15:11:20] but I was planning on just proceeding and waiting for refinery to catch up
[15:15:35] cdanis: It shouldn't break anything on our side I think. The x_analytics is parsed as a map, and only the key will change - no more. I don't think we (DE) have downstream jobs using this field as of today. Possibly other teams' jobs?
[15:15:42] Ok to proceed on my side
[15:15:44] I don't think so
[15:15:49] afaik it's only referenced in Turnilo
[15:15:52] thanks!
[15:26:44] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237369 (JAllemandou) I'd be ok with Option 3, as it makes a long time we didn't have to rollbac...
[16:22:05] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11237652 (mforns)
[16:22:45] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11237653 (mforns)
[16:38:50] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237731 (Ottomata) I think I prefer Option 3. Rollback would be just as (maybe more?) complicat...
[16:41:00] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237736 (Ottomata) The easy path to Option 3 is to do Option 2 now, then migrate usages of month...
[16:44:52] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11237766 (mforns)
[16:51:57] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for tais-lessa - https://phabricator.wikimedia.org/T405129#11237799 (cmadeo) Apologies for the delay! As @TLessa-WMF's manager I am happy to approve. Thank you and very sorry for holding...
[16:52:44] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237801 (JAllemandou) The only concern I have is data duplication (possibly not optimal performa...
[16:55:35] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for tais-lessa - https://phabricator.wikimedia.org/T405129#11237824 (Dzahn) Stalled→In progress
[16:55:59] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for tais-lessa - https://phabricator.wikimedia.org/T405129#11237825 (Dzahn)
[17:02:24] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for tais-lessa - https://phabricator.wikimedia.org/T405129#11237852 (Dzahn) Thank you @cmadeo @TLessa-WMF you have been added to the requested group just now. You should be able to see...
[17:03:09] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to analytics-privatedata-users for tais-lessa - https://phabricator.wikimedia.org/T405129#11237854 (Dzahn) In progress→Resolved a:cmadeo→None
[17:12:37] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11237903 (mforns) **Backfilling plan updated** - Pause all loading to Cassandra and do not start any loading to Druid until further notice. -...
[17:14:24] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237904 (Ottomata) Hm, yeah. I think going full option 3 is not that hard once we have Option 2,...
[17:16:17] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=codfw%2Bprometheus/k8s&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[17:21:17] FIRING: [3x] EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-analytics in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[17:21:23] Data-Engineering, Data-Persistence, MediaWiki-Core-Revision-backend, Wikidata, and 4 others: Rethink rev_sha1 field - https://phabricator.wikimedia.org/T389026#11237922 (Ladsgroup) It would be nice to have an entry in tech news along the lines of: > The field rev_sha1 in revision table is being r...
[17:35:09] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Backfill datasets affected by automated traffic detection issues - https://phabricator.wikimedia.org/T405667#11237969 (mforns)
[17:39:32] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11237972 (JAllemandou) Works for me :)
[17:45:16] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Observability-Alerting, Event-Platform: EventgateProduceRateStop / EventGateProduceRate alert should be active datacenter aware - https://phabricator.wikimedia.org/T405952#11237987 (dr0ptp4kt)
[17:46:14] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Observability-Alerting, Event-Platform: EventgateProduceRateStop / EventGateProduceRateAnomaly alert should be active datacenter aware - https://phabricator.wikimedia.org/T405952#11237989 (dr0ptp4kt)
[17:50:58] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Growth-Team, MediaWiki-Page-derived-data, Wikipedia-Android-App-Backlog, and 2 others: WE3.3.7 Year in Review and Activity Tab Services - Global Editor Metrics - https://phabricator.wikimedia.org/T403660#11237995 (Ottomata)
[17:57:44] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Growth-Team, MediaWiki-Page-derived-data, Wikipedia-Android-App-Backlog, and 2 others: WE3.3.7 Year in Review and Activity Tab Services - Global Editor Metrics - https://phabricator.wikimedia.org/T403660#11238004 (Ottomata) Next product que...
[18:01:17] FIRING: [3x] EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-analytics in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[18:06:17] RESOLVED: [3x] EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-analytics in codfw. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[18:08:52] Data-Engineering, Moderator-Tools-Team, PageTriage, Schema-change, Spike: refactor code to not use pagetriage_tags table. hard code it in PHP instead. - https://phabricator.wikimedia.org/T406177#11238056 (jsn.sherman)
[18:09:58] Data-Engineering, Moderator-Tools-Team, PageTriage, Schema-change, Spike: [SPIKE] investigate best alternative to pagetriage_tags table. - https://phabricator.wikimedia.org/T406177#11238063 (jsn.sherman)
[18:13:54] Data-Engineering (Q1 FY25/26 July 1st - September 30th), Observability-Alerting, Event-Platform: EventgateProduceRateStop / EventGateProduceRateAnomaly alert should be active datacenter aware - https://phabricator.wikimedia.org/T405952#11238089 (dr0ptp4kt)
[18:17:56] Data-Engineering (Q1 FY25/26 July 1st - September 30th): Mitigate consequences of Gobblin hiccups generating late events and alerts - https://phabricator.wikimedia.org/T402324#11238142 (nshahquinn-wmf) >>! In T402324#11237257, @Mayakp.wiki wrote: > speaking for our needs, I think that's totally fine! :) Ag...
[18:56:31] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11238368 (Ottomata) Ah but @JAllemandou what do we do about the digests? I guess for Option 3, we...
[18:59:13] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11238374 (Ottomata)
[19:04:02] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: Global Editor Metrics - Druid mediawiki_history_reduced changes - https://phabricator.wikimedia.org/T406069#11238390 (Ottomata)
[19:06:45] Data-Engineering, Data Pipelines: Add user_central_id to mediawiki_history and mediawiki_history_reduced Hive tables - https://phabricator.wikimedia.org/T365648#11238397 (Ottomata)
[19:08:21] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: mediawiki_history_reduced - add page_id and user_central_id fields - https://phabricator.wikimedia.org/T406263 (Ottomata) NEW
[19:32:01] Data-Engineering (Q1 FY25/26 July 1st - September 30th), DPE-Mediawiki-Content, Patch-For-Review: Investigate reasons for remaining inconsistencies - https://phabricator.wikimedia.org/T385112#11238467 (xcollazo) The visibility bug identified on T385112#11207669 is indeed fixed. Note how we no longer...
[19:56:09] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: mediawiki_history_reduced - add page_id and user_central_id fields - https://phabricator.wikimedia.org/T406263#11238535 (Ottomata)
[19:56:15] Data-Engineering (Q1 FY25/26 July 1st - September 30th), MediaWiki-Page-derived-data, OKR-Work: mediawiki_history_reduced - add page_id and user_central_id fields - https://phabricator.wikimedia.org/T406263#11238536 (Ottomata) a:amastilovic
[20:27:53] Analytics-Data-Problem, Data-Engineering (Q1 FY25/26 July 1st - September 30th): Unique devices data uses non-standard domains for Wikidata, Wikifunctions, and MediaWiki.org - https://phabricator.wikimedia.org/T405533#11238664 (nshahquinn-wmf) >>! In T405533#11235980, @JAllemandou wrote: > I'm very sorry...