[00:01:25] Data-Engineering, Product-Analytics: Analyze differences between checksum-based and revert-tag based reverts in mediawiki_history - https://phabricator.wikimedia.org/T266374#11747224 (Ahoelzl)
[00:01:55] Data-Engineering, Product-Analytics: Analyze differences between checksum-based and revert-tag based reverts in mediawiki_history - https://phabricator.wikimedia.org/T266374#11747227 (Ahoelzl) p:Low→Medium
[00:20:41] Data-Engineering-Roadmap, Epic: dbt DPE work - https://phabricator.wikimedia.org/T416679#11747250 (Ahoelzl) a:JMonton-WMF→None
[00:21:28] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights (FY25-26 H2): dbt repository structure (Milestone 3) - https://phabricator.wikimedia.org/T416672#11747252 (Ahoelzl) p:Triage→High a:JMonton-WMF→amastilovic
[01:27:32] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11747319 (Ottomata) @JMonton-WMF looks like we should just set [[ https://wikimedia.sl...
[06:37:07] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=000000026&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[07:02:07] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-logging-external.
- https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=000000026&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[08:36:15] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11747828 (JMonton-WMF) I'll do the change and we'll see!
[09:01:10] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement list of JA3N-JA4H pairs to be tagged as automated into the bot detection pipeline - https://phabricator.wikimedia.org/T420412#11747912 (mforns) I started the tests on Jan 14th. I will regenerate all the pipeline from webrequest_actor_metrics_h...
[09:26:33] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to Superset for keren.ramirezWMDE - https://phabricator.wikimedia.org/T420896#11747976 (kera_wmde) @KFrancis thank you! I just signed the NDA and send it back.
[09:52:36] Data-Engineering, Wikimedia Enterprise: Access Required For DAta Engineering Airflow Instance - https://phabricator.wikimedia.org/T421214 (LDlulisa-WMF) NEW
[10:05:19] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Growth-Team, Image-Suggestions: Add an Image: filtering by suggestion "kind" or "confidence" - https://phabricator.wikimedia.org/T368987#11748100 (APizzata-WMF) Merged the [[ https://gitlab.wikimedia.org/repos/structured-data/image-suggestions-d...
[10:07:00] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216 (JMonton-WMF) NEW
[10:15:35] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11748132 (JMonton-WMF) I think we could try to reduce the parallelism inside the PyFlink and increase it in K8s. It is hard to manage low memory...
[10:48:46] Data-Engineering, DBA, GlobalBlocking, Product Safety and Integrity, Schema-change-in-production: Drop global_block_whitelist from closed wikis - https://phabricator.wikimedia.org/T420525#11748224 (Marostegui) p:Triage→Medium a:Marostegui
[10:49:02] Data-Engineering, DBA, GlobalBlocking, Product Safety and Integrity, Schema-change-in-production: Drop global_block_whitelist from closed wikis - https://phabricator.wikimedia.org/T420525#11748232 (Marostegui) @Dreamy_Jazz - we should first rename the table and let it be for a few days.
[11:04:00] Data-Engineering, DBA, GlobalBlocking, Product Safety and Integrity, Schema-change-in-production: Drop global_block_whitelist from closed wikis - https://phabricator.wikimedia.org/T420525#11748266 (Dreamy_Jazz) Sure, if that is the preference of the DBAs. No rush to drop these tables
[11:17:12] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11748310 (APizzata-WMF) forgot to run the same command on snapshot= 2...
[11:37:24] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2026-03-06 - 2026-03-27), Essential-Work, Patch-For-Review: Provide an access to MaxMind GeoIP in DSE K8S pods - https://phabricator.wikimedia.org/T405509#11748364 (BTullis) This is blocked on {T414484} - since we need to use the Va...
[11:51:52] Data-Engineering, Data-Engineering-Radar, MediaWiki-extensions-EventLogging, Patch-For-Review, and 2 others: Deprecate and remove mw.eventLog.submitClick() - https://phabricator.wikimedia.org/T415210#11748436 (phuedx) @Sfaci: I like the idea of tidying up `repos/data-engineering/metrics-platform`...
[11:52:19] Data-Engineering, Data-Engineering-Radar, MediaWiki-extensions-EventLogging, Patch-For-Review, and 2 others: Deprecate and remove mw.eventLog.submitClick() - https://phabricator.wikimedia.org/T415210#11748438 (phuedx)
[13:02:03] Data-Engineering, Wikimedia Enterprise: Access Required For DAta Engineering Airflow Instance - https://phabricator.wikimedia.org/T421214#11748716 (LDlulisa-WMF)
[13:08:13] Data-Engineering, Data-Engineering-Radar, DBA, Machine-Learning-Team, and 2 others: Drop ORES tables from wikis without ORES - https://phabricator.wikimedia.org/T420093#11748765 (Marostegui) a:Marostegui We should rename these tables first before dropping.
[13:19:52] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Dumps-Generation: Data missing from en.wiktionary.org February 2026 "MediaWiki Content File Exports" compared to "XML Database dump" - https://phabricator.wikimedia.org/T417596#11748785 (APizzata-WMF) Pipeline has been merged, will wait for monthly...
[13:25:00] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11748819 (Ottomata) > Fabian is using the Wikimedia REST API, instead of the (newer) M...
[13:32:53] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11748896 (Ottomata) > I think we could try to reduce the parallelism inside the PyFlink and increase it in K8s. FWIW, I am still sometimes skept...
[13:34:32] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Alter AQS Cassandra tables in support of video plays endpoints - https://phabricator.wikimedia.org/T420008#11748905 (Snwachukwu) Hello @Eevans. I would like to alter`local_group_default_T_mediarequest_top_files.data` ta...
[13:36:29] Data-Engineering, Data-Platform-SRE, Wikimedia Enterprise: Access Required For DAta Engineering Airflow Instance - https://phabricator.wikimedia.org/T421214#11748917 (Ottomata)
[13:37:42] Data-Engineering, Data-Platform-SRE, Wikimedia Enterprise: Access Required For DAta Engineering Airflow Instance - https://phabricator.wikimedia.org/T421214#11748928 (Ottomata) Hi! I am not up to date on our current airflow instances and admin group names, so I've tagged #Data-Platform-SRE for help....
[13:52:14] Data-Engineering, Event-Platform: `mediawiki.page_change.v1`: two schema validation errors causing events to be silently dropped by EventGate - https://phabricator.wikimedia.org/T421237 (xcollazo) NEW
[13:53:13] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Visualizing inconsistencies and reconciles via Superset - https://phabricator.wikimedia.org/T420787#11749062 (xcollazo) Minor bug on EventGate found: {T421237}.
[14:06:50] Data-Engineering, Data-Engineering-Radar, DBA, Patch-For-Review, Schema-change-in-production: Drop il_to column from imagelinks table in wmf production - https://phabricator.wikimedia.org/T419635#11749112 (Marostegui) a:FCeratto-WMF To be started after the freeze.
[14:24:25] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to data and Superset for Daria-WMDE (Daria Ammalainen (WMDE)) - https://phabricator.wikimedia.org/T420716#11749170 (Scott_French)
[14:26:17] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to data and Superset for Daria-WMDE (Daria Ammalainen (WMDE)) - https://phabricator.wikimedia.org/T420716#11749181 (Scott_French) Great, thanks @Daria-WMDE! Once @KFrancis confirms everything is all set, I believe that should be everythin...
[14:28:40] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to superset for alice.moutinho - https://phabricator.wikimedia.org/T420751#11749190 (Scott_French) @Alice.moutinho - Did you receive a new NDA link or are you still awaiting that?
[14:30:21] !log Test Kitchen mw-user experiment (poll 25312) - adds: none; removes: temp-accounts-enrollment-test; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[14:30:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:31:24] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to superset for alice.moutinho - https://phabricator.wikimedia.org/T420751#11749201 (Alice.moutinho) Hi @Scott_French , i did, and signed, this monday!
[14:43:15] Data-Engineering, Event-Platform: `mediawiki.page_change.v1`: two schema validation errors causing events to be silently dropped by EventGate - https://phabricator.wikimedia.org/T421237#11749272 (xcollazo)
[14:49:09] Data-Engineering, Event-Platform: `mediawiki.page_change.v1`: two schema validation errors causing events to be silently dropped by EventGate - https://phabricator.wikimedia.org/T421237#11749292 (xcollazo) One (redacted) example from logstash: ` {"invalid":[..., page_title":"User:OzmoOzmo/sandbox","name...
[14:56:31] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to superset for alice.moutinho - https://phabricator.wikimedia.org/T420751#11749333 (Scott_French) Great, thank you! @KFrancis - If you could confirm when the NDA is accepted / complete, that would be greatly appreciated.
[15:26:08] Data-Engineering: Request for UA compliance table to created under wmf_traffic database - https://phabricator.wikimedia.org/T421247 (KCVelaga_WMF) NEW
[15:27:02] Data-Engineering, Product-Analytics: Request for UA compliance table to created under wmf_traffic database - https://phabricator.wikimedia.org/T421247#11749495 (KCVelaga_WMF) The request has been created for documentation, the request has already been completed by @JAllemandou.
[15:27:06] Data-Engineering, Product-Analytics: Request for UA compliance table to created under wmf_traffic database - https://phabricator.wikimedia.org/T421247#11749497 (KCVelaga_WMF) Open→Resolved
[16:26:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202#11749830 (Eevans) Applied https://gerrit.wikimedia.org/r/c/generated-data-platform/aqs/media-analytics/+/1260282 to staging...
[16:33:53] !log Test Kitchen edge-unique experiments (poll 25680) - adds: none; removes: none; fields: synth-aa-test-traffic-impact-1 - xLab/MPIC/TK tips at https://w.wiki/FwuD
[16:33:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:34:13] !log Test Kitchen edge-unique experiments (poll 25681) - adds: none; removes: none; fields: synth-aa-test-traffic-impact-2, synth-aa-test-traffic-impact-3 - xLab/MPIC/TK tips at https://w.wiki/FwuD
[16:34:14] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:35:32] Data-Engineering, Event-Platform: EventBus: Unable to deliver all events: 503: Service Unavailable - https://phabricator.wikimedia.org/T421257 (xcollazo) NEW
[16:36:14] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Visualizing inconsistencies and reconciles via Superset - https://phabricator.wikimedia.org/T420787#11749895 (xcollazo) More serious EventBus error rate: {T421257}.
[16:41:10] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to superset for alice.moutinho - https://phabricator.wikimedia.org/T420751#11749923 (KFrancis) @Scott_French I'm waiting on legal counsel. I pinged him again!
[16:41:53] !log Deploying Refinery as part of weekly deployment train
[16:41:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:58:55] Data-Engineering, Event-Platform: EventBus: Unable to deliver all events: 503: Service Unavailable - https://phabricator.wikimedia.org/T421257#11749989 (Ottomata) Possibly related / cause: {T364245} ? I'm not following closely but it has something to do with lost requests on MW deploy/restarts?
[17:30:59] Data-Engineering, Data-Engineering-Radar, MediaWiki-DomainEvents, MW-1.45-release, and 3 others: Page-related DomainEvent classes with "@deprecated temporary alias, remove before 1.45 release" - https://phabricator.wikimedia.org/T417721#11750241 (Ottomata)
[17:32:22] Data-Engineering, Data-Engineering-Radar, DBA, GlobalBlocking, and 2 others: Drop global_block_whitelist from closed wikis - https://phabricator.wikimedia.org/T420525#11750253 (Ottomata)
[17:32:49] Data-Engineering, Data-Engineering-Radar, Dumps-Generation, Wikimedia Enterprise: Include more namespaces in Wiktionary HTML dumps - https://phabricator.wikimedia.org/T303652#11750256 (Ottomata)
[17:33:11] Data-Engineering, Data-Engineering-Radar, SRE, SRE-Access-Requests: Requesting access to data and Superset for Daria-WMDE (Daria Ammalainen (WMDE)) - https://phabricator.wikimedia.org/T420716#11750258 (Ottomata)
[17:33:33] Data-Engineering, Data-Engineering-Radar, SRE, SRE-Access-Requests: Requesting access to superset for alice.moutinho - https://phabricator.wikimedia.org/T420751#11750260 (Ottomata)
[17:33:42] Data-Engineering, Data-Engineering-Radar, SRE, SRE-Access-Requests: Requesting access to Superset for keren.ramirezWMDE - https://phabricator.wikimedia.org/T420896#11750262 (Ottomata)
[17:34:20] Data-Engineering, Dumps-Generation: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11750263 (Ottomata)
[17:37:56] Data-Engineering, Growth-Team, MediaWiki-extensions-WikimediaEvents, Test Kitchen, and 2 others: Could not hoist data into experiment.subject_id for event - https://phabricator.wikimedia.org/T421152#11750269 (Ottomata)
[17:37:57] Data-Engineering, Event-Platform: `mediawiki.page_change.v1`: two schema validation errors causing events to be silently dropped by EventGate -
https://phabricator.wikimedia.org/T421237#11750270 (xcollazo) Note further that there seems to be other streams with `ValidationError`s over last 30 days, but I...
[17:38:03] Data-Engineering, Data-Engineering-Radar, Growth-Team, MediaWiki-extensions-WikimediaEvents, and 3 others: Could not hoist data into experiment.subject_id for event - https://phabricator.wikimedia.org/T421152#11750272 (Ottomata)
[17:42:54] Data-Engineering, Data-Engineering-Radar, cloud-services-team, Data-Persistence, and 4 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11750287 (Ahoelzl)
[17:43:42] Data-Engineering, Data-Engineering-Radar, SRE, Traffic: Add pageview information to turnilo's webrequest_sampled_live (is_pageview is always "-") - https://phabricator.wikimedia.org/T402612#11750292 (Ottomata)
[17:44:47] Data-Engineering, Data-Platform-SRE, Dumps-Generation, Wikidata: Wikidata full .json.gz dumps not published since 20250625 - https://phabricator.wikimedia.org/T412428#11750294 (Ottomata)
[17:46:55] Data-Engineering, Dumps-Generation, Wikidata: Recent Wikidata dumps missing “All pages with complete edit history (.7z)” (job marked failed) - https://phabricator.wikimedia.org/T414526#11750305 (Ahoelzl) Please check new XML dump location / pipeline at https://dumps.wikimedia.org/other/mediawiki_cont...
[17:47:50] Data-Engineering, Dumps-Generation, Wikidata: Recent Wikidata dumps missing “All pages with complete edit history (.7z)” (job marked failed) - https://phabricator.wikimedia.org/T414526#11750307 (Ottomata) Open→Declined Declining, please use the mediawiki_content_history AKA 'dumps 2' file...
[17:51:54] Data-Engineering, Privacy Engineering: The soon-to-be-released pageview datasets should be linked from dumps page - https://phabricator.wikimedia.org/T335958#11750316 (Ottomata) cc @GGoncalves-WMF (we are grooming and putting this in backlog for now).
[18:09:09] Data-Engineering, Privacy Engineering: The differential privacy per country pageview datasets should be linked from dumps.wikmedia.org - https://phabricator.wikimedia.org/T335958#11750376 (Ottomata)
[18:29:01] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: Debug edit type pipeline for production readiness - https://phabricator.wikimedia.org/T421026#11750495 (AKhatun_WMF) Another thing done: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1260091 Make spark job size `medi...
[18:32:32] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11750522 (AKhatun_WMF)
[18:32:41] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11750523 (AKhatun_WMF)
[20:16:41] Data-Engineering: Manage druid `webrequest_sampled_live` data size - https://phabricator.wikimedia.org/T398236#11750868 (JAllemandou) Current status: The 5 hosts are full at ~75%, with almost 2Tb used from 2.75Tb each. This represents ~10Tb used. From those 10Tb, `webrequest_sampled_live` account for ~4Tb (2...
[20:25:45] (PS1) Astein: move hql to new fr_tech dir and remove fundraising dir [analytics/refinery] - https://gerrit.wikimedia.org/r/1260793
[20:48:39] Data-Engineering: Manage druid `webrequest_sampled_live` data size - https://phabricator.wikimedia.org/T398236#11751068 (CDanis) Thanks Joseph <3 If we need it in the future, I think we could reduce the wmf_netflow sampling rate (unless @ayounsi has any objections to that).
[21:30:52] (CR) Aleksandar Mastilovic: [V:+2 C:+2] "LGTM!"
[analytics/refinery] - https://gerrit.wikimedia.org/r/1260793 (owner: Astein)
[21:47:46] Data-Engineering, Dumps-Generation: Legacy XML Dumps HTML advertises availability before the rsync kicks in, resulting in temporary 404s - https://phabricator.wikimedia.org/T413767#11751631 (Ahoelzl) p:Triage→Low
[21:51:10] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights, Data-Platform-SRE (2026-03-06 - 2026-03-27), OKR-Work, Patch-For-Review: Run dbt from Airflow - https://phabricator.wikimedia.org/T410268#11751688 (amastilovic) Open→Resolved
[21:51:23] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement more fine-grained selection of DBT models in DbtSkeinOperator - https://phabricator.wikimedia.org/T419594#11751693 (amastilovic) Open→Resolved
[21:53:56] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Build a set of configurable pre-scheduled DBT Airflow DAGs executing dbt-jobs models - https://phabricator.wikimedia.org/T419925#11751703 (amastilovic)
[22:01:31] Data-Engineering: Manage druid `webrequest_sampled_live` data size - https://phabricator.wikimedia.org/T398236#11751797 (Ahoelzl) p:Triage→Low Thanks @JAllemandou . Sounds like this will be irrelevant soon.
[22:03:35] Data-Engineering, Dumps-Generation: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11751808 (Ahoelzl)
[22:03:40] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research: MediaWiki content history dataset issues - https://phabricator.wikimedia.org/T415311#11751809 (Ahoelzl)
[22:03:54] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Dumps-Generation: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11751813 (Ahoelzl)
[22:28:53] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Reader Experience Team, Test Kitchen, MW-1.46-notes (1.46.0-wmf.21; 2026-03-24): Logged in reader retention logging - https://phabricator.wikimedia.org/T420621#11751881 (Ahoelzl) Measurement plan: https://docs.google.com/spreadsheets/d/1rmsF...