[02:10:47] Data-Engineering, Data-Engineering-Radar, MediaWiki-DomainEvents, MW-1.45-release, and 3 others: Page-related DomainEvent classes with "@deprecated temporary alias, remove before 1.45 release" - https://phabricator.wikimedia.org/T417721#11776127 (Ottomata) No objections! I haven't followed this c...
[03:37:32] Data-Engineering, Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list March 2026 - https://phabricator.wikimedia.org/T421982 (GFontenelle_WMF) NEW
[05:27:33] Data-Engineering, Data-Engineering-Radar, DBA, GlobalBlocking, and 2 others: Drop global_block_whitelist from closed wikis - https://phabricator.wikimedia.org/T420525#11776274 (Marostegui) Open→Resolved Dropped
[05:32:34] Data-Engineering, Data-Engineering-Radar, DBA, Machine-Learning-Team, and 2 others: Drop ORES tables from wikis without ORES - https://phabricator.wikimedia.org/T420093#11776278 (Marostegui) They are indeed empty, so I will just go ahead and drop them ` [05:31:34] marostegui@cumin1003:~/git/media...
[05:33:49] Data-Engineering, Data-Engineering-Radar, DBA, Machine-Learning-Team, and 2 others: Drop ORES tables from wikis without ORES - https://phabricator.wikimedia.org/T420093#11776281 (Marostegui) Open→Resolved Dropped
[05:55:30] Data-Engineering, CheckUser-SuggestedInvestigations, DBA, Product Safety and Integrity, Schema-change-in-production: Drop cusi_case, cusi_signal, and cusi_user tables from wikis where they are unused - https://phabricator.wikimedia.org/T421353#11776295 (Marostegui) Confirmed all empty, so goi...
[05:56:48] Data-Engineering, CheckUser-SuggestedInvestigations, DBA, Product Safety and Integrity, Schema-change-in-production: Drop cusi_case, cusi_signal, and cusi_user tables from wikis where they are unused - https://phabricator.wikimedia.org/T421353#11776297 (Marostegui) Open→Resolved Done
[08:02:18] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Growth-Team, Image-Suggestions, Patch-For-Review: Add an Image: filtering by suggestion "kind" or "confidence" - https://phabricator.wikimedia.org/T368987#11776496 (APizzata-WMF) The task [[ https://airflow-search.wikimedia.org/dags/image_su...
[08:32:41] Data-Engineering: table_maintenance_iceberg_monthly permission issue fails task due to permission on Ivy cache artifact - https://phabricator.wikimedia.org/T418804#11776562 (JAllemandou) I have experienced again the same issue today: ` Exception in thread "main" java.io.FileNotFoundException: /tmp/table_main...
[08:47:15] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Growth-Team, Image-Suggestions, Patch-For-Review: Add an Image: filtering by suggestion "kind" or "confidence" - https://phabricator.wikimedia.org/T368987#11776663 (dcausse) >>! In T368987#11776496, @APizzata-WMF wrote: > The task [[ https:/...
[09:07:44] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11776710 (JMonton-WMF) After some conversations yesterday, with the help of @JAllemandou, we confirmed that the current approach is running the s...
[09:09:36] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Growth-Team, Image-Suggestions, Patch-For-Review: Add an Image: filtering by suggestion "kind" or "confidence" - https://phabricator.wikimedia.org/T368987#11776719 (APizzata-WMF) perfect! Will keep the ticket open for tomorrow run, if everyt...
[10:10:58] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11777029 (JMonton-WMF) I'd like to try the opposite approach, fewer pods, more resources, let Flink manage resources inside each TaskManager. Som...
[10:17:31] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Visualizing inconsistencies and reconciles via Superset - https://phabricator.wikimedia.org/T420787#11777075 (APizzata-WMF) > @xcollazo could you remind me the semantics of `missing_from_source`? I think I can help you with that! The [[ https://gitlab.w...
[11:09:53] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0: Consider updating our heuristics for media type classification in AQS / wikistats - https://phabricator.wikimedia.org/T419882#11777263 (GGoncalves-WMF) Thanks for the great analysis and solid proposal, @Snwachukwu ! **TL;DR**: I agree with...
[11:12:37] !log Test Kitchen mw-user experiment (poll 54677) - adds: logged-in-retention-round1; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:12:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:18:20] !log Test Kitchen mw-user experiment (poll 54694) - adds: none; removes: logged-in-retention-round1; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:18:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:18:40] !log Test Kitchen mw-user experiment (poll 54695) - adds: logged-in-retention-round1; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:18:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:20:20] !log Test Kitchen mw-user experiment (poll 54700) - adds: none; removes: logged-in-retention-round1; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:20:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:26:02] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: Fix PyFlink log levels - https://phabricator.wikimedia.org/T419997#11777322 (JMonton-WMF) About the log level, I'm not completely sure what fixed it, but right now the HTML enrichment pipeline is showing Python logs properly. I added...
[11:31:26] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Dumps-Generation: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11777331 (APizzata-WMF) a:APizzata-WMF
[11:34:52] !log Test Kitchen mw-user experiment (poll 54743) - adds: logged-in-retention-round1; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:34:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:48:10] Data-Engineering, Data-Engineering-Radar, Data-Persistence, DBA, and 4 others: ICU 72 upgrade: `categorylinks` table swap - https://phabricator.wikimedia.org/T419980#11777371 (Marostegui) Sorry for the delay, I've been out for almost 3 weeks and I am catching up now. Some comments: 1) I'd strongl...
[13:01:54] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11777789 (Ottomata) Very nice writeup, thank you. There are a couple of other (minor?) pieces of the puzzle. - `process_max_workers_default=1`...
[13:07:52] Data-Engineering, Data-Engineering-Radar, Dumps-Generation, Prod-Kubernetes, and 2 others: mediawiki-dumps-legacy is running without security policy on dse-k8s-eqiad - https://phabricator.wikimedia.org/T419259#11777828 (BTullis) In progress→Resolved I believe that this is now applying...
[13:24:15] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11777905 (JMonton-WMF) With the latest test, we are actually hitting these errors: ` Exceeded checkpoint tolerable failure threshold...
[13:37:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11777986 (JMonton-WMF) On a new test I'm doing this: - Set `exponential-delay` to avoid failing after 10 failures. If we get random failures (O...
[13:40:57] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2026-03-27 - 2026-04-17), Essential-Work: Provide an access to MaxMind GeoIP in DSE K8S pods - https://phabricator.wikimedia.org/T405509#11777996 (BTullis) I believe that the GeoIP files may now be mounted by Airflow task pods. As per:...
[13:56:23] !log Test Kitchen mw-user experiment (poll 55163) - adds: email_confirmation_banner_ab_test; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[13:56:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:05:28] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030 (JAllemandou) NEW
[14:13:41] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Dumps-Generation: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11778209 (APizzata-WMF) Here are my findings: ` -- I want to see the status of...
[14:16:49] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Investigate raise in Invalid HAProxyKafka messages in esams - https://phabricator.wikimedia.org/T422033 (JAllemandou) NEW
[14:23:04] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Investigate raise in Invalid HAProxyKafka messages in esams - https://phabricator.wikimedia.org/T422033#11778343 (JAllemandou)
[14:24:07] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030#11778354 (JAllemandou)
[14:24:24] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030#11778357 (JAllemandou)
[14:30:45] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11778395 (Ottomata) > More off-heap memory Right! makes sense. IIRC, @AKhatun_WMF and I had to increase this for edit types stuff too, to deal w...
[14:34:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11778423 (AKhatun_WMF) @Ottomata I committed the off heap change, yes. [the change](https://gerrit.wikimedia.org/r/plugins/gitiles/operations/dep...
[14:39:21] Data-Engineering, Growth-Team: Investigate empty Constructive edit rate of newer editors (mobile web) - https://phabricator.wikimedia.org/T421514#11778489 (KStoller-WMF) p:Medium→High a:MNeisler
[14:52:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030#11778559 (Fabfur) This could be related to upgrade to HAProxy 3.2 (T421402) that started on the drmrs datacenter, we'll investigate if the sequence...
[14:53:25] Data-Engineering, Growth-Team, Product-Analytics (Kanban): Investigate empty Constructive edit rate of newer editors (mobile web) - https://phabricator.wikimedia.org/T421514#11778575 (MNeisler)
[15:00:06] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Alter AQS Cassandra tables in support of video plays endpoints - https://phabricator.wikimedia.org/T420008#11778630 (Snwachukwu) Hi @Eevans I'd like for the change to be applied to production tables
[15:05:11] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11778673 (Ottomata) Re checkpointing, this could be what we need when backfilling: https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/...
[15:20:38] !log Test Kitchen mw-user experiment (poll 55412) - adds: none; removes: email_confirmation_banner_ab_test; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[15:20:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:27:41] !log Test Kitchen mw-user experiment (poll 55433) - adds: email_confirmation_banner_ab_test; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[15:27:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:43:41] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic, Patch-For-Review: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030#11778885 (Fabfur) This is most probably due to a deprecation in haproxy configuration directives https://www.haproxy.com/blog...
[16:07:15] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Dumps-Generation: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11778966 (JAllemandou) That's interesting! @Ottomata could you have a look at t...
[16:15:51] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic, Patch-For-Review: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030#11779016 (Fabfur) [[ https://gerrit.wikimedia.org/r/c/operations/puppet/+/1266301 | The patch ]]has been applied to all impa...
[16:17:56] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Alter AQS Cassandra tables in support of video plays endpoints - https://phabricator.wikimedia.org/T420008#11779021 (Eevans) Open→Resolved >>! In T420008#11778630, @Snwachukwu wrote: > Hi @Eevans I'd like fo...
[16:35:34] Data-Engineering, Data-Engineering-Radar, Dumps-Generation, Patch-Needs-Improvement: incomplete conversion of flow revisions after disabling flow, breaks stubs dumps - https://phabricator.wikimedia.org/T228921#11779141 (Ottomata)
[16:37:41] Data-Engineering: Backfill newly productionized edit types dataset - https://phabricator.wikimedia.org/T421919#11779167 (Ottomata) a:Ottomata→AKhatun_WMF
[16:41:56] (CR) Snwachukwu: [C:+2] Extend mediarequest Cassandra loads with poster/plays for video-requests API [analytics/refinery] - https://gerrit.wikimedia.org/r/1250005 (https://phabricator.wikimedia.org/T415202) (owner: Snwachukwu)
[16:42:02] (CR) Snwachukwu: [V:+2 C:+2] Extend mediarequest Cassandra loads with poster/plays for video-requests API [analytics/refinery] - https://gerrit.wikimedia.org/r/1250005 (https://phabricator.wikimedia.org/T415202) (owner: Snwachukwu)
[16:50:08] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list March 2026 - https://phabricator.wikimedia.org/T421982#11779263 (Ahoelzl)
[16:50:38] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform, Patch-For-Review: eventutilities-python - make Flink Source and Sink parallelism configurable - https://phabricator.wikimedia.org/T421951#11779266 (Ahoelzl)
[16:50:57] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Backfill newly productionized edit types dataset - https://phabricator.wikimedia.org/T421919#11779268 (Ahoelzl)
[16:55:05] Data-Engineering, Data-Engineering-Wikistats: Add total file size to metric to Wikistats - https://phabricator.wikimedia.org/T421598#11779293 (Ahoelzl) @GGoncalves-WMF please assess if this should be folded in upcoming commons efforts
[17:00:17] Data-Engineering, Research: Request for Hourly Pageview Data for multiple articles– July 18 to September 8, 2025 - https://phabricator.wikimedia.org/T409676#11779331 (Ottomata) Hi! > access to internal data sets such as wmf.pageview_actor, sessionlength These are unlikely to be made available publicly,...
[17:00:33] Data-Engineering, Data-Engineering-Radar, Research: Request for Hourly Pageview Data for multiple articles– July 18 to September 8, 2025 - https://phabricator.wikimedia.org/T409676#11779338 (Ottomata)
[17:05:47] Data-Engineering, Event-Platform: Profile SimpleEditType to identify inefficiencies in mwedittype - https://phabricator.wikimedia.org/T421412#11779404 (Ahoelzl) We should clarify if StructuredEditType needs to be supported.
[17:20:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11779453 (Ottomata) > There are a couple of other (minor?) pieces of the puzzle. ...And also, maybe we should just try upgrading to Flink 2.2.0 a...
[17:23:28] !Deploying Refinery at fa28ad8 for change 1250005 / T415202 Extend mediarequest Cassandra loads with poster/plays for video-requests API
[17:23:28] T415202: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202
[18:01:40] !log Deployed refinery using scap, then deployed onto hdfs
[18:01:42] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:48:48] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Remove the test DBT DAG from test_k8s Airflow - https://phabricator.wikimedia.org/T422080 (amastilovic) NEW
[20:49:50] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform, Patch-For-Review: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11780169 (Ottomata) BTW I think we can see the container OOMs here: https://grafana.wikimedia.org/goto/bfht68nna58g0f?orgI...
[20:55:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform, Patch-For-Review: HTML Enrichment - Backfilling configuration - https://phabricator.wikimedia.org/T421216#11780186 (Ottomata) Uh, that did not work. ` org.apache.flink.client.program.ProgramAbortException: java.lang.RuntimeExce...
[21:54:37] !log Test Kitchen mw-user experiment (poll 56581) - adds: none; removes: none; fields: email_confirmation_banner_ab_test - xLab/MPIC/TK tips at https://w.wiki/FwuD
[21:54:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:59:51] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Surge in webrequest sequence-id validation check - https://phabricator.wikimedia.org/T422030#11780385 (Ahoelzl) a:JAllemandou
[22:00:34] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic: Investigate raise in Invalid HAProxyKafka messages in esams - https://phabricator.wikimedia.org/T422033#11780389 (Ahoelzl) a:JAllemandou
[22:39:19] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Backfill newly productionized edit types dataset - https://phabricator.wikimedia.org/T421919#11780504 (AKhatun_WMF) Update: - Created `akhatun.edit_type` according to the [edit-type stream's schema](https://gitlab.wikimedia.org/repos/data-engineering/sc...
[23:46:25] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Backfill newly productionized edit types dataset - https://phabricator.wikimedia.org/T421919#11780591 (Ottomata) > Should I host the code anywhere, get it reviewed? Hosting probably not required since it is a one-time spark job. Nah, but it would be goo...