[00:54:58] (PS2) Xcollazo: Fix MWHistorySnapshotMerger: DELETE+INSERT replaces MERGE, add page/user reconcile [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296665 (https://phabricator.wikimedia.org/T427328)
[01:33:14] (CR) Xcollazo: [C:+2] Fix MWHistorySnapshotMerger: DELETE+INSERT replaces MERGE, add page/user reconcile [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296665 (https://phabricator.wikimedia.org/T427328) (owner: Xcollazo)
[01:46:32] (Merged) jenkins-bot: Fix MWHistorySnapshotMerger: DELETE+INSERT replaces MERGE, add page/user reconcile [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296665 (https://phabricator.wikimedia.org/T427328) (owner: Xcollazo)
[07:29:52] (CR) Nicholusmuwonge: [C:+2] Script to gather metrics for Recent Changes in pilot Wikis. [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1288258 (https://phabricator.wikimedia.org/T426384) (owner: Seanleong-wmde)
[07:30:25] (Merged) jenkins-bot: Script to gather metrics for Recent Changes in pilot Wikis. [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1288258 (https://phabricator.wikimedia.org/T426384) (owner: Seanleong-wmde)
[08:02:04] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Create API and User-Agent compliance related tables under wmf_traffic - https://phabricator.wikimedia.org/T427840#11984136 (JAllemandou) We talked with @GGoncalves-WMF yesterday, and we think that since the data could be defined as "core datasets", it woul...
[08:24:28] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE: Provide a scheduled data download service from Google Cloud Storage - https://phabricator.wikimedia.org/T427457#11984157 (Gehel) @Ahoelzl has access to provision a service account.
[08:41:57] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Create API and User-Agent compliance related tables under wmf_traffic - https://phabricator.wikimedia.org/T427840#11984200 (KCVelaga_WMF) > We talked with @GGoncalves-WMF yesterday, and we think that since the data could be defined as "core datasets", it w...
[09:29:47] Data-Engineering, Research: Add new referrer-class for AI chatbots/LLMs to clickstream dataset - https://phabricator.wikimedia.org/T428136 (MGerlach) NEW
[09:30:44] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-04-24 - 2026-05-15), Patch-For-Review: Support for Java 21 and Flink 2 - https://phabricator.wikimedia.org/T412978#11984413 (hashar) I have built the images: ` docker-registry.wikimedia.org/releng/java21:0.1 docker-registry....
[09:52:27] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Create API and User-Agent compliance related tables under wmf_traffic - https://phabricator.wikimedia.org/T427840#11984475 (JAllemandou) Summarizing the conversation we just had with @KCVelaga_WMF : * KC owns the schedule to move the HQL and DAGs to the a...
[09:58:51] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Create API and User-Agent compliance related tables under wmf_traffic - https://phabricator.wikimedia.org/T427840#11984496 (KCVelaga_WMF) >>! In T427840#11984475, @JAllemandou wrote: > Summarizing the conversation we just had with @KCVelaga_WMF : > * KC o...
[10:11:42] Data-Engineering, Observability-Logging, SRE, Wikimedia-Logstash, and 3 others: Produce ECS formatted logstash logs to Event Platform, allowing them to be queried in the WMF Data Lake with SQL - https://phabricator.wikimedia.org/T291645#11984536 (BTullis)
[10:18:54] Data-Engineering, Observability-Logging, SRE, Wikimedia-Logstash, and 3 others: Produce ECS formatted logstash logs to Event Platform, allowing them to be queried in the WMF Data Lake with SQL - https://phabricator.wikimedia.org/T291645#11984548 (BTullis)
[11:13:28] !log Test Kitchen mw-user experiment (poll 84311) - adds: none; removes: none; fields: incident_reporting_system_interaction - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:13:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:28:16] Data-Engineering, Observability-Logging, SRE, Wikimedia-Logstash, and 3 others: Produce ECS formatted logstash logs to Event Platform, allowing them to be queried in the WMF Data Lake with SQL - https://phabricator.wikimedia.org/T291645#11984961 (BTullis) Setting T425087 as a parent task, since t...
[12:50:40] (PS1) KCVelaga: User-Agent compliance and API requests refinement HQLs [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522)
[12:52:50] (PS2) KCVelaga: User-Agent compliance and API requests refinement HQLs [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522)
[12:54:52] (PS3) KCVelaga: User-Agent compliance and API requests refinement HQLs [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522)
[12:59:53] (CR) Joal: "LGTM - I didn't check the files in detail as we reviewed them not long ago." [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522) (owner: KCVelaga)
[12:59:58] (CR) Joal: [C:+1] User-Agent compliance and API requests refinement HQLs [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522) (owner: KCVelaga)
[13:01:13] (PS4) KCVelaga: User-Agent compliance and API requests refinement HQLs [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522)
[13:02:04] (CR) KCVelaga: User-Agent compliance and API requests refinement HQLs (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522) (owner: KCVelaga)
[14:01:03] (CR) Joal: [V:+2 C:+2] "Merging for later deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/1297686 (https://phabricator.wikimedia.org/T419522) (owner: KCVelaga)
[14:01:07] Data-Engineering, ServiceOps new: Standard helm chart for simple service-utils nodejs apps - https://phabricator.wikimedia.org/T428174 (Ottomata) NEW
[14:08:02] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE: Provide a scheduled data download service from Google Cloud Storage - https://phabricator.wikimedia.org/T427457#11985543 (Gehel) We expect #data-engineering to work on T405360, which can then be configured for this data transfer.
[14:11:39] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11985571 (Ottomata) Nice patch! @mpopov @JAllemandou @phuedx I'm not sure who exactly to ask, but is there any reason we shouldn't add...
[14:24:24] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176 (Ottomata) NEW
[14:24:50] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11985664 (JAllemandou) >>! In T427925#11985570, @Ottomata wrote: > Nice patch! > > @mpopov @JAllemandou @phuedx I'm not sure who exactl...
[14:26:41] Data-Engineering, ServiceOps new: Standard helm chart for simple service-utils nodejs apps - https://phabricator.wikimedia.org/T428174#11985680 (MLechvien-WMF) Thanks for filing that Andrew. Putting this on @Scott_French radar for roadmap considerations.
[14:31:50] Data-Engineering, ServiceOps new: Standard helm chart for simple service-utils nodejs apps - https://phabricator.wikimedia.org/T428174#11985717 (MLechvien-WMF) IMO this looks like a good idea. It would be great to identify a couple more candidate services that may benefit from this and be early adopters.
[15:16:25] (CR) Dr0ptp4kt: "Cluster key note (not sure if it will help, just food for thought for @mforns@wikimedia.org who is running with the patches now)." [analytics/refinery] - https://gerrit.wikimedia.org/r/1295064 (https://phabricator.wikimedia.org/T427532) (owner: Dr0ptp4kt)
[15:24:34] (CR) Dr0ptp4kt: "Clarification" [analytics/refinery] - https://gerrit.wikimedia.org/r/1295064 (https://phabricator.wikimedia.org/T427532) (owner: Dr0ptp4kt)
[16:08:06] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: WE5.3.3b: Contributor Count Per Page [Attribution API] - https://phabricator.wikimedia.org/T426316#11986127 (JAllemandou) I prefer option 2 from above, particularly for the reason that a metric changing in time is not a good idea. I'm...
[16:13:05] (PS1) Xcollazo: Add event_user_is_cross_wiki, page_is_deleted, revision_is_deleted_by_page_deletion, user_central_id [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297743 (https://phabricator.wikimedia.org/T425730)
[16:16:04] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11986148 (Ottomata) > The table is small enough that it probably could be pre-loaded in the flink job. Yes, but then we'd need to manag...
[16:23:33] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Dumps-Generation: Data missing from en.wiktionary.org February 2026 "MediaWiki Content File Exports" compared to "XML Database dump" - https://phabricator.wikimedia.org/T417596#11986176 (JeffDoozan) Resolved→Open Sorry for the delayed response....
[16:32:35] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11986210 (JAllemandou) Even for pageview only, I would prefer not to duplicate data that can easily be recomputed from other sources, for n...
[17:11:23] Data-Engineering (Q4 FS25/26 April 1st - June 30st): The revision_seconds_to_identity_revert field in wmf.mediawiki_history has sometimes negative values - https://phabricator.wikimedia.org/T419267#11986326 (JAllemandou) I confirm the newer snapshot `2026-05` still doesn't have this issue. The problem in the...
[18:14:28] (PS1) Xcollazo: Add event_user_is_cross_wiki, page_is_deleted, revision_is_deleted_by_page_deletion, user_central_id [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297762 (https://phabricator.wikimedia.org/T425730)
[18:15:18] (Abandoned) Xcollazo: Add event_user_is_cross_wiki, page_is_deleted, revision_is_deleted_by_page_deletion, user_central_id [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297762 (https://phabricator.wikimedia.org/T425730) (owner: Xcollazo)
[18:17:19] Data-Engineering, Event-Platform: `mediawiki.page_change.v1`: two schema validation errors causing events to be silently dropped by EventGate - https://phabricator.wikimedia.org/T421237#11986534 (Ottomata) Hm, we should probably get this resolved ASAP. I'm hacking on {T428176}, and that will require bum...
[18:19:09] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: `mediawiki.page_change.v1`: two schema validation errors causing events to be silently dropped by EventGate - https://phabricator.wikimedia.org/T421237#11986556 (Ottomata)
[18:20:25] (PS2) Xcollazo: Add event_user_is_cross_wiki, page_is_deleted, revision_is_deleted_by_page_deletion, user_central_id [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297743 (https://phabricator.wikimedia.org/T425730)
[18:42:29] (PS3) Xcollazo: Add event_user_is_cross_wiki, page_is_deleted, revision_is_deleted_by_page_deletion, user_central_id [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297743 (https://phabricator.wikimedia.org/T425730)
[18:56:16] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform, Patch-For-Review: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176#11986687 (Ottomata) So, I think we can do this. But, caveat: the wikid...
[19:03:01] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform, Patch-For-Review: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176#11986700 (xcollazo) > But, caveat: the wikidata item id is attached to...
[19:16:16] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform, Patch-For-Review: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176#11986721 (Ottomata) Yeah, or something like that. I think if we were t...
[19:17:20] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Event-Platform, Patch-For-Review: mediawiki.page_change.v1 events - add wikidata id for pages - https://phabricator.wikimedia.org/T428176#11986722 (Ottomata) It seems like having the wikidata item in page_chan...
[19:57:01] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11986780 (mpopov) I'm with @JAllemandou here. Also, why not just update the UDF that produces `normalized_host` so that it also includes...
[20:25:52] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11986837 (Ottomata) > Also, why not just update the UDF that produces normalized_host so that it also includes wiki_id? I think for the...
[20:26:52] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Add "wiki_id" to Page View Stream - https://phabricator.wikimedia.org/T427925#11986842 (Ottomata) > Also, why not just update the UDF that produces normalized_host so that it also includes wiki_id? But uh, perhaps...
[20:56:25] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Epic, Patch-For-Review: Incremental MediaWiki History Phase I - https://phabricator.wikimedia.org/T424350#11986916 (xcollazo)
[21:12:58] Data-Engineering, ServiceOps new: Standard helm chart for simple service-utils nodejs apps - https://phabricator.wikimedia.org/T428174#11986980 (Scott_French) Thanks for opening this, @Ottomata. I like the idea of offering something like this, and indeed it's a pattern that has worked well for services...
[21:24:49] Data-Engineering, Research: Add new referrer-class for AI chatbots/LLMs to clickstream dataset - https://phabricator.wikimedia.org/T428136#11987002 (Isaac) Thanks for the suggestion Kai and putting together the task @MGerlach! I was curious so did some very quick exploration of impact on the dataset. C...
[21:27:57] Data-Engineering, Data-Engineering-Radar, Data Pipelines: Add Ukrainian Wikipedia to Clickstream dataset - https://phabricator.wikimedia.org/T310972#11987004 (Isaac) Open→Resolved a:Isaac FYI this was completed under {T289532}! See https://dumps.wikimedia.org/other/clickstream/ starti...
[21:28:47] Data-Engineering, Research: Add new referrer-class for AI chatbots/LLMs to clickstream dataset - https://phabricator.wikimedia.org/T428136#11987010 (Isaac)
[21:28:48] Data-Engineering, Data-Engineering-Radar, Data Pipelines, Documentation: Clickstream dataset documentation should be extracted from Research page - https://phabricator.wikimedia.org/T356528#11987011 (Isaac)
[21:28:49] Data-Engineering, Data-Engineering-Icebox: Re-examine how internal search referrals are handled by Clickstream - https://phabricator.wikimedia.org/T292435#11987013 (Isaac)
[21:28:50] Data-Engineering, Research: Consider adding more namespaces to Clickstream dataset - https://phabricator.wikimedia.org/T296359#11987012 (Isaac)
[21:28:51] Data-Engineering, Data-Engineering-Icebox, Data-release: Wikipedia Clickstream dataset. Programmatic Access - https://phabricator.wikimedia.org/T134231#11987014 (Isaac)