[00:01:36] RESOLVED: [3x] MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[00:01:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[00:16:12] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Extend MWHistorySnapshotMerger to reconcile page and user event rows - https://phabricator.wikimedia.org/T427328#11979702 (xcollazo) Implemented as Option B (delete-then-insert) in Gerrit 1296665...
[03:49:56] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: WE5.3.3b: Contributor Count Per Page [Attribution API] - https://phabricator.wikimedia.org/T426316#11979837 (AKhatun_WMF) ### TL;DR - **Unless strictly required for editor metrics, I propose to leave out cross-wiki users for now. Whe...
[04:07:21] Data-Engineering: Add user status details for cross-wiki users in MWH - https://phabricator.wikimedia.org/T428018 (AKhatun_WMF) NEW
[06:21:54] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Create API and User-Agent compliance related tables under wmf_traffic - https://phabricator.wikimedia.org/T427840#11979949 (KCVelaga_WMF) >>! In T427840#11975217, @JAllemandou wrote: >>>! In T427840#11973679, @KCVelaga_WMF wrote:need. > > Hm, I wonder wha...
[09:09:14] Data-Engineering (Q4 FS25/26 April 1st - June 30st), SRE, Event-Platform: Flink Page View: Create K8s resources - https://phabricator.wikimedia.org/T426425#11980454 (JMonton-WMF)
[09:13:52] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: HTML Pipeline - Issue with max size messages - https://phabricator.wikimedia.org/T425336#11980479 (JMonton-WMF) We are doing an analysis of the errors, to understand if increasing the max size will remove them completely or not. Result...
[09:39:20] (CR) A-pizzata: [C:+1] Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[11:01:30] (CR) Joal: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[12:12:25] (CR) A-pizzata: [C:+1] "LGTM, small nit" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1293836 (https://phabricator.wikimedia.org/T427314) (owner: Xcollazo)
[12:22:38] (CR) A-pizzata: [C:+1] Fix MWHistorySnapshotMerger: DELETE+INSERT replaces MERGE, add page/user reconcile [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296665 (https://phabricator.wikimedia.org/T427328) (owner: Xcollazo)
[12:25:06] (PS1) Joal: Fix MWH revert algorithm [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297115
[12:45:22] (CR) A-pizzata: [C:+1] "LGTM, small nit: add Bug: T266374 to the commit?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297115 (owner: Joal)
[12:49:52] (CR) A-pizzata: [C:+1] "LGTM" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296674 (https://phabricator.wikimedia.org/T427862) (owner: Xcollazo)
[12:50:18] (PS2) Joal: Fix MWH revert algorithm [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297115 (https://phabricator.wikimedia.org/T266374)
[12:50:40] (CR) Joal: "Done!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297115 (https://phabricator.wikimedia.org/T266374) (owner: Joal)
[13:09:15] (CR) Joal: Daily revert detection: align with monthly DenormalizedRevisionsBuilder (9 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1293836 (https://phabricator.wikimedia.org/T427314) (owner: Xcollazo)
[13:16:30] (CR) Xcollazo: [C:+1] Fix MWH revert algorithm [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1297115 (https://phabricator.wikimedia.org/T266374) (owner: Joal)
[13:18:34] (CR) Joal: [C:+1] "LGTM" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296665 (https://phabricator.wikimedia.org/T427328) (owner: Xcollazo)
[13:19:01] (CR) Ottomata: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[13:21:23] (CR) Ottomata: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[13:55:48] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[14:22:59] (PS3) Milimetric: Add one-off tables [analytics/refinery] - https://gerrit.wikimedia.org/r/1294434
[14:24:57] (PS4) Milimetric: Add one-off tables [analytics/refinery] - https://gerrit.wikimedia.org/r/1294434
[14:29:36] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[14:30:20] !log Test Kitchen edge-unique experiments (poll 80607) - adds: none; removes: synth-aa-detect-hoisting-errors-1; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[14:30:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:00:21] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[15:05:18] (PS4) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729)
[15:05:27] (CR) Ottomata: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[15:07:00] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[15:57:33] (CR) Seanleong-wmde: Script to gather metrics for Recent Changes in pilot Wikis. (2 comments) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1288258 (https://phabricator.wikimedia.org/T426384) (owner: Seanleong-wmde)
[16:03:15] (PS7) Seanleong-wmde: Script to gather metrics for Recent Changes in pilot Wikis. [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1288258 (https://phabricator.wikimedia.org/T426384)
[16:12:07] (CR) Xcollazo: "After discussions, we decided to park this change to focus on the logic being worked on elsewhere." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296674 (https://phabricator.wikimedia.org/T427862) (owner: Xcollazo)
[16:12:29] (Abandoned) Xcollazo: Convert control_map from MAP to typed struct [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1296674 (https://phabricator.wikimedia.org/T427862) (owner: Xcollazo)
[16:18:43] !log Test Kitchen edge-unique experiments (poll 80930) - adds: logged-out-retention-round14; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[16:18:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:31:05] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[17:32:37] (PS5) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729)
[17:59:11] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[17:59:36] (PS6) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729)
[18:02:06] (CR) Xcollazo: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[18:35:36] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MediaWiki-extensions-Wikibase-Client, Wikidata: Move the wbc_entity_usage table onto a dedicated DB shard - https://phabricator.wikimedia.org/T176273#11982715 (Ottomata)
[18:35:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MediaWiki-extensions-Wikibase-Client, Wikidata: Move the wbc_entity_usage table onto a dedicated DB shard - https://phabricator.wikimedia.org/T176273#11982716 (Ahoelzl) a:Ahoelzl
[18:36:53] (CR) Joal: [C:+1] Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[18:40:42] Data-Engineering, Event-Platform: mediawiki-event-enrichment - set deterministic meta.id - https://phabricator.wikimedia.org/T424987#11982731 (Ottomata)
[18:40:45] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920#11982732 (Ottomata)
[18:41:15] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: mediawiki-event-enrichment - set deterministic meta.id - https://phabricator.wikimedia.org/T424987#11982733 (Ahoelzl)
[18:41:32] Data-Engineering, Event-Platform: jsonschema-tools should not consider definitions fields in compatibility checks. - https://phabricator.wikimedia.org/T425028#11982734 (Ottomata) p:Triage→Low
[18:41:46] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: mediawiki-event-enrichment - set deterministic meta.id - https://phabricator.wikimedia.org/T424987#11982735 (Ahoelzl) p:Triage→Medium a:JMonton-WMF
[18:42:24] Data-Engineering, Event-Platform: jsonschema-tools should not consider definitions fields in compatibility checks. - https://phabricator.wikimedia.org/T425028#11982737 (Ottomata) a:Ottomata
[18:42:35] Data-Engineering, Essential-Work, Event-Platform: jsonschema-tools should not consider definitions fields in compatibility checks. - https://phabricator.wikimedia.org/T425028#11982738 (Ahoelzl)
[18:42:46] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work, Event-Platform: jsonschema-tools should not consider definitions fields in compatibility checks. - https://phabricator.wikimedia.org/T425028#11982740 (Ottomata)
[18:43:41] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-04-24 - 2026-05-15): implement script to move data from P&T data lake to FR Tech data lake - https://phabricator.wikimedia.org/T425133#11982744 (Ottomata)
[18:43:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-04-24 - 2026-05-15): implement script to move data from P&T data lake to FR Tech data lake - https://phabricator.wikimedia.org/T425133#11982745 (Ahoelzl) p:Triage→Medium a:amastilovic
[18:46:16] Analytics, Data-Engineering, Data-Engineering-Wikistats: Wiki Stats: Problem Detecting (unique) Desktop Devices since August 2025? - https://phabricator.wikimedia.org/T425359#11982749 (Ahoelzl) @Hghani can you evaluate if this is related to any data incident or needs to be investigated?
[18:48:20] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Inconsistent wiki list: grouped_wikis.csv extended *after* some sqoop jobs have already started - https://phabricator.wikimedia.org/T425385#11982769 (Ottomata)
[18:48:21] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Inconsistent wiki list: grouped_wikis.csv extended *after* some sqoop jobs have already started - https://phabricator.wikimedia.org/T425385#11982770 (Ahoelzl) p:Triage→Medium a:Ahoelzl
[18:49:18] (CR) Xcollazo: [C:+2] Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[18:49:34] Data-Engineering, Wikimedia Enterprise: PageViews S3 Data Transfer MR [Enterprise] - https://phabricator.wikimedia.org/T425543#11982786 (Ottomata) Related: {T425133}
[18:50:01] Data-Engineering, Wikimedia Enterprise: PageViews S3 Data Transfer MR [Enterprise] - https://phabricator.wikimedia.org/T425543#11982792 (Ahoelzl) @LDlulisa-WMF are you unblocked? Do you need any further DE reviews?
[18:50:24] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Wikimedia Enterprise: PageViews S3 Data Transfer MR [Enterprise] - https://phabricator.wikimedia.org/T425543#11982793 (Ahoelzl) a:Ahoelzl
[18:50:30] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Wikimedia Enterprise: PageViews S3 Data Transfer MR [Enterprise] - https://phabricator.wikimedia.org/T425543#11982796 (Ahoelzl) p:Triage→High
[18:51:02] Data-Engineering, Event-Platform: Edit type enrichment: Add timeout - https://phabricator.wikimedia.org/T424547#11982799 (Ahoelzl) p:Triage→Medium
[18:51:09] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Edit type enrichment: Add timeout - https://phabricator.wikimedia.org/T424547#11982801 (Ottomata)
[18:54:41] Data-Engineering-Roadmap, Epic: [Epic] KAPOW: The next generation of bot detection in the Data Platform. - https://phabricator.wikimedia.org/T425661#11982823 (Ottomata)
[18:55:56] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Use KAPOW scores for existing metrics (pageviews and unique_devices). - https://phabricator.wikimedia.org/T425668#11982835 (Ottomata)
[18:57:52] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Wikimedia Enterprise: WME Pageviews DAG for HDFS to S3 Transfer - https://phabricator.wikimedia.org/T426017#11982852 (Ahoelzl) p:Triage→High a:Ahoelzl
[18:58:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Wikimedia Enterprise: PageViews S3 Data Transfer MR [Enterprise] - https://phabricator.wikimedia.org/T425543#11982858 (Ottomata) Is this a duplicate of {T426017}?
[19:02:23] Data-Engineering, Event-Platform: mediawiki.page_html_content_change.v1 stream content_uri field uses localhost instead of wiki hostname - https://phabricator.wikimedia.org/T427598#11982864 (Ottomata) a:Ottomata
[19:02:27] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: mediawiki.page_html_content_change.v1 stream content_uri field uses localhost instead of wiki hostname - https://phabricator.wikimedia.org/T427598#11982866 (Ottomata)
[19:03:26] (Merged) jenkins-bot: Add event_entity='user' to MWHistoryDeltaWriter (MERGE 6) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1290945 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[20:41:40] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Spike: drop all 90-day windows in MWHistoryDeltaWriter and replace with full-table revert detection - https://phabricator.wikimedia.org/T427314#11983110 (xcollazo) Implementation notes — hot-page...
[20:46:15] FIRING: HdfsRpcQueueLength: RPC queue length on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=54&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLength
[20:50:59] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Airflow DAGs for mediawiki_history_incremental_v1 writers - https://phabricator.wikimedia.org/T425730#11983132 (xcollazo) DAG 1 — ordering constraint: depends_on_past=True The daily delta writer...
[20:51:15] RESOLVED: HdfsRpcQueueLength: RPC queue length on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=54&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLength
[20:52:39] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Airflow DAGs for mediawiki_history_incremental_v1 writers - https://phabricator.wikimedia.org/T425730#11983133 (xcollazo)
[20:53:24] (PS6) Xcollazo: Daily revert detection: align with monthly DenormalizedRevisionsBuilder [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1293836 (https://phabricator.wikimedia.org/T427314)
[20:57:53] (CR) Xcollazo: "Latest patchset refactors all the comments to be more on point, and to remove stuff that is not relevant anymore." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1293836 (https://phabricator.wikimedia.org/T427314) (owner: Xcollazo)
[20:58:58] (CR) Xcollazo: [C:+2] "Since there were no more code concerns, and I have a +1, I will go ahead with this patch." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1293836 (https://phabricator.wikimedia.org/T427314) (owner: Xcollazo)
[21:05:44] (PS7) Xcollazo: Add DDL for mediawiki_history_incremental_v1 Iceberg table [analytics/refinery] - https://gerrit.wikimedia.org/r/1287959 (https://phabricator.wikimedia.org/T425729)
[21:11:41] (Merged) jenkins-bot: Daily revert detection: align with monthly DenormalizedRevisionsBuilder [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1293836 (https://phabricator.wikimedia.org/T427314) (owner: Xcollazo)
[22:48:38] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE: Provide a scheduled data download service from Google Cloud Storage - https://phabricator.wikimedia.org/T427457#11983371 (Ahoelzl)
[23:42:25] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Metrics-Sprint-2026-2027, Patch-For-Review: DE3.1 - Logged-out reader 21-day retention on web - https://phabricator.wikimedia.org/T424706#11983514 (tchin) It's okay, I changed a few small things in the sql so I can just do it manually on my end