[02:51:41] Data-Engineering (Q3 2024 January 1st - March 31th), Dumps-Generation: Wikimedia Downloads not complete - https://phabricator.wikimedia.org/T383030#10562133 (xcollazo) Open→Resolved
[08:07:09] Data-Engineering (Q3 2024 January 1st - March 31th), Language and Product Localization, MediaWiki-extensions-Translate, MW-Interfaces-Team, and 3 others: Intermittent JobQueueError due to "Unable to deliver all events: 500: Internal Server Error" - https://phabricator.wikimedia.org/T386138#10562390 (...
[09:34:36] Data-Engineering (Q3 2024 January 1st - March 31th), Growth-Structured-Tasks, Growth-Team, Image-Suggestions, and 6 others: wmf.wikidata_item_page_link and wmf.wikidata_entity snapshots stuck at 2025-01-20 - https://phabricator.wikimedia.org/T386255#10562679 (Lucas_Werkmeister_WMDE) (Not introduc...
[13:04:56] (PS1) Aqu: Create new refinery canaryevents module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953
[13:19:00] (CR) CI reject: [V:-1] Create new refinery canaryevents module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[13:42:51] (PS2) Aqu: Create new refinery canaryevents module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953
[13:57:45] (CR) CI reject: [V:-1] Create new refinery canaryevents module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[14:01:25] Data-Engineering (Q3 2024 January 1st - March 31th): Provide a list of Phabricator tags relevant for Data Engineering - https://phabricator.wikimedia.org/T386752#10563453 (Ottomata) Thank you! Looking good. We might want to look at this list and refactor a few of them. E.g. we have #data_pipelines and this...
[14:03:50] Data-Engineering, Java-Scala-Standardization: Resolve conflict between GitLab CI automated package deployment token variable names - https://phabricator.wikimedia.org/T386056#10563457 (Ottomata) Indeed! Whichever variable we settle on, we should update docs at https://gitlab.wikimedia.org/repos/data-eng...
[14:06:18] (PS3) Aqu: WIP: Add refinery canaryevents module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953
[14:15:13] Data-Engineering, Data-Engineering-Radar, BDC-Implementation, Data-Platform-SRE, Epic: EPIC: Trino/minIO/Hive-Standalone-Metaserver/Dagster/Metabase/Superset Implementation - https://phabricator.wikimedia.org/T377362#10563526 (Jgreen)
[14:21:30] (CR) CI reject: [V:-1] WIP: Add refinery canaryevents module [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[14:49:45] Data-Engineering (Q3 2024 January 1st - March 31th), Patch-For-Review: Timeout hive-metastore locks - https://phabricator.wikimedia.org/T365563#10563669 (xcollazo) Interesting that you hit this a while ago @Antoine_Quhen. As we discussed briefly on yesterday on our sync up, the Hive Metastore can do dea...
[14:50:51] Data-Engineering (Q3 2024 January 1st - March 31th), DPE-Data-Platform-related-Mediawiki-Content-data, Data-Platform (Data Platform Ops Week Working Group), Data-Platform-SRE (2025.02.10 - 2025.02.28), and 2 others: DAG failing due to failure to ac... - https://phabricator.wikimedia.org/T386114#10563675
[14:51:06] Data-Engineering (Q3 2024 January 1st - March 31th), Discovery-Search, Dumps 2.0, Data-Platform-SRE (2025.02.10 - 2025.02.28), Patch-For-Review: Add relevant kafka clusters to defined airflow connections in puppet - https://phabricator.wikimedia.org/T379676#10563677 (brouberol) In prog...
[14:54:09] Data-Engineering (Q3 2024 January 1st - March 31th), DPE-Data-Platform-related-Mediawiki-Content-data, Data-Platform (Data Platform Ops Week Working Group), Data-Platform-SRE (2025.02.10 - 2025.02.28), and 2 others: DAG failing due to failure to ac... - https://phabricator.wikimedia.org/T386114#10563684
[15:00:27] (CR) Ottomata: "Alternate ideas:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[15:00:43] (CR) Ottomata: "Alternate ideas:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[15:01:30] (CR) Ottomata: "unresolve" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[15:05:07] Data-Engineering (Q3 2024 January 1st - March 31th), Event-Platform, Patch-For-Review: Upgrade eventgate-wikimedia to node20 - https://phabricator.wikimedia.org/T383814#10563737 (Ottomata) eventgate-analytics in eqiad is upgraded to node20. Will proceed with other eventgate instance next week.
[15:08:42] Data-Engineering-Roadmap, Dumps 2.0, Discovery-Search (2025.02.10 - 2025.02.28), Epic, Patch-For-Review: EPIC: Update flink jobs to support Flink 1.20 - https://phabricator.wikimedia.org/T376812#10563768 (Ottomata) FYI https://flink.apache.org/2025/02/12/apache-flink-1.20.1-release-announcem...
[15:11:04] Data-Engineering, Data-Engineering-Radar, Growth-Team, GrowthExperiments, and 5 others: mw.track: support for histogram metrics - https://phabricator.wikimedia.org/T383563#10563781 (lmata)
[15:29:00] Data-Engineering: Migrate analytics Airflow DAGs to k8s Airflow deployment - https://phabricator.wikimedia.org/T386282#10563934 (mforns) I also like `main`! Other brainstorming ideas: - `shared` - `collective` I'd thought of `common`, but discarded it because of the similarity with `commons`, which has bee...
[15:31:05] Data-Engineering (Q3 2024 January 1st - March 31th): Migrate analytics Airflow DAGs to k8s Airflow deployment - https://phabricator.wikimedia.org/T386282#10563959 (Ahoelzl) p:Triage→High a:amastilovic
[15:34:20] Data-Engineering (Q3 2024 January 1st - March 31th): Migrate analytics Airflow DAGs to k8s Airflow deployment - https://phabricator.wikimedia.org/T386282#10563985 (mforns) One question about the kubernetes test instance (not related to instance naming). Could it maybe cause trouble when 2 or more people are...
[15:47:21] Data-Engineering (Q3 2024 January 1st - March 31th), MediaWiki-DomainEvents, MW-Interfaces-Team, Epic: DomainEvents - Broadcasting and receiving cross-process events - https://phabricator.wikimedia.org/T379935#10564040 (Ottomata) Okay thanks. I have more questions, but now we are getting in to j...
[15:57:39] Data-Engineering, Patch-For-Review: [eventstreams] Fix event detail yaml error - https://phabricator.wikimedia.org/T386750#10564075 (mforns) This has been deployed, thanks @tchin!
[16:22:42] Data-Engineering (Q3 2024 January 1st - March 31th): Migrate analytics Airflow DAGs to k8s Airflow deployment - https://phabricator.wikimedia.org/T386282#10564247 (mforns) Regarding the migration options, I like that option 1 is much more risk-free. If we go that way, I think we should make an active effort...
[17:27:00] Data-Engineering, Data-Engineering-Radar, BDC-Implementation, Data-Platform-SRE, Epic: EPIC: Trino/minIO/Hive-Standalone-Metaserver/Dagster/Metabase/Superset Implementation - https://phabricator.wikimedia.org/T377362#10564632 (Jgreen)
[17:39:54] Data-Engineering, Data-Platform-SRE, Epic: Upgrade Hadoop to version 3.3.6 and Hive to version 4.0.1 - https://phabricator.wikimedia.org/T379385#10564705 (xcollazo) Just bumped into this doc so wanted to share: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75978150#AdminManualMetas...
[18:00:35] Data-Engineering (Q3 2024 January 1st - March 31th), Patch-For-Review: [eventstreams] Fix event detail yaml error - https://phabricator.wikimedia.org/T386750#10564858 (mforns)
[18:06:34] Data-Engineering, Cassandra, Commons-Impact-Metrics: Recreate top-based Cassandra tables for Commons Impact Metrics - https://phabricator.wikimedia.org/T374268#10564883 (mforns)
[18:06:34] Data-Engineering, Commons-Impact-Metrics, Patch-For-Review: [CIM] Skewed ranking with the top Editors monthly API - https://phabricator.wikimedia.org/T370470#10564884 (mforns)
[18:19:21] Data-Engineering (Q3 2024 January 1st - March 31th): List out all migration candidates for mediawiki_content_history - https://phabricator.wikimedia.org/T386757#10564968 (Ahoelzl) Looks like there is parallel work going on: https://docs.google.com/document/d/1p_7DzX5UfUbpUGau0sJ_u3TGEROn4vJoaJy8kpm0_Ec/edit?...
[18:31:26] Data-Engineering (Q3 2024 January 1st - March 31th), Product-Analytics, Event-Platform: Enable Event Platform streams to opt out of collecting User-Agent data - https://phabricator.wikimedia.org/T382173#10565033 (Ottomata) BTW, I just came across this very related old task {T263466}
[18:59:43] Data-Engineering (Q3 2024 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862 (Ahoelzl) NEW
[19:01:50] Data-Engineering (Q3 2024 January 1st - March 31th), DPE-Data-Platform-related-Mediawiki-Content-data, Data-Platform (Data Platform Ops Week Working Group), Data-Platform-SRE (2025.02.10 - 2025.02.28), and 2 others: DAG failing due to failure to ac... - https://phabricator.wikimedia.org/T386114#10565285
[19:04:03] Data-Engineering, Commons-Impact-Metrics: Commons Impact Metrics "all-wikis" problem - https://phabricator.wikimedia.org/T382731#10565293 (mforns) Thanks @GFontenelle_WMF for the report! This is indeed a bug. The wiki should be there.
[19:06:02] Data-Engineering, Commons-Impact-Metrics: Commons Impact Metrics "all-wikis" problem - https://phabricator.wikimedia.org/T382731#10565299 (mforns) Ha, actually, we identified this problem some months ago, in this task T372805. I will close this task as a duplicate and make sure all subscribers are transf...
[19:07:04] Data-Engineering, Commons-Impact-Metrics: Commons Impact Metrics "all-wikis" problem - https://phabricator.wikimedia.org/T382731#10565302 (mforns) →Duplicate dup:T372805
[19:07:05] Data-Engineering, Commons-Impact-Metrics: [Commons Impact Metrics] Add page wiki to the corresponding top endpoints - https://phabricator.wikimedia.org/T372805#10565304 (mforns)
[19:10:20] Data-Engineering (Q3 2024 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10565315 (Ahoelzl)
[19:30:01] Data-Engineering, AQS2.0, Commons-Impact-Metrics: [Commons Impact Metrics] Usage monitoring for commons-analytics AQS service - https://phabricator.wikimedia.org/T371133#10565374 (mforns)
[19:40:06] Data-Engineering (Q3 2024 January 1st - March 31th), Growth-Structured-Tasks, Growth-Team, Image-Suggestions, and 6 others: wmf.wikidata_item_page_link and wmf.wikidata_entity snapshots stuck at 2025-01-20 - https://phabricator.wikimedia.org/T386255#10565400 (xcollazo) Got it, thanks for the cont...
[22:58:12] Data-Engineering (Q3 2024 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10565959 (Ahoelzl)
[22:58:12] Data-Engineering-Roadmap, Epic: Data Platform Data Lineage - https://phabricator.wikimedia.org/T377789#10565960 (Ahoelzl)
[23:01:45] Data-Engineering (Q3 2024 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10565976 (Ahoelzl)
[23:02:01] Data-Engineering (Q3 2024 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10565977 (Ahoelzl) Planning document: https://docs.google.com/spreadsheets/d/1QduhKl4pK0kgeRPNnXMNwk0cCVlhyq5q-l4k3Ka4o84/edit?gid=0#gid=0
[23:23:43] Data-Engineering (Q3 2024 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10566030 (Ahoelzl)