[06:00:57] Data-Engineering (Q2 2024 October 1st - December 31th), DBA, Schema-change-in-production: Drop deprecated abuse filter fields on wmf wikis - https://phabricator.wikimedia.org/T367781#10245083 (ABran-WMF) >>! In T367781#10237762, @ABran-WMF wrote: > ["db2155", "db2172", "db2219"] were missing on s4, d...
[06:14:41] Data-Engineering (Q2 2024 October 1st - December 31th), DBA, Schema-change-in-production: Drop deprecated abuse filter fields on wmf wikis - https://phabricator.wikimedia.org/T367781#10245090 (ABran-WMF)
[08:23:18] Data-Engineering (Q2 2024 October 1st - December 31th), Patch-For-Review: [Refine Refactoring] Refine jobs should be scheduled by Airflow: deployment - https://phabricator.wikimedia.org/T369845#10245237 (Antoine_Quhen) Last progresses: - We are now refining the whole set of streams in staging, and it's 9...
[08:52:13] Data-Engineering (Q2 2024 October 1st - December 31th), DBA, Schema-change-in-production: Drop deprecated abuse filter fields on wmf wikis - https://phabricator.wikimedia.org/T367781#10245326 (ABran-WMF)
[09:12:15] (PS1) Aqu: Update Refine smtp server - backport to 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081914 (https://phabricator.wikimedia.org/T325394)
[09:14:49] Data-Engineering, Data-Platform-SRE: Refine jobs fail to send alert emails - https://phabricator.wikimedia.org/T377698 (gmodena) NEW
[09:16:23] (CR) Gmodena: [C:+1] Update Refine smtp server - backport to 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081914 (https://phabricator.wikimedia.org/T325394) (owner: Aqu)
[09:17:12] (PS2) Gmodena: Update Refine smtp server - backport to 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081914 (https://phabricator.wikimedia.org/T377698) (owner: Aqu)
[09:32:00] (PS2) Aqu: Event deduplication via windowing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080306 (https://phabricator.wikimedia.org/T369845)
[09:32:06] (CR) CI reject: [V:-1] Event deduplication via windowing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080306 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[09:43:00] (PS3) Aqu: Event deduplication via windowing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080306 (https://phabricator.wikimedia.org/T369845)
[09:56:28] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Refine jobs fail to send alert emails - https://phabricator.wikimedia.org/T377698#10245643 (gmodena)
[09:56:39] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Refine jobs fail to send alert emails - https://phabricator.wikimedia.org/T377698#10245645 (gmodena) a:gmodena
[10:20:00] (PS4) Aqu: Event deduplication via windowing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080306 (https://phabricator.wikimedia.org/T369845)
[10:29:41] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, Patch-For-Review: Refine jobs fail to send alert emails - https://phabricator.wikimedia.org/T377698#10245754 (gmodena)
[12:10:43] Data-Engineering, Data-Platform-SRE (2024.10.19 - 2024.11.08), Discovery-Search (Current work): Unable to find ingested tables in datahub - https://phabricator.wikimedia.org/T376657#10245990 (BTullis) Oh dear, I seem to have caused some kind of problem with DataHub. In light of the missing tables, I...
[12:24:23] Data-Engineering, MediaWiki-extensions-EventLogging, ci-test-error, MW-1.43-notes (1.43.0-wmf.28; 2024-10-22): Failure in ContextAttributesFactoryTest::testAgentContextAttributes - https://phabricator.wikimedia.org/T377673#10246024 (Reedy) Open→Resolved a:Reedy
[12:37:34] Data-Engineering, Data Products, Data-Platform, Movement-Insights, and 2 others: Temporary Accounts Initiative (IP Masking) - Add user_is_temp to data tables - https://phabricator.wikimedia.org/T356701#10246073 (fkaelin) To follow up my previous comment: > Another point for discussion: the mediaw...
[13:55:39] (PS1) Aqu: Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1081979 (https://phabricator.wikimedia.org/T369845)
[13:55:46] (CR) CI reject: [V:-1] Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1081979 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[13:56:02] (Abandoned) Aqu: Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1081979 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[14:00:58] (PS2) Aqu: Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845)
[14:09:48] Data-Engineering: [Iceberg Migration] Extend Iceberg table maintenance mechanism to support multiple Airflow instances - https://phabricator.wikimedia.org/T373693#10246508 (xcollazo) As part of {T370898} the structured data team will be needing support for automatic table maintenance for their new Iceberg ta...
[14:20:13] Data-Engineering (Q2 2024 October 1st - December 31th), Patch-For-Review: [Refine Refactoring] Refine jobs should be scheduled by Airflow: deployment - https://phabricator.wikimedia.org/T369845#10246550 (Ottomata) For reference about deploying backport: https://wikitech.wikimedia.org/wiki/Data_Platform/...
[14:27:39] Data-Engineering (Q2 2024 October 1st - December 31th), Data Pipelines, Data-Catalog, Patch-For-Review: Integrate Spark with DataHub with lineage - https://phabricator.wikimedia.org/T306896#10246588 (tchin)
[14:49:29] Data-Engineering, Data Pipelines: Refine Data Quality - late events, RefineMonitor refactor, etc. - https://phabricator.wikimedia.org/T377739 (Ottomata) NEW
[14:49:48] Data-Engineering, Data Pipelines: [Refine Refactoring] Refine Data Quality - late events, RefineMonitor refactor, etc. - https://phabricator.wikimedia.org/T377739#10246781 (Ottomata)
[14:56:24] Data-Engineering, Data-Platform-SRE (2024.10.19 - 2024.11.08), Patch-For-Review: Design a suitable DAG deployment method - https://phabricator.wikimedia.org/T368033#10246795 (brouberol) We have deployed changes to the `airflow` k8s chart ensuring that DAG changes are automatically pulled and serializ...
[15:02:56] (CR) Ottomata: "Fine to merge this, but it shoudln't be needed as a backport. This param can be set by puppet." [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081914 (https://phabricator.wikimedia.org/T377698) (owner: Aqu)
[15:03:18] (CR) Gmodena: [C:-1] Update Refine smtp server - backport to 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081914 (https://phabricator.wikimedia.org/T377698) (owner: Aqu)
[15:03:50] (CR) Gmodena: [C:-1] "let's set via puppet if feasible. I -1ed to avoid accidental merges." [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081914 (https://phabricator.wikimedia.org/T377698) (owner: Aqu)
[15:20:05] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0 (Kanban Board), Patch-For-Review: Flink job to enrich reconciliation events - https://phabricator.wikimedia.org/T368787#10246945 (gmodena)
[15:41:49] Data-Engineering (Q2 2024 October 1st - December 31th): Airflow should alert on task failure only after exhausting retries - https://phabricator.wikimedia.org/T377745 (gmodena) NEW
[15:52:48] (PS3) Aqu: Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845)
[15:53:13] (CR) CI reject: [V:-1] Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[15:54:05] (PS4) Aqu: Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845)
[15:54:28] (CR) CI reject: [V:-1] Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[15:56:29] (PS5) Aqu: Event deduplication via windowing - backport on 0.2.49 [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845)
[16:00:07] (CR) Aleksandar Mastilovic: [V:+2 C:+2] "LGTM!" [analytics/refinery/source] (0.2.49) - https://gerrit.wikimedia.org/r/1081164 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[16:05:11] Starting build #21 for job analytics-refinery-maven-release
[16:29:08] Project analytics-refinery-maven-release build #21: SUCCESS in 23 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/21/
[17:24:08] Data-Engineering, Event-Platform, MW-1.43-notes (1.43.0-wmf.27; 2024-10-15), Patch-For-Review, and 2 others: Delete redundant mobile- and desktopwebuiactions event in WikimediaEvents - https://phabricator.wikimedia.org/T376065#10247666 (ovasileva) a:ovasileva
[17:24:23] Data-Engineering, Event-Platform, Web Team Essential Work 2024 (Migrate to new Event Platform), Web-Team-Backlog (FY2024-25 Q2 Sprint 2): Deprecate use of desktop- and mobilewebuiactions in Event Platform - https://phabricator.wikimedia.org/T368678#10247667 (ovasileva) a:ovasileva
[17:51:00] !log Deployed latest DAGs to Airflow analytics instance to pickup T375402.
[17:51:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:51:03] T375402: Tune Dumps 2.0 hourly ingestion jobs - https://phabricator.wikimedia.org/T375402
[18:05:17] Data-Engineering (Q2 2024 October 1st - December 31th), Patch-For-Review: Airflow should alert on task failure only after exhausting retries - https://phabricator.wikimedia.org/T377745#10247862 (gmodena) a:gmodena
[18:05:38] Starting build #18 for job analytics-refinery-update-jars
[18:07:46] (PS1) Maven-release-user: Add refinery-source jars for v0.2.49.2 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1082063
[18:07:46] Project analytics-refinery-update-jars build #18: SUCCESS in 2 min 7 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars/18/
[20:17:55] Quarry: [bug] Quarry queries are stopped - https://phabricator.wikimedia.org/T377010#10248327 (rook) It is possible that you were encountering the three hour time limit for analytics searches. If there was some lag it could have increased your query time from what looks like an hour to later. I'm unsure of h...
[20:18:06] Quarry: [bug] Quarry queries are stopped - https://phabricator.wikimedia.org/T377010#10248328 (rook) Open→Declined
[20:22:43] (Abandoned) Aqu: Add refinery-source jars for v0.2.51 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1077070 (owner: Maven-release-user)
[20:23:25] Starting build #19 for job analytics-refinery-update-jars
[20:25:12] (PS1) Maven-release-user: Add refinery-source jars for v0.2.49.2 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1082083
[20:25:12] Project analytics-refinery-update-jars build #19: SUCCESS in 1 min 46 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars/19/
[20:48:38] (Abandoned) Aqu: Add refinery-source jars for v0.2.49.2 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1082063 (owner: Maven-release-user)
[21:00:37] (PS2) Aqu: Add refinery-source jars for v0.2.49.2 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1082083 (owner: Maven-release-user)
[22:44:46] Data-Engineering, Epic: [Epic] Data Platform Data Lineage - https://phabricator.wikimedia.org/T377789 (Ahoelzl) NEW
[22:53:23] Data-Engineering, Epic: [Epic] Data Platform Data Lineage - https://phabricator.wikimedia.org/T377789#10248653 (Ahoelzl)
[22:53:26] Data-Engineering, Epic: All things DataHub - https://phabricator.wikimedia.org/T369756#10248654 (Ahoelzl)
[22:54:36] Data-Engineering, Epic: [Epic] Data Platform Data Lineage - https://phabricator.wikimedia.org/T377789#10248660 (Ahoelzl)
[22:54:38] Data-Engineering (Q2 2024 October 1st - December 31th), Data Pipelines, Data-Catalog, Patch-For-Review: Integrate Spark with DataHub with lineage - https://phabricator.wikimedia.org/T306896#10248659 (Ahoelzl)
[22:54:38] Data-Engineering (Q2 2024 October 1st - December 31th): [SPIKE] Define process to build out lineage in DataHub - https://phabricator.wikimedia.org/T369758#10248661 (Ahoelzl)
[22:54:40] Data-Engineering, Data-Platform-SRE, Epic: Data Catalog MVP - https://phabricator.wikimedia.org/T299910#10248662 (Ahoelzl)
[22:55:33] Data-Engineering, Epic: [Epic] Data Platform Data Lineage - https://phabricator.wikimedia.org/T377789#10248665 (Ahoelzl)
[22:55:34] Data-Engineering (Q2 2024 October 1st - December 31th): [SPIKE] Define process to build out lineage in DataHub - https://phabricator.wikimedia.org/T369758#10248664 (Ahoelzl)
[22:55:35] Data-Engineering, Data Products: Public DataHub - https://phabricator.wikimedia.org/T366720#10248666 (Ahoelzl)
[22:56:04] Data-Engineering, Epic: Data Platform Data Lineage - https://phabricator.wikimedia.org/T377789#10248667 (Ahoelzl)
[22:58:56] Data-Engineering (Q2 2024 October 1st - December 31th), Data Pipelines, Data-Catalog, Patch-For-Review: Integrate Spark with DataHub with lineage - https://phabricator.wikimedia.org/T306896#10248669 (Ahoelzl)