[03:36:28] Data-Engineering, Data-Catalog: Emit lineage information about Airflow jobs to DataHub - https://phabricator.wikimedia.org/T312566#10232357 (Milimetric) Open→Declined This is being done in other ways by very clever Airflow / Spark tools.
[08:40:41] Data-Engineering, Data-Platform, Dumps-Generation, Trust and Safety Product Team, and 3 others: Hide autoblocks from the globalblocks table database dump - https://phabricator.wikimedia.org/T376726#10232749 (kostajh) a:kostajh
[09:38:08] Data-Engineering, Event-Platform, Patch-For-Review: [Event Platform] Declare webrequest as an Event Platform stream - https://phabricator.wikimedia.org/T314956#10233002 (gmodena) @Ottomata @Ahoelzl unless you have any objection, I'd like to resolve this task. We agreed not to declare webrequest_fron...
[10:25:31] Data-Engineering, Data-Platform, Dumps-Generation, Trust and Safety Product Team, and 3 others: Hide autoblocks from the globalblocks table database dump - https://phabricator.wikimedia.org/T376726#10233103 (kostajh) Now that the patches are merged, I see that https://dumps.wikimedia.org/other/gl...
[12:52:18] Data-Engineering, Dumps 2.0, Discovery-Search (Current work), Event-Platform, Patch-For-Review: [SPIKE] how can we support Spark producer/consumers in Event Platform - https://phabricator.wikimedia.org/T374341#10233359 (pfischer) > Were you able to get it working with .writeStream() after all...
[13:21:08] (PS1) Gehel: playing with extraction of RefineHelper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080706
[13:21:30] (CR) Gehel: [C:-1] "Just as an example, do not merge!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080706 (owner: Gehel)
[14:09:01] Data-Engineering (Q2 2024 October 1st - December 31th): Set up Alerting for Data Quality dags in Airflow. - https://phabricator.wikimedia.org/T377333 (Snwachukwu) NEW
[14:09:33] Data-Engineering (Q2 2024 October 1st - December 31th): Set up Alerting for Data Quality dags in Airflow. - https://phabricator.wikimedia.org/T377333#10233700 (Snwachukwu) a:Snwachukwu
[14:16:26] (CR) Xcollazo: [C:+2] Update the smtp server settings for email from refine [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1079529 (https://phabricator.wikimedia.org/T325394) (owner: Btullis)
[14:31:23] (Merged) jenkins-bot: Update the smtp server settings for email from refine [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1079529 (https://phabricator.wikimedia.org/T325394) (owner: Btullis)
[14:44:44] Data-Engineering, EventStreams, Observability-Tracing, Prod-Kubernetes, and 3 others: eventstreams regularly uses more than 95% of its memory limit - https://phabricator.wikimedia.org/T357005#10233884 (lmata) Moving to radar, please let us know if you need assistance.
[15:00:13] (CR) Xcollazo: [C:+1] "Here is what I think needs done, please let me know if it makes sense?" [analytics/refinery] - https://gerrit.wikimedia.org/r/1078733 (https://phabricator.wikimedia.org/T375527) (owner: Milimetric)
[15:04:12] (CR) Xcollazo: [C:+1] "(As discussed elsewhere, we will leave this patch out for now, to be delivered on next train.)" [analytics/refinery] - https://gerrit.wikimedia.org/r/1078733 (https://phabricator.wikimedia.org/T375527) (owner: Milimetric)
[15:10:31] Data-Engineering, Data-Platform-SRE (2024.09.28 - 2024.10.18), Discovery-Search (Current work): Unable to find ingested tables in datahub - https://phabricator.wikimedia.org/T376657#10234065 (brouberol) I've also had the same experience with Airflow DAG/task objects. - Some can be found just fine ([...
[15:13:50] Data-Engineering, Research, Structured-Data-Backlog: Make HTML Dumps available in hadoop - https://phabricator.wikimedia.org/T305688#10234095 (fkaelin) Summarizing my take-away from this [[ https://wikimedia.slack.com/archives/CSV483812/p1728572896790659 | slack thread ]] about how to use html datase...
[15:14:14] Data-Engineering, Data-Platform-SRE (2024.09.28 - 2024.10.18), Discovery-Search (Current work): Unable to find ingested tables in datahub - https://phabricator.wikimedia.org/T376657#10234102 (brouberol) Actually, this [daily DAG](https://airflow-test-k8s.wikimedia.org/dags/cleanup_airflow_db/grid,) i...
[15:15:56] (PS1) Xcollazo: changelog: update for v0.2.52 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080740
[15:16:15] (CR) Xcollazo: [V:+2 C:+2] changelog: update for v0.2.52 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1080740 (owner: Xcollazo)
[15:17:32] Starting build #20 for job analytics-refinery-maven-release
[15:38:29] Project analytics-refinery-maven-release build #20: SUCCESS in 20 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release/20/
[15:43:52] Quarry: [bug] Quarry queries are stopped - https://phabricator.wikimedia.org/T377010#10234228 (Prototyperspective) Seems like much less or no issues now. In any case, please add some info when queries are stopped.
[15:52:13] Data-Engineering (Q2 2024 October 1st - December 31th), Movement-Insights: Temporarily Extend Retention Window for webrequest tables - https://phabricator.wikimedia.org/T375943#10234252 (Ahoelzl)
[15:53:31] Data-Engineering (Q2 2024 October 1st - December 31th), CheckUser, Privacy Engineering: Add cu_log_event and cu_private_event CheckUser tables to data lake - https://phabricator.wikimedia.org/T376752#10234249 (Ahoelzl)
[15:58:37] Starting build #17 for job analytics-refinery-update-jars
[16:00:33] (PS1) Maven-release-user: Add refinery-source jars for v0.2.52 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1080756
[16:00:33] Project analytics-refinery-update-jars build #17: SUCCESS in 1 min 55 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars/17/
[16:40:00] Data-Engineering, Event-Platform, Web Team Essential Work 2024 (Migrate to new Event Platform), Web-Team-Backlog (FY2024-25 Q2 Sprint 2): Deprecate use of desktop- and mobilewebuiactions in Event Platform - https://phabricator.wikimedia.org/T368678#10234633 (KSarabia-WMF) a:Jdlrobson→Edtad...
[16:40:24] Data-Engineering, Internet-Archive, The-Wikipedia-Library, Event-Platform, Patch-Needs-Improvement: page-links-change stream is assigning template propagation events to the wrong edits - https://phabricator.wikimedia.org/T216504#10234644 (Pppery)
[16:41:21] (CR) Xcollazo: [V:+2 C:+2] Add refinery-source jars for v0.2.52 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1080756 (owner: Maven-release-user)
[16:49:46] !log About to deploy regular analytics train
[16:49:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:29:08] !log Deployed refinery using scap, then deployed onto hdfs
[17:29:10] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:30:39] Data-Engineering, Data-Platform, DBA, Schema-change-in-production: Change page.page_links_updated to fixed-length timestamp in wmf wikis - https://phabricator.wikimedia.org/T371742#10235015 (Ladsgroup)
[17:48:54] Data-Engineering, MediaWiki-Core-Hooks, MW-Interfaces-Team, Event-Platform, Patch-For-Review: Implement DomainEventDispatcher (baseline) - https://phabricator.wikimedia.org/T377229#10235067 (Ottomata)
[18:43:56] Data-Engineering (Q2 2024 October 1st - December 31th), CheckUser, Privacy Engineering: Add cu_log_event and cu_private_event CheckUser tables to data lake - https://phabricator.wikimedia.org/T376752#10235378 (mpopov) @Tgr @Dreamy_Jazz: I think it would be good to document the use case(s) that motiva...
[18:52:48] Data-Engineering (Q2 2024 October 1st - December 31th), CheckUser, Privacy Engineering: Add cu_log_event and cu_private_event CheckUser tables to data lake - https://phabricator.wikimedia.org/T376752#10235396 (Dreamy_Jazz) >>! In T376752#10235378, @mpopov wrote: > @Tgr @Dreamy_Jazz: I think it would...
[19:57:15] Data-Engineering, Event-Platform, Patch-For-Review: [Event Platform] Declare webrequest as an Event Platform stream - https://phabricator.wikimedia.org/T314956#10235572 (Ottomata) Let's not close the task yet. I think there may be ways to get a partial support for this. E.g. having a schema and str...
[22:39:39] Data-Engineering, Product-Analytics, Wmfdata-Python: Support importing a Parquet file into HDFS using wmfdata-python - https://phabricator.wikimedia.org/T273196#10236111 (nshahquinn-wmf)