[00:36:09] Data-Engineering, Data Products: Modify ClickStreamBuilder pipeline to cope with pagelinks schema changes - https://phabricator.wikimedia.org/T355588#9585896 (lbowmaker) @xcollazo - we moved this out of scope for our current sprint so we could focus on the sqoop job: https://phabricator.wikimedia.org/T34...
[01:23:40] Data-Engineering: Update AQS API automatically with new content data beginning of each month - https://phabricator.wikimedia.org/T348792#9585932 (nshahquinn-wmf)
[06:30:53] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[07:05:53] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[07:28:18] Data-Engineering, Data Products: Modify ClickStreamBuilder pipeline to cope with pagelinks schema changes - https://phabricator.wikimedia.org/T355588#9586160 (JAllemandou) Indeed, the job will not be affected with next month changes: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blam...
[09:00:12] wow, looks like bots CR are away :(
[09:11:46] (CR) Joal: [V: +2] "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/1007413 (https://phabricator.wikimedia.org/T345771) (owner: Joal)
[09:25:24] It's probably related to https://phabricator.wikimedia.org/T357729
[09:25:44] !log decommissioning an-tool1005 now that superset-next is migrated to k8s - T358706
[09:25:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:25:48] T358706: Decommission an-tool1005 - https://phabricator.wikimedia.org/T358706
[09:29:10] (CR) Mforns: [C: +1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/1007413 (https://phabricator.wikimedia.org/T345771) (owner: Joal)
[09:29:16] Data-Engineering, 2024.03.04 - 2024.03.24: Update the From: addresses of all email from DPE pipelines so that they use routable addresses - https://phabricator.wikimedia.org/T358675#9586354 (Gehel) p:Triage→High
[09:29:33] Data-Engineering, Data-Platform-SRE: Cleanup superset related resources from puppet - https://phabricator.wikimedia.org/T358570#9586363 (Gehel) p:Triage→Medium
[09:29:47] Data-Engineering, 2024.02.12 - 2024.03.03: Migrate bare-metal superset services over to Kubernetes - https://phabricator.wikimedia.org/T358569#9586362 (Gehel) p:Triage→High
[09:56:09] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/1007413 (https://phabricator.wikimedia.org/T345771) (owner: Joal)
[09:59:37] !log Deploying refinery with scap (fix sqoop for tomorrow)
[09:59:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:33:37] Data-Engineering, 2024.03.04 - 2024.03.24: Update the From: addresses of all email from DPE pipelines so that they use routable addresses - https://phabricator.wikimedia.org/T358675#9586542 (BTullis) a:BTullis
[10:48:52] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710#9586568 (brouberol)
[10:54:53] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[11:18:07] Data-Engineering, Data-Platform-SRE, Epic: Re-evaluate data cache security issues, along with per user data cachin - https://phabricator.wikimedia.org/T358753 (brouberol)
[11:27:45] Data-Engineering, 2024.03.04 - 2024.03.24, Patch-For-Review: Update the From: addresses of all email from DPE pipelines so that they use routable addresses - https://phabricator.wikimedia.org/T358675#9586664 (BTullis) I have started writing the patches to permit this overriding of the sender address...
[11:31:51] Data-Engineering, Data-Platform-SRE, Epic, Patch-For-Review: Re-evaluate data cache security issues, along with per user data caching - https://phabricator.wikimedia.org/T358753#9586665 (brouberol)
[11:36:03] Data-Engineering, Data-Platform-SRE, Epic, Patch-For-Review: Re-evaluate data cache security issues, along with per user data caching - https://phabricator.wikimedia.org/T358753#9586679 (BTullis) I think that this can be merged with (or into?) {T273850} That was the original ticket where we atte...
[11:43:14] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710#9586691 (brouberol)
[11:44:11] Data-Engineering, Data-Platform-SRE, Epic, Patch-For-Review: Re-evaluate data cache security issues, along with per user data caching - https://phabricator.wikimedia.org/T358753#9586689 (brouberol) Open→Declined Agreed, let's merge.
[11:49:24] Data-Engineering, WMF-JobQueue, serviceops, Unstewarded-production-error, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9586715 (Clement_Goubert) >>! In T249745#9583374, @gmodena wrote: > Hey @Clement_Gou...
[12:00:21] (CR) Joal: "thank you for reviews folks. I think I covered most of the comments, don't hesitate to add more :)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1003745 (https://phabricator.wikimedia.org/T356363) (owner: Joal)
[12:01:26] (PS15) Joal: Extract RefineSingleApp code from Refine [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1003745 (https://phabricator.wikimedia.org/T356363)
[12:02:24] Data-Engineering, Data Pipelines, SRE, Traffic-Icebox: Mobile redirects drop provenance parameters - https://phabricator.wikimedia.org/T252227#9586760 (dr0ptp4kt) Hi team - @lbowmaker asked if I could take a look at this and provide some context. I was having a think on this, and I'd like to pond...
[12:42:39] Data-Engineering, WMF-JobQueue, serviceops, Unstewarded-production-error, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9586820 (Joe) When we're talking about errors, it's always a good idea to reason in...
[13:14:53] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[13:55:04] Data-Engineering, 2024.02.12 - 2024.03.03: Check home/HDFS leftovers of nickifeajika - https://phabricator.wikimedia.org/T354241#9586982 (fkaelin) These directories can be removed both on the stat clients and hdfs. Thanks!
[14:06:19] !log sudo systemctl reset-failed refinery-sqoop-whole-mediawiki.service
[14:06:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:09:56] Data-Engineering, 2024.02.12 - 2024.03.03: Check home/HDFS leftovers of nickifeajika - https://phabricator.wikimedia.org/T354241#9587053 (brouberol) Open→Resolved ` brouberol@an-master1003:~$ sudo kerberos-run-command hdfs hdfs dfs -ls /user/nickifeajika Found 6 items drwx------ - nickifeajika...
[14:16:22] Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, serviceops, and 3 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9587080 (MSantos)
[15:18:20] (CR) Xcollazo: "A couple questions below." [analytics/refinery] - https://gerrit.wikimedia.org/r/1007413 (https://phabricator.wikimedia.org/T345771) (owner: Joal)
[15:28:00] (CR) Sbisson: "@conn" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1006060 (https://phabricator.wikimedia.org/T343183) (owner: Sbisson)
[15:30:01] Data-Engineering, Data Products: Modify ClickStreamBuilder pipeline to cope with pagelinks schema changes - https://phabricator.wikimedia.org/T355588#9587435 (xcollazo) Ah, got it! Thank you both!
[15:32:28] (CR) Sbisson: "Unable to add cchen@wikimedia.org as a reviewer" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1006060 (https://phabricator.wikimedia.org/T343183) (owner: Sbisson)
[15:32:35] Analytics, Data-Engineering, MediaWiki-extensions-EventLogging: uBlock blocks EventLogging - https://phabricator.wikimedia.org/T186572#9587462 (TheresNoTime)
[15:55:46] Data-Engineering, Event-Platform: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake? - https://phabricator.wikimedia.org/T345195#9587592 (xcollazo) This work will be critical for productionizing #dumps_2.0. Linking to Slack thread from a couple months ago,...
[16:22:09] (CR) Joal: [V: +2 C: +2] Fix sqoop for pagelinks normalization migration (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1007413 (https://phabricator.wikimedia.org/T345771) (owner: Joal)
[16:30:58] Data-Engineering, Data-Platform-SRE, Scap: analytics/refinery: Stop using git-fat - https://phabricator.wikimedia.org/T328472#9587781 (dancy) >>! In T328472#9585732, @dancy wrote: > Hi folks. scaps git-lfs support has been fixed so we can migrate analytics/refinery to git-lfs. To enable LFS for thi...
[16:33:39] Data-Engineering, Data-Platform-SRE, Scap: analytics/refinery: Stop using git-fat - https://phabricator.wikimedia.org/T328472#9587797 (dancy) Btw, I notice that there are many versions of some artifacts stored in the repository. Are they all used at runtime? If not, it would be better to only inclu...
[17:02:52] (PS16) Joal: Extract RefineSingleApp code from Refine [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1003745 (https://phabricator.wikimedia.org/T356363)
[17:03:09] gmodena: I have removed a function that was doing nothing --^ :)
[17:03:28] joal ack
[17:03:41] And tried to apply your previous comments
[17:11:16] (CR) Xcollazo: Fix sqoop for pagelinks normalization migration (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1007413 (https://phabricator.wikimedia.org/T345771) (owner: Joal)
[18:49:29] (PS3) Aleksandar Mastilovic: Add HQL query files for the "pingback" report [analytics/refinery] - https://gerrit.wikimedia.org/r/1006970
[19:02:20] Data-Engineering, Data-Platform-SRE, Scap, Now this 🫠: analytics/refinery: Stop using git-fat - https://phabricator.wikimedia.org/T328472#9588544 (dancy) p:Triage→Medium a:dancy
[19:11:53] (PS1) Ahmon Dancy: Switch from git-fat to git-lfs [analytics/refinery] - https://gerrit.wikimedia.org/r/1007690 (https://phabricator.wikimedia.org/T328472)
[19:42:54] (PS2) Ahmon Dancy: Switch from git-fat to git-lfs [analytics/refinery] - https://gerrit.wikimedia.org/r/1007690 (https://phabricator.wikimedia.org/T328472)
[19:45:01] (PS1) Ahmon Dancy: scap.cfg: Use git-lfs instead of git-fat [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/1007692 (https://phabricator.wikimedia.org/T328472)
[19:57:21] Data-Engineering, Data-Platform-SRE, SRE, serviceops, Event-Platform: DRY kafka broker declaration in helmfiles - https://phabricator.wikimedia.org/T253058#9588704 (Ottomata) +1, or add this as a subtask of that? Either good with me!
[20:11:41] (CR) Ottomata: Extract RefineSingleApp code from Refine (7 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1003745 (https://phabricator.wikimedia.org/T356363) (owner: Joal)