[04:06:38] Analytics, Event-Platform, Observability-Logging, SRE, and 3 others: Integrate Event Platform and ECS logs - https://phabricator.wikimedia.org/T291645 (lmata)
[05:20:01] Analytics-Radar, Event-Platform, WMF-JobQueue, Wikibase change dispatching scripts to jobs, and 2 others: Queuing jobs is extremely slow - https://phabricator.wikimedia.org/T292048 (Ladsgroup) I don't know the code well enough to judge but I ask for a double check when you have time. Adding the...
[07:25:11] Analytics-Data-Quality, WMDE-TechWish, WMDE-Templates-FocusArea, WMDE-TechWish-Sprint-2021-09-29: Check whether VE template dialog and Template Wizard metrics are healthy - https://phabricator.wikimedia.org/T292045 (awight) a:awight
[07:52:46] Analytics-Radar, Patch-For-Review: Update ROCm version on GPU instances. - https://phabricator.wikimedia.org/T287267 (elukey) To keep archives happy: * We decided to target ROCm 4.3.1 (current latest upstream) and tensorflow-rocm 2.6. * Instead of rolling out the packages on stat100[5,8], we started fro...
[08:04:52] Analytics-Radar, Patch-For-Review: Update ROCm version on GPU instances. - https://phabricator.wikimedia.org/T287267 (elukey) https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/issues/1461
[08:52:47] joal: I think that we are going to have to remove the source snapshots on the aqs cluster for `pageviews_per_article_flat/data` - aqs1004 is at 97% of capacity.
[08:53:08] https://www.irccloud.com/pastebin/mpkmnto4/
[08:53:36] We have transferred all 12 of these to the aqs_next cluster.
[08:55:17] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) I believe that we will have to remove the source snapshots from the aqs100x servers, because there is not enough av...
[09:32:49] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) 33% of the way through the second snapshot loading operation.
[10:20:30] !log upgrade ROCm to 4.2 on stat1008
[10:20:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:37:19] will do stat1005 after lunch, waiting for a green light for some tests!
[10:56:13] hi all! I am able to connect to the stat machines via bast3005 but not bast1003 - am I missing something? :D
[11:33:19] btullis: Please do - if any issues happen we'll rebuild one
[11:34:21] btullis: as a reminder, presto-test-1 is still alerting for puppet update failures
[11:48:15] Analytics-Data-Quality, WMDE-TechWish, WMDE-Templates-FocusArea, WMDE-TechWish-Sprint-2021-09-29: Check whether VE template dialog and Template Wizard metrics are healthy - https://phabricator.wikimedia.org/T292045 (awight) VE Template Dialog panels were broken because the `$edit_count` variable...
[11:50:19] Analytics-Data-Quality, WMDE-TechWish, WMDE-Templates-FocusArea, WMDE-TechWish-Sprint-2021-09-29: Check whether VE template dialog and Template Wizard metrics are healthy - https://phabricator.wikimedia.org/T292045 (awight) TemplateWizard turns out to not be broken.
[13:20:24] !log btullis@aqs1004:~$ sudo nodetool-a clearsnapshot
[13:20:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:33:41] majavah: ack - Sorry. Will sort it.
[13:37:00] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) I have made some more progress on this, but it is still fairly slow. Firstly, I have tried the vanilla downloa...
[13:39:26] Analytics, Analytics-Kanban, Event-Platform, Metrics-Platform, and 2 others: wgEventStreams (EventStreamConfig) should support per wiki overrides - https://phabricator.wikimedia.org/T277193 (DAbad) Open→Resolved Looks like this is deployed and can now be closed. Closing
[13:40:21] Analytics, Analytics-Kanban, Data-Engineering, Growth-Team, and 6 others: Migrated Server-side EventLogging events recording http.client_ip as 127.0.0.1 - https://phabricator.wikimedia.org/T288853 (DAbad) @Ottomata can we close this ticket out now? Or is there work left?
[13:42:31] Analytics, Analytics-Kanban, Data-Engineering, Growth-Team, and 6 others: Migrated Server-side EventLogging events recording http.client_ip as 127.0.0.1 - https://phabricator.wikimedia.org/T288853 (Ottomata) There is still work, I haven't deployed the config change. Sorry about that. I got caught...
[13:43:37] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) Next I am trying to attach a hive database as an UDB. Following these instructions: https://docs.alluxio.io/os...
[13:44:29] Analytics, Analytics-Kanban, Event-Platform, Metrics-Platform, and 2 others: wgEventStreams (EventStreamConfig) should support per wiki overrides - https://phabricator.wikimedia.org/T277193 (Ottomata) Resolved→Open Not quite! We've deployed the code, but we now need to restructure wgEven...
[14:03:15] mforns: milimetric someone in slack is looking for help on calculating retention metrics by geo for the hackathon, was going to ping you there but wasn't sure if that would commit you to helping! :)
[14:04:13] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) I have posted several more questions to the Slack workspace for Alluxio. I have a feeling that the error above...
[14:23:25] Analytics, Metrics-Platform, Product-Analytics: [Metrics Platform] Define stream configuration syntax relevant to v1 release - https://phabricator.wikimedia.org/T273235 (jlinehan) Closing this as we've moved on to more specific tasks and have de facto formats in the library code now.
[14:24:59] Analytics, Event-Platform, Platform Team Workboards (Clinic Duty Team): Adopt conventions for server receive and client/event timestamps in non analytics event schemas - https://phabricator.wikimedia.org/T267648 (DAbad)
[14:27:38] Analytics-EventLogging, Analytics-Radar, Metrics-Platform: Consider how to best architect transmission of events from Browser Client - https://phabricator.wikimedia.org/T240454 (jlinehan) Open→Resolved Done
[14:27:40] Analytics-EventLogging, Analytics-Radar, Epic: Review and evolve client environment around EventLogging - https://phabricator.wikimedia.org/T240462 (jlinehan)
[14:30:45] !log upgrade stat1005 to ROCm 4.2.0
[14:30:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:54:00] ottomata: o/
[15:54:06] do you have a minute for a conda question
[15:54:07] ??
[15:54:13] o. about to start a meeting in 6 minutes!
[15:54:20] start an interview*
[15:54:24] ottomata: thanks for the ping, I've already met with them
[15:54:27] ah! sorry then later on :)
[15:54:42] i got 5 mins luca!
[15:54:45] wasssuuuup
[15:54:50] oooookkkkk
[15:55:48] i mean, what is the question!??!?!
[15:55:50] ASK MEEEEEE
[15:55:53] so for the neural audio mashup we are trying to install "magenta" via pip3, a tf library that requires some extra deps from debian (libasound2-dev libjack-dev portaudio19-dev, already installed)
[15:55:57] yes yes one sec :D
[15:55:59] there it is ok i was impatient :0
[15:56:26] the main issue is that when we pip3 install magenta in the stacked env we get cpp errors related to alsa headers etc.. not found
[15:56:35] meanwhile in a plain venv all compiles fine
[15:56:51] oo
[15:56:52] hm
[15:57:00] is it possible that the stacked env uses its own runtime path to get shd libs and headers?
[15:57:01] that is a complicated question that i do not have an immediate answer to
[15:57:07] it is possible....
[15:57:15] i think there are some C libs in there
[15:57:31] ack perfect, I just wanted to know if there was something like <- <- -> -> A + B nintendo combination to execute
[15:57:49] will ping you later :)
[15:57:58] https://anaconda.org/conda-forge/alsa-lib/
[15:57:59] ?
[15:58:13] https://anaconda.org/conda-forge/jack
[15:58:22] https://anaconda.org/anaconda/portaudio
[15:58:26] could try conda installing those?
[15:59:07] ahh interesting trying thanks
[16:00:55] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) Here is the full stacktrace from the `master.log` file for this operation. ` 2021-10-06 13:59:05,016 ERROR All...
[16:00:57] or.... wait that is anaconda? portaudio might already be installed in the base anaconda-wmf
[16:01:06] elukey: ^
[16:01:43] the error is something like
[16:01:44] src/RtMidi.cpp:1101:10: fatal error: alsa/asoundlib.h: No such file or directory
[16:01:47] #include <alsa/asoundlib.h>
[16:01:49] ^~~~~~~~~~~~~~~~~~
[16:02:15] when pip installing a pkg that needs to compile cpp stuff (using system's headers and shd lib)
[16:09:51] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) This is also an interesting error. We have generated a keytab for each host that is to access HDFS, but this s...
[16:32:45] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) Hmm. Not looking good. The word on the street is that the Alluxio Catalog Service doesn't support kerberized H...
[16:34:01] Alluxio progress is not good. Whilst I've got the UFS working with Kerberized HDFS, trying to attach a Hive database as a UDB gives an error. All the signs indicate that Kerberized Hive is not supported.
[16:34:42] :( :( :(
[16:34:55] :(
[16:38:33] Analytics, Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641 (BTullis) There are still options of creating Hive tables that point to Alluxio locations, but it's still a compromise c...
[17:09:20] Analytics-Radar, Product-Analytics (Kanban): [REQUEST] Investigate decrease in New Registered Users - https://phabricator.wikimedia.org/T289799 (kzimmerman) Thanks @Tgr and @mpopov. Summary and recommendations: Irene did not find evidence that there are bugs that need to be corrected or changes to user...
[17:57:22] Analytics, Generated Data Platform, Platform Engineering Roadmap, User-Eevans: Implement per-article endpoint of the pageviews API - https://phabricator.wikimedia.org/T289265 (lbowmaker)
[18:05:28] (PS1) Milimetric: [WIP] Finding ways to deduplicate users across projects and geographies [analytics/refinery] - https://gerrit.wikimedia.org/r/726945
[18:09:18] (CR) Milimetric: "If you want to insert into milimetric.editors_daily still:" [analytics/refinery] - https://gerrit.wikimedia.org/r/726945 (owner: Milimetric)
[18:09:28] here you go mforns ^
[18:09:41] Thanks a lot milimetric :]
[18:46:08] (PS1) Hashar: Add .gitreview configuration [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/726976
[18:47:29] (CR) Gehel: [V: +2 C: +2] Add .gitreview configuration [analytics/gobblin-wmf] - https://gerrit.wikimedia.org/r/726976 (owner: Hashar)