[07:00:38] (PS13) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[07:08:58] (PS14) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[07:21:56] (PS15) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[07:24:30] wow --^
[07:24:40] elukey: we're getting close :)
[07:25:02] elukey: spark3 has a mode to work with old shuffler, so we can actually do the migration incrementally!
[07:25:17] nice!!
[07:25:31] really looking forward to seeing the shuffler metrics
[07:25:43] Currently focusing on having the existing jobs in airflow easy to migrate
[07:25:57] For the metrics we'll need to wait for the full migration though :)
[07:53:46] yep yep
[08:31:44] (PS16) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[09:32:41] Data-Engineering, Data-Catalog: Resolve 500 errors when browsing Kafka datasets - https://phabricator.wikimedia.org/T308736 (BTullis)
[09:34:55] Data-Engineering, Data-Catalog: Connect Kafka to the MVP [Mile Stone 5] - https://phabricator.wikimedia.org/T299899 (BTullis) There is an issue when attempting to browse the Kafka datasets, resulting in an error 500 being sent to the client. I have created this ticket to track the fix: {T308736} It is...
[09:54:39] Data-Engineering, Data-Catalog: Connect Kafka to the MVP [Mile Stone 5] - https://phabricator.wikimedia.org/T299899 (EChetty) p:Medium→High
[10:08:08] Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset: Help with data that's not appearing on charts - https://phabricator.wikimedia.org/T301895 (BTullis) @Mayakp.wiki - I'm not aware of any timeline for when the migration CLI will be available. It seems to be mentioned as a goal...
[11:20:23] Data-Engineering, Data-Engineering-Kanban, SRE, Traffic, Patch-For-Review: intake-analytics is responsible for up to a 85% of varnish backend fetch errors - https://phabricator.wikimedia.org/T306181 (BTullis) Thanks @AlexisJazz for that suggestion. I think it might well help in terms of findi...
[12:34:48] joal, dcausse: I have a conflicting meeting for our WDQS Analysis meeting with Aisha. Can you manage without me?
[12:35:03] gehel: sure
[12:35:07] thanks!
[12:35:09] gehel: yup, no problem
[12:56:30] (PS17) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[13:05:01] (PS18) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[15:07:34] Analytics, Data-Engineering-Radar, Event-Platform, Metrics-Platform, Browser-Support-Microsoft-Edge: Problem with delay caused by intake-analytics.wikimedia.org - https://phabricator.wikimedia.org/T295427 (BTullis) @Downsize43 are you still able to replicate this problem with intake-analytics...
[15:24:43] (PS2) Snwachukwu: Create HQL scripts to generate Wikidata's ArticlePlaceholder and Reliability metrics. [analytics/refinery] - https://gerrit.wikimedia.org/r/791394 (https://phabricator.wikimedia.org/T300021)
[15:37:03] Data-Engineering, Data-Engineering-Kanban: Fix airflow interlanguage job - https://phabricator.wikimedia.org/T308766 (JAllemandou)
[15:38:27] btullis: seen and accepted your follow
[15:38:29] Data-Engineering, Data-Engineering-Kanban: Fix api_daily job - https://phabricator.wikimedia.org/T308767 (JAllemandou)
[15:38:44] Apologies if I was slow, I never check that menu
[15:56:10] RhinosF1: Sorry, you lost me a bit. Which follow? Were you talking about that RAID ticket?
[16:04:32] Data-Engineering, Data-Engineering-Kanban, Airflow: Fix api_daily job - https://phabricator.wikimedia.org/T308767 (odimitrijevic)
[16:04:54] Data-Engineering, Data-Engineering-Kanban, Airflow: Fix airflow interlanguage job - https://phabricator.wikimedia.org/T308766 (odimitrijevic)
[16:05:13] Data-Engineering, Data-Engineering-Kanban, Airflow: Fix api_daily job - https://phabricator.wikimedia.org/T308767 (Snwachukwu) a:Snwachukwu
[16:06:46] (PS19) Joal: Update to spark-3 and scala-2.12 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/656897
[16:11:39] Analytics-Kanban, Data-Engineering, Patch-For-Review: Automate kerberos credential creation and management to ease the creation of testing infrastructure - https://phabricator.wikimedia.org/T292389 (BTullis) @Majavah - Apologies for the delay on this. I think that given the current workload we're unl...
[16:12:59] (CR) Milimetric: [C: +2] Use dedicated Phabricator bug report / feature request forms [analytics/wikistats2] - https://gerrit.wikimedia.org/r/768167 (https://phabricator.wikimedia.org/T308610) (owner: Aklapper)
[16:14:22] Analytics-Wikistats, Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (JArguello-WMF)
[16:15:34] (PS1) NOkafor: Performed the following changes; [analytics/refinery] - https://gerrit.wikimedia.org/r/793507
[16:20:30] Data-Engineering, Event-Platform: [BUG] jsonschema-tools materializes fields in yaml in a different order than in json files - https://phabricator.wikimedia.org/T308450 (JArguello-WMF) p:Triage→Low
[16:21:48] Quarry: Add an option to export result in Wikilist - https://phabricator.wikimedia.org/T137268 (rook) Is this still desired?
If so does it mean if a single column of: ` id -- 1 2 3 ` Would be presented as: ` [[1]] [[2]] [[3]] ` one entry per row, no column name?
[16:24:54] Quarry, Patch-For-Review: Handle visiting non-existent query - https://phabricator.wikimedia.org/T280915 (rook)
[16:25:23] Quarry, cloud-services-team (Kanban): Quarry returns 500 rather than 404 when asked for an invalid query ID - https://phabricator.wikimedia.org/T290874 (rook)
[16:27:04] (Abandoned) Vivian Rook: Use one_or_none to handle non-existent queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/682030 (https://phabricator.wikimedia.org/T280915) (owner: BrandonXLF)
[16:31:49] Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset: Error when updating dashboard - https://phabricator.wikimedia.org/T308441 (odimitrijevic) @Pablo Does removing the emojis resolve the issue?
[16:35:55] Data-Engineering, Data-Engineering-Kanban, Epic: Data Infrastructure as a Service MVP - https://phabricator.wikimedia.org/T308317 (JArguello-WMF)
[16:45:17] Analytics-Wikistats, Data-Engineering, NFDI: Is it possible to setup wikistats for a new wiki?
- https://phabricator.wikimedia.org/T308253 (JArguello-WMF) a:EChetty
[16:46:42] (PS2) NOkafor: Performed the following changes; [analytics/refinery] - https://gerrit.wikimedia.org/r/793507
[16:48:08] Data-Engineering, Airflow, SRE, Patch-For-Review: Create conda .deb and docker image - https://phabricator.wikimedia.org/T304450 (Ottomata)
[16:49:04] Data-Engineering, Airflow, SRE, Patch-For-Review: Create conda .deb and docker image - https://phabricator.wikimedia.org/T304450 (Ottomata) Mostly done, but to finish we are blocking on waiting for Gitlab Docker images {T304845}
[16:59:53] !log deploying airflow-dags analytics with new artifact names, first clearing artifacts cache dir - T307115
[16:59:56] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:59:56] T307115: Improvements of artifacts cache - https://phabricator.wikimedia.org/T307115
[17:02:50] Analytics-Kanban, Data-Engineering, Event-Platform, Fundraising-Backlog, and 3 others: Determine which remaining legacy EventLogging schemas need to be migrated or decommissioned - https://phabricator.wikimedia.org/T282131 (odimitrijevic)
[17:06:33] Analytics-Kanban, Data-Engineering, Data-Engineering-Kanban: Improve Refine bad data handling - https://phabricator.wikimedia.org/T289003 (JArguello-WMF) Open→Declined
[17:08:11] Data-Engineering, Data-Engineering-Kanban: Data structuring guidance request - https://phabricator.wikimedia.org/T287402 (JArguello-WMF) a:mforns→EChetty
[17:08:19] Data-Engineering: Data structuring guidance request - https://phabricator.wikimedia.org/T287402 (EChetty)
[17:11:20] Analytics-Kanban, Data-Engineering: Refactor analytics-meta MariaDB layout to use an-db100[12] - https://phabricator.wikimedia.org/T284150 (JArguello-WMF)
[17:13:08] Data-Engineering-Kanban, Airflow, GitLab (CI & Job Runners): Allow a shared, protected runner for the data-engineering group in GitLab -
https://phabricator.wikimedia.org/T295045 (BTullis) Open→Declined Declining this task as we have no time to work on it at the moment.
[17:14:13] Data-Engineering: Implement one golang AQS microservice - https://phabricator.wikimedia.org/T299729 (JArguello-WMF) a:Milimetric→None
[17:14:35] Data-Engineering: Investigate trend of gradual hive server heap exhaustion - https://phabricator.wikimedia.org/T303168 (BTullis) Removing from kanban for now.
[17:14:43] Data-Engineering, Patch-For-Review: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (BTullis) Taking this out of kanban for now, although we still want to do it.
[17:15:06] Data-Engineering, Cassandra, Platform Team Workboards (Platform Engineering Reliability): Stop ingesting data to the old AQS cluster - https://phabricator.wikimedia.org/T302276 (JArguello-WMF)
[17:16:02] Data-Engineering, Data-Engineering-Kanban, Airflow, Patch-For-Review: Set up backups and monitoring of airflow instances - https://phabricator.wikimedia.org/T307102 (BTullis) Open→Resolved
[17:20:28] Data-Engineering, Data-Engineering-Kanban: Draft initial data storage platform and place budget hold for Q2 - https://phabricator.wikimedia.org/T308318 (BTullis) I'm moving this to Ops on the board, because it didn't come from another team. It is something that we have decided to do ourselves and it's ab...
[17:30:59] Data-Engineering, Data-Engineering-Kanban: Fix turnilo after upgrade - https://phabricator.wikimedia.org/T308778 (BTullis)
[17:31:20] (PS1) Snwachukwu: Fix api hql file and Projectview hql file. [analytics/refinery] - https://gerrit.wikimedia.org/r/793523 (https://phabricator.wikimedia.org/T308767)
[17:33:08] (PS2) Snwachukwu: Fix api hql file and Projectview hql file.
[analytics/refinery] - https://gerrit.wikimedia.org/r/793523 (https://phabricator.wikimedia.org/T308767)
[17:34:49] (CR) Snwachukwu: "Fix api_metrics.hql location and projectview hourly and geo hql script" [analytics/refinery] - https://gerrit.wikimedia.org/r/793523 (https://phabricator.wikimedia.org/T308767) (owner: Snwachukwu)
[17:37:08] Data-Engineering, Data-Engineering-Kanban: Fix turnilo after upgrade - https://phabricator.wikimedia.org/T308778 (BTullis) p:Triage→High I have reported this to Turnilo's Slack channel here: https://turnilo.slack.com/archives/CEQMX06NB/p1652789689569899 {F35155435,width=60%} The author is aware...
[17:37:38] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) a:razzi→BTullis
[17:45:53] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) I've created a child ticket to track the fix for turnilo. {T308778} I'm hopeful that the upstream author will be able to find the issue and suggest a workaround or provid...
[17:47:40] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) Moving to paused while the fix is being investigated. One of the options is to downgrade again, otherwise I would resolve this ticket now.
[17:48:39] Data-Engineering-Kanban, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Reimage WMCS db proxies to Bullseye - https://phabricator.wikimedia.org/T298940 (BTullis) Claiming this ticket as Razzi is leaving the Foundation.
[18:11:27] Quarry: Vizquery for quarry - https://phabricator.wikimedia.org/T288841 (rook)
[18:11:47] Quarry: Provide a more intuitive way to design DB queries, as Quarry is not ideal for complex ones...
- https://phabricator.wikimedia.org/T208839 (rook)
[18:11:50] Quarry: Vizquery for quarry - https://phabricator.wikimedia.org/T288841 (rook) Please reopen if not a duplicate.
[18:17:49] Quarry, Documentation: Landing page for Quarry - https://phabricator.wikimedia.org/T308783 (rook)
[21:29:49] Data-Engineering, Event-Platform, Generated Data Platform, Patch-For-Review: [Shared Event Platform] Ability to use Event Platform streams in Flink without boilerplate - https://phabricator.wikimedia.org/T308356 (Ottomata) Some more thoughts. I spent some time looking into a [[ https://nightlies...
[21:51:11] Data-Engineering, Event-Platform, Generated Data Platform, Patch-For-Review: [Shared Event Platform] Ability to use Event Platform streams in Flink without boilerplate - https://phabricator.wikimedia.org/T308356 (Ottomata) I think what I want would be something similar to how I would expect Confl...
[22:22:13] (VarnishkafkaNoMessages) firing: ...
[22:22:13] varnishkafka for instance cp3060:9132 is not logging cache_text requests from eventlogging - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka - https://grafana.wikimedia.org/d/000000253/varnishkafka?orgId=1&var-datasource=esams%20prometheus/ops&var-source=eventlogging&var-cp_cluster=cache_text&var-instance=cp3060:9132&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DVarnishkafkaNoMessages