[00:57:08] (CR) David Martin: "Looks good! I've noted one identifier that should be spelled differently, and some concerns about the amount of white space that's here" [analytics/refinery] - https://gerrit.wikimedia.org/r/1043205 (https://phabricator.wikimedia.org/T363436) (owner: Ecarg)
[05:40:04] (PS3) Ecarg: Create HQL for new join table [analytics/refinery] - https://gerrit.wikimedia.org/r/1043205 (https://phabricator.wikimedia.org/T363436)
[05:40:49] (CR) Ecarg: Create HQL for new join table (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1043205 (https://phabricator.wikimedia.org/T363436) (owner: Ecarg)
[05:42:08] (CR) Ecarg: Create HQL for new join table (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1043205 (https://phabricator.wikimedia.org/T363436) (owner: Ecarg)
[06:33:42] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9891513 (SD0001) One such case: https://quarry.wmcloud.org/query/83084 The query actually completed on the db in 140 seconds but there was a Trove connection reset while updating the status to "success". ` sqlalchemy.e...
[07:11:23] Quarry, Patch-Needs-Improvement: EXPLAIN is broken because new analytics wiki replica cluster contains multiple servers - https://phabricator.wikimedia.org/T205214#9891551 (SD0001)
[07:11:53] Quarry: quarry explain not working since move to multiple databases - https://phabricator.wikimedia.org/T288170#9891549 (SD0001) →Duplicate dup:T205214
[07:15:07] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9891554 (SD0001) p:Triage→High
[07:24:39] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9891576 (github-toolforge-bot) siddharthvp opened https://github.com/toolforge/quarry/pull/51
[07:33:05] Quarry, Cloud-VPS: [bug] Lot of queries stuck in queued state for hours and days - https://phabricator.wikimedia.org/T365136#9891580 (SD0001) The actual problem should be sorted with the fix for T367464.
[07:40:44] Quarry: query runs forever - https://phabricator.wikimedia.org/T366909#9891592 (SD0001) Or more likely because of a connection reset from Trove (T367464).
[07:40:51] Quarry: query runs forever - https://phabricator.wikimedia.org/T366909#9891597 (SD0001) →Duplicate dup:T367464
[07:41:07] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9891599 (SD0001)
[09:10:40] Quarry, cloud-services-team (FY2023/2024-Q3-Q4): [bug] Access denied for user 'quarry'@'172.16.2.72' (using password: NO) - https://phabricator.wikimedia.org/T365374#9891772 (fnegri) @Liz are the queries that never finish different from other queries, or are they similar but sometimes they randomly f...
[09:44:40] Data-Engineering, Data Products, MediaWiki-extensions-CentralAuth, Account-Vanishing, and 2 others: Apply schema change to add type column on GlobalRenameQueue table to the live databases - https://phabricator.wikimedia.org/T367495 (Seddon) NEW
[09:49:34] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9891856 (github-toolforge-bot) siddharthvp closed https://github.com/toolforge/quarry/pull/51
[10:12:55] Quarry: refreshing a running query changes favicon from orange to blue - https://phabricator.wikimedia.org/T362101#9891899 (github-toolforge-bot) siddharthvp opened https://github.com/toolforge/quarry/pull/52
[10:26:30] Data-Engineering, Data Products, DBA, MediaWiki-extensions-CentralAuth, and 3 others: Apply schema change to add type column on GlobalRenameQueue table to the live databases - https://phabricator.wikimedia.org/T367495#9891924 (Ladsgroup)
[10:50:07] Quarry, Patch-Needs-Improvement: EXPLAIN is broken because new analytics wiki replica cluster contains multiple servers - https://phabricator.wikimedia.org/T205214#9891953 (github-toolforge-bot) siddharthvp opened https://github.com/toolforge/quarry/pull/53
[11:10:42] Quarry: refreshing a running query changes favicon from orange to blue - https://phabricator.wikimedia.org/T362101#9891976 (github-toolforge-bot) siddharthvp closed https://github.com/toolforge/quarry/pull/52
[13:03:16] Quarry, Cloud-VPS: [bug] Lot of queries stuck in queued state for hours and days - https://phabricator.wikimedia.org/T365136#9892340 (SD0001) Open→Resolved a:SD0001 The fix is live. Please reopen if this occurs again.
[13:04:04] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9892343 (SD0001) Open→Resolved a:SD0001 The fix is live. Please reopen if the problem arises again.
[13:24:29] Data-Engineering (Q4 2024 April 1st - June 30th), Event-Platform, GitLab (Pipeline Services Migration🐤), Patch-For-Review: Migrate Data Engineering Pipelinelib repos to GitLab - https://phabricator.wikimedia.org/T344730#9892436 (Snwachukwu) The repos have been archived: Here are the steps i took...
[13:38:11] hi folks!
[13:38:32] archiva seems to have /var/lib/archiva almost filled up
[13:38:35] is it known?
[13:38:42] Cc: btullis, brouberol, stevemunene --^
[13:39:35] I discussed it on #wikimedia-data-platform-alerts
[13:40:02] elukey: Thanks, I spotted the alert too, but I'm afk at the moment.
[13:40:20] ah wow TIL #wikimedia-data-platform-alerts
[13:40:31] lemme know if you need a hand!
[13:40:31] no problem. We don't have an easy way of fixing this atm, without resizing the disk, which would bring the host down for a while
[13:41:10] (PS1) Gehel: test: Add unit tests for Refinery*DatabaseResponse [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1043781 (https://phabricator.wikimedia.org/T365197)
[13:42:17] Also, this is happening very soon. https://phabricator.wikimedia.org/T367407
[13:42:47] joal: easy review for you ^ that inline comment was bothering me, I replaced it with a unit test
[13:43:38] Retiring archiva and replacing it with a read-only archive of historical artifacts, served by a simple web server (hopefully simple).
[13:44:44] T367407 is unlikely to be done before multiple weeks, we need to migrate all write use cases away from archiva first
[13:44:45] T367407: Retire Archiva, including keeping a read only copy of all previously published artifacts - https://phabricator.wikimedia.org/T367407
[13:45:54] Yeah, sorry. 'very soon' was a bit strong there.
[13:48:20] makes sense yes!
[14:31:34] Quarry: Add line numbers in SQL input textarea - https://phabricator.wikimedia.org/T315066#9892702 (SD0001) Open→In progress
[14:41:19] Quarry, Patch-Needs-Improvement: EXPLAIN is broken because new analytics wiki replica cluster contains multiple servers - https://phabricator.wikimedia.org/T205214#9892773 (github-toolforge-bot) siddharthvp closed https://github.com/toolforge/quarry/pull/53
[14:44:04] Quarry: Add line numbers in SQL input textarea - https://phabricator.wikimedia.org/T315066#9892781 (github-toolforge-bot) siddharthvp opened https://github.com/toolforge/quarry/pull/54
[14:48:23] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9892912 (Liz) I don't see that there has been any improvement. My queries are still not finishing their run.
[14:50:41] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9893183 (Liz) Resolved→Open
[14:51:09] Data-Engineering (Q4 2024 April 1st - June 30th), Event-Platform, GitLab (Pipeline Services Migration🐤), Patch-For-Review: Migrate Data Engineering Pipelinelib repos to GitLab - https://phabricator.wikimedia.org/T344730#9893198 (Snwachukwu) The following documents have been edited: # [[ https...
[15:17:49] Data-Engineering, Documentation: Create user-focused Spark SQL documentation - https://phabricator.wikimedia.org/T329550#9893423 (TBurmeister)
[15:20:13] Data-Engineering, Documentation: Create user-focused Spark SQL documentation - https://phabricator.wikimedia.org/T329550#9893432 (TBurmeister) a:TBurmeister→None I added a new section with a couple links at https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark#Use_PySpark_to_run_SQL...
[15:28:36] Data-Engineering, Documentation: Create user-focused Spark SQL documentation - https://phabricator.wikimedia.org/T329550#9893457 (TBurmeister)
[15:28:37] Data-Engineering, Tech-Docs-Team, Goal: Redesign Data Platform docs on Wikitech - https://phabricator.wikimedia.org/T350911#9893458 (TBurmeister)
[15:40:20] Quarry, Cloud-VPS: [bug] Lot of queries stuck in queued state for hours and days - https://phabricator.wikimedia.org/T365136#9893481 (Teslaton) @SD0001: would it be possible to batch fail queries affected by this problem in latest months - i.e. to mark them as failed instead of current "queued" state...
[15:42:00] Quarry, Cloud-VPS: [bug] Lot of queries stuck in queued state for hours and days - https://phabricator.wikimedia.org/T365136#9893490 (Novem_Linguae) Usually if I press stop and start a couple times on my own queries, they will restart. They are also forkable.
[16:10:56] (CR) David Martin: Create HQL for new join table (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1043205 (https://phabricator.wikimedia.org/T363436) (owner: Ecarg)
[18:32:49] Quarry: Deduplicate config load - https://phabricator.wikimedia.org/T349135#9894202 (github-toolforge-bot) siddharthvp opened https://github.com/toolforge/quarry/pull/55
[19:34:13] Quarry: Show replication lag - https://phabricator.wikimedia.org/T60841#9894368 (SD0001) Open→Resolved a:Framawiki https://github.com/toolforge/quarry/pull/22/ was merged last year.
[20:12:07] Data-Engineering, Cassandra: Create keyspace and table for Knowledge Gaps - https://phabricator.wikimedia.org/T340494#9894477 (Eevans) Open→Resolved
[21:22:35] Data-Engineering-Radar, Cassandra: Make Cassandra client encryption non-optional (AQS cluster) - https://phabricator.wikimedia.org/T309229#9894669 (Eevans)
[21:29:29] Data-Engineering-Radar, Cassandra: Enforce Cassandra client encryption (AQS cluster) - https://phabricator.wikimedia.org/T309229#9894683 (Eevans)
[21:33:48] Data-Engineering-Radar, Cassandra: Enforce Cassandra client encryption (AQS cluster) - https://phabricator.wikimedia.org/T309229#9894691 (Eevans)
[21:42:38] Data-Engineering-Radar, Cassandra: Enforce Cassandra client encryption (AQS cluster) - https://phabricator.wikimedia.org/T309229#9894701 (Eevans) The good news is that with the AQS v2 services online, we're nearly there. `sh-session eevans@aqs1010:~$ nodetool-a clientstats --all Address SS...
[21:43:15] Data-Engineering-Radar, Cassandra: Enforce Cassandra client encryption (AQS cluster) - https://phabricator.wikimedia.org/T309229#9894702 (Eevans)
[21:47:35] Data-Engineering-Radar, Cassandra: Enforce Cassandra client encryption (AQS cluster) - https://phabricator.wikimedia.org/T309229#9894723 (Eevans) >>! In T309229#9894701, @Eevans wrote: > The good news is that with the AQS v2 services online, we're nearly there. > > `lang=sh-session > eevans@aqs1010:~$ n...
[21:51:23] Data-Engineering, Cassandra: Audit and update AQS Cassandra roles & grants - https://phabricator.wikimedia.org/T313877#9894729 (Eevans) I think the `aqs` role was used exclusively by AQS v1 and can be removed. Can anyone in #data-engineering confirm? @JAllemandou perhaps?
[21:58:19] Data-Engineering, Cassandra, Structured Data Engineering, Structured-Data-Backlog: image suggestions DAG should not use aqsloader Cassandra role - https://phabricator.wikimedia.org/T356446#9894751 (Eevans)
[23:20:46] Data-Engineering-Radar, Cassandra: Enforce Cassandra client encryption (AQS cluster) - https://phabricator.wikimedia.org/T309229#9895015 (tchin) I don't think so. The image suggestion work on Flink never progressed passed the original ticket.