[07:12:47] (CR) Joal: Updated the get Cassandra password function to; (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/790651 (https://phabricator.wikimedia.org/T306895) (owner: NOkafor)
[08:01:22] good morning, does anyone know of a command line tool to validate a JSON instance against a JSON schema? I have tried https://pypi.org/project/jsonschema/ but it does not handle recursive references in the schema (goes into an infinite recursion :D )
[08:02:14] Hi hashar - I don't (
[08:03:05] :]
[08:03:41] meanwhile I found a nice library to generate a JSON schema from a Java class https://github.com/victools/jsonschema-generator/
[08:03:57] it almost works out of the box: ship it classes, get schemas!
[08:04:04] Nice :)
[08:04:39] I spare you the details of me discovering what a `generic` is or being puzzled at the use case for `java.util.function.Supplier`
[08:05:24] yeah
[08:05:31] ah http://json-schema.org/implementations.html#validators
[08:49:16] PROBLEM - turnilo on an-tool1007 is CRITICAL: connect to address 10.64.36.118 and port 9091: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo-Pivot
[08:50:06] ACKNOWLEDGEMENT - turnilo on an-tool1007 is CRITICAL: connect to address 10.64.36.118 and port 9091: Connection refused Btullis Working on the upgrade in T301990 https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo-Pivot
[08:54:24] !log booted an-tool1007 from network to begin buster upgrade
[08:54:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:03:59] Data-Engineering, Data-Engineering-Kanban, Data-Catalog: Setup and check backups for datahub - https://phabricator.wikimedia.org/T308113 (BTullis) The weekly dumps of the datahub database are now present on dbprov1002 as expected. ` root@dbprov1002:/srv/backups/dumps/latest/dump.analytics_meta.2022-0...
[09:06:56] RECOVERY - turnilo on an-tool1007 is OK: TCP OK - 0.001 second response time on 10.64.36.118 port 9091 https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo-Pivot
[09:08:06] Data-Engineering, Data-Engineering-Kanban, Airflow, Patch-For-Review: Set up backups and monitoring of airflow instances - https://phabricator.wikimedia.org/T307102 (BTullis) The backups of all of the airflow databases are on dbprov1002 as expected. ` root@dbprov1002:/srv/backups/dumps/latest/...
[09:09:46] !completed an-tool1007 reinstall as buster and initial puppet runs
[09:32:39] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) I completed the reinstall of an-tool1007 to buster with the following on ganeti1024.eqiad.wmnet ` sudo gnt-instance shutdown an-tool1007.eqiad.wmnet sudo gnt-instance mod...
[09:54:24] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) Every minute we see an entry like this in the logs: ` May 17 09:45:46 an-tool1007 turnilo[5457]: Scanning cluster 'druid-analytics-eqiad' for new sources May 17 09:45:46...
[10:20:41] Hi btullis - I'm assuming you're working on the turnilo fix?
[10:22:08] Turnilo is now up but we're missing datasets
[10:22:20] I assume it's due to a configuration change
[10:22:45] Or to be more precise, to a change in expected configuration
[10:23:02] making our current config break the thing and make the datasets not show up
[10:37:47] joal: Yes, precisely that. We have some automatically detected data cubes, which don't appear in the configuration file, but everything that is defined isn't showing up.
[10:38:11] yeah - I assume if we check logs we'll see config issues
[10:38:41] Nothing obvious yet. Latest observations here: https://phabricator.wikimedia.org/T301990#7933889
[10:39:25] thanks btullis - let me know if you need help
[10:39:43] Current guess is that it's something to do with a new security model for restricting access to cubes: https://github.com/allegro/turnilo/blob/master/docs/security.md#data-cubes-level-access
[11:06:05] joal: does the list look correct now? I have added a `sourceListScan: disable` into the config file.
[11:09:35] Actually now we're missing the cubes that have not been configured :S
[11:09:56] It feels as if we can't have both the scanned and configured cubes now
[11:10:00] btullis: --^
[11:11:39] joal: I'm sure that there's probably a way. There were messages like this in the log file after I enabled verbose mode: `Cluster 'druid-analytics-eqiad' already has an external for 'edits_hourly' ('edits_hourly')`
[11:12:41] ..even though `edits_hourly` wasn't showing up in the previous version.
[12:03:20] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) There seems to be a clear problem with our current configuration and version 1.35. The symptoms are as follows: * If I load the config as-is, then only the automatically...
[12:09:51] (CR) Vivian Rook: [C: +2] query.py: Make quarry history descending [analytics/quarry/web] - https://gerrit.wikimedia.org/r/792277 (https://phabricator.wikimedia.org/T306340) (owner: Jiyu)
[12:14:09] (Merged) jenkins-bot: query.py: Make quarry history descending [analytics/quarry/web] - https://gerrit.wikimedia.org/r/792277 (https://phabricator.wikimedia.org/T306340) (owner: Jiyu)
[12:15:41] Quarry, Patch-For-Review, good first task: Make quarry history descending - https://phabricator.wikimedia.org/T306340 (rook) Open→Resolved
[12:16:37] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) I've posted a message to the Turnilo Slack channel, to see if they can help at all. {F35148243,width=70%}
[12:18:08] (CR) Vivian Rook: [C: +2] Return 404 on query ids that do not exist [analytics/quarry/web] - https://gerrit.wikimedia.org/r/791606 (https://phabricator.wikimedia.org/T290874) (owner: Vivian Rook)
[12:22:13] (Merged) jenkins-bot: Return 404 on query ids that do not exist [analytics/quarry/web] - https://gerrit.wikimedia.org/r/791606 (https://phabricator.wikimedia.org/T290874) (owner: Vivian Rook)
[12:24:22] Quarry, Patch-For-Review, cloud-services-team (Kanban): Quarry returns 500 rather than 404 when asked for an invalid query ID - https://phabricator.wikimedia.org/T290874 (rook) Open→Resolved
[12:40:34] Quarry: Quarry, unable to run tests following the README.md - https://phabricator.wikimedia.org/T308493 (rook) @Aklapper yes I think it is a bug in our docs for quarry. Things, including testing things, were updated in 2021-09 I believe that is when blubber was introduced to quarry. I'm not sure if the line...
[12:42:05] Analytics, API Platform (Product Roadmap), Code-Health-Objective, Epic, and 3 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (DAbad)
[13:09:35] Data-Engineering, Equity-Landscape: New HDFS user "gdi" for Equity Landscape - https://phabricator.wikimedia.org/T308453 (ntsako) Created database on hive.
[13:09:50] Data-Engineering, Equity-Landscape: New HDFS user "gdi" for Equity Landscape - https://phabricator.wikimedia.org/T308453 (ntsako) Open→Resolved
[13:10:57] Analytics, API Platform (Product Roadmap), Code-Health-Objective, Epic, and 3 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (DAbad)
[13:11:04] Data-Engineering, API Platform, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement the unique devices endpoints - https://phabricator.wikimedia.org/T288298 (DAbad) Open→In progress
[13:11:48] Data-Engineering, API Platform, Platform Engineering Roadmap, User-Eevans: AQS 2.0: Implement pageviews endpoints - https://phabricator.wikimedia.org/T288296 (DAbad)
[13:11:54] Analytics, API Platform, Generated Data Platform, Platform Engineering Roadmap, User-Eevans: Implement per-article endpoint of the pageviews API - https://phabricator.wikimedia.org/T289265 (DAbad) Open→In progress
[14:24:32] Data-Engineering, Data-Engineering-Kanban, Airflow, Patch-For-Review: Set up backups and monitoring of airflow instances - https://phabricator.wikimedia.org/T307102 (BTullis) @jcrespo additionally provided confirmation that these airflow backups are in the Bacula database in P27847.
[14:25:13] Data-Engineering, Data-Engineering-Kanban, Data-Catalog: Setup and check backups for datahub - https://phabricator.wikimedia.org/T308113 (BTullis) @jcrespo additionally provided confirmation that the datahub database backup is correctly referenced in the Bacula database: See P27847 for details.
[15:05:50] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) I'm receiving substantial help and a great response from [[https://github.com/adrianmroz-allegro|Adrian Mróź]] who is a key contributor to Turnilo, via [[https://turnilo....
[15:19:06] Data-Engineering, Data-Engineering-Kanban, Patch-For-Review: Upgrade Turnilo - https://phabricator.wikimedia.org/T301990 (BTullis) p:Triage→Medium
[15:19:37] Data-Engineering-Kanban, Data-Catalog: User Experience: Authentication - https://phabricator.wikimedia.org/T307711 (BTullis)
[15:26:08] Data-Engineering, Event-Platform, Generated Data Platform: [Shared Event Platform] Ability to use Event Platform streams in Flink without boilerplate - https://phabricator.wikimedia.org/T308356 (Ottomata) Writing up findings here. There are so many levels of typing in Flink it can be pretty hard to...
[15:29:54] Data-Engineering, Event-Platform, Generated Data Platform: [Shared Event Platform] Ability to use Event Platform streams in Flink without boilerplate - https://phabricator.wikimedia.org/T308356 (Ottomata) It would be easy and possible to adapt [[ https://github.com/apache/flink/blob/master/flink-form...
[15:34:51] Quarry: Quarry, unable to run tests following the README.md - https://phabricator.wikimedia.org/T308493 (bd808) >>! In T308493#7934300, @rook wrote: > @bd808 does that seem right to you? That is basically what one would do to match how the pipelinelib tests work in Jenkins. My note about the error sounding...
[16:27:02] (PS1) Btullis: Make a small change to trigger the build pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/792643 (https://phabricator.wikimedia.org/T308052)
[16:58:12] Quarry: Quarry, unable to run tests following the README.md - https://phabricator.wikimedia.org/T308493 (rook) >>! In T308493#7935097, @bd808 wrote: > That is basically what one would do to match how the pipelinelib tests work in Jenkins. My note about the error sounding similar to T295318 was based on that...
[17:00:43] (CR) Btullis: [C: +2] Make a small change to trigger the build pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/792643 (https://phabricator.wikimedia.org/T308052) (owner: Btullis)
[17:06:27] Data-Engineering, Data-Engineering-Kanban, Product-Analytics, Superset: Error when updating dashboard - https://phabricator.wikimedia.org/T308441 (mpopov)
[17:15:47] (Merged) jenkins-bot: Make a small change to trigger the build pipeline [analytics/datahub] (wmf) - https://gerrit.wikimedia.org/r/792643 (https://phabricator.wikimedia.org/T308052) (owner: Btullis)
[17:44:58] (Abandoned) Razzi: Upgrade to superset 1.35.0 [analytics/turnilo/deploy] - https://gerrit.wikimedia.org/r/791461 (https://phabricator.wikimedia.org/T301990) (owner: Razzi)
[18:08:32] Data-Engineering, Data-Engineering-Kanban: Split turnilo staging off of an-tool1005 - https://phabricator.wikimedia.org/T308597 (razzi)
[18:09:30] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q4:(Need By: TBD) rack/setup/install an-presto10[06-15].eqiad.wmnet - https://phabricator.wikimedia.org/T306835 (Jclark-ctr)
[20:38:17] a-team: can someone advise on https://phabricator.wikimedia.org/T308294#7936287 ? A user is having an issue with new presto access
[20:39:01] I'm thinking from the error that might need privatedata-users too?
[20:51:09] milimetric: thanks for the confirmation
[20:51:23] I've told them to fill out a new request
[21:02:19] Analytics-Wikistats, Data-Engineering: Use dedicated Phabricator bug report / feature request forms - https://phabricator.wikimedia.org/T308610 (Aklapper)
[21:02:32] (PS2) Aklapper: Use dedicated Phabricator bug report / feature request forms [analytics/wikistats2] - https://gerrit.wikimedia.org/r/768167 (https://phabricator.wikimedia.org/T308610)
[21:10:28] np, RhinosF1, thanks for the ping
[21:11:50] np too :)
[21:21:54] Data-Engineering, Event-Platform, Generated Data Platform, Patch-For-Review: [Shared Event Platform] Ability to use Event Platform streams in Flink without boilerplate - https://phabricator.wikimedia.org/T308356 (Ottomata) > I believe it is possible to start with the Table API and then immediate...
[21:32:20] !log sudo systemctl reset-failed ifup@ens13.service on an-tool1007
[21:32:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log