[06:31:48] (03CR) 10Awight: [C: 03+1] Bump to jsonschema-tools 0.11.0 to get consistent json and yaml serialization ordering [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809019 (https://phabricator.wikimedia.org/T308450) (owner: 10Ottomata) [06:32:02] (03CR) 10Awight: [C: 03+1] "Library change looks good!" [schemas/event/primary] - 10https://gerrit.wikimedia.org/r/809021 (https://phabricator.wikimedia.org/T308450) (owner: 10Ottomata) [06:56:48] 10Analytics, 10API Platform: Establish testing procedure for Druid-based endpoints - https://phabricator.wikimedia.org/T311190 (10JAllemandou) Hi @BPirkle - I'll gladly spend some time with you (and anyone interested) to explain more about Druid if needed :) [07:09:12] (03CR) 10Joal: [C: 03+1] "One nit, otherwise good to go!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/808888 (https://phabricator.wikimedia.org/T309718) (owner: 10NOkafor) [07:10:57] (03CR) 10Joal: [C: 03+1] "Could be extended :)" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/808905 (owner: 10Ottomata) [08:57:16] (03CR) 10Gmodena: WIP - Add new mediawiki entity fragments, and use them in new mediawiki page change schema (031 comment) [schemas/event/primary] - 10https://gerrit.wikimedia.org/r/807565 (https://phabricator.wikimedia.org/T308017) (owner: 10Ottomata) [10:11:51] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host stat1010.eqiad.wmnet with OS bullseye [11:08:07] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1001 for host stat1010.eqiad.wmnet with OS bullseye executed with errors: - stat1010 (**FAIL**)... 
[11:31:41] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10BTullis) It looks like this might be an instance of the bug identified in {T304483} I'm downgrading the NIC firmware to the previous version and then I will run the cookbook agai... [11:36:22] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10fgiunchedi) >>! In T307399#8027872, @Cmjohnson wrote: > @btullis @robh was working on this last wee. /dev/sda and /dev/sdb are swapped by the controller regardless of how they we... [11:45:17] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host stat1010.eqiad.wmnet with OS bullseye [11:54:59] (03CR) 10Ottomata: Suppress useless GeocodeDatabaseReader log warn messages about 127.0.0.1 not found in database (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/808905 (owner: 10Ottomata) [11:56:29] (03CR) 10Ottomata: mediawiki/client/metrics_event: Add mediawiki.db_name property (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/777844 (https://phabricator.wikimedia.org/T304689) (owner: 10Phuedx) [11:57:38] (03CR) 10Ottomata: WIP - Add new mediawiki entity fragments, and use them in new mediawiki page change schema (031 comment) [schemas/event/primary] - 10https://gerrit.wikimedia.org/r/807565 (https://phabricator.wikimedia.org/T308017) (owner: 10Ottomata) [12:15:20] (03CR) 10Ottomata: [C: 03+2] Bump to jsonschema-tools 0.11.0 to get consistent json and yaml serialization ordering [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809019 (https://phabricator.wikimedia.org/T308450) (owner: 10Ottomata) [12:15:26] (03CR) 10Ottomata: [C: 03+2] Bump to jsonschema-tools 0.11.0 to get 
consistent json and yaml serialization ordering [schemas/event/primary] - 10https://gerrit.wikimedia.org/r/809021 (https://phabricator.wikimedia.org/T308450) (owner: 10Ottomata) [12:40:28] (03PS1) 10Kosta Harlan: image-suggestions-feedback: Make dt field non-required, adjust docs [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) [12:41:04] (03PS2) 10Kosta Harlan: image-suggestions-feedback: Make dt field non-required, drop wiki, adjust docs [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) [12:41:07] (03CR) 10CI reject: [V: 04-1] image-suggestions-feedback: Make dt field non-required, drop wiki, adjust docs [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [12:41:36] (03CR) 10CI reject: [V: 04-1] image-suggestions-feedback: Make dt field non-required, drop wiki, adjust docs [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [12:44:27] (03PS3) 10Kosta Harlan: image-suggestions-feedback: Drop duplicative wiki field, adjust docs [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) [12:48:48] !log deploying airflow-dags/analytics to work on the metadata ingestion jobs [12:48:49] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:49:03] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10BTullis) Hi @fgiunchedi - Thanks, I agree that it would be a pain to have to deviate from using `/dev/sda` for the primary OS drive. At the moment I have copied the only previous... 
[12:51:10] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10BTullis) [12:57:33] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1001 for host stat1010.eqiad.wmnet with OS bullseye executed with errors: - stat1010 (**FAIL**)... [12:58:04] 10Data-Engineering, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1001 for host stat1010.eqiad.wmnet with OS bullseye [13:03:37] (03CR) 10Ottomata: "Yar, this is why I was advising not to merge 1.0.0 before we were ready to actually use it." [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [13:09:54] 10Analytics, 10API Platform, 10Generated Data Platform, 10Platform Engineering Roadmap, 10User-Eevans: Implement per-article endpoint of the pageviews API - https://phabricator.wikimedia.org/T289265 (10BPirkle) a:05FGoodwin→03codebug [13:10:41] 10Data-Engineering, 10API Platform, 10Platform Engineering Roadmap, 10User-Eevans: Implement top endpoint of the pageviews API - https://phabricator.wikimedia.org/T299732 (10BPirkle) a:05FGoodwin→03codebug [13:11:26] 10Data-Engineering, 10API Platform, 10Platform Engineering Roadmap, 10User-Eevans: Implement top-per-country endpoint of the pageviews API - https://phabricator.wikimedia.org/T299734 (10BPirkle) a:05FGoodwin→03codebug [13:12:11] 10Data-Engineering, 10API Platform, 10Platform Engineering Roadmap, 10User-Eevans: Implement top-by-country endpoint of the pageviews API - https://phabricator.wikimedia.org/T299733 (10BPirkle) a:05FGoodwin→03codebug [13:12:33] 10Data-Engineering, 10API Platform, 10Code-Health-Objective, 
10Epic, and 3 others: Implement aggregate endpoint of the pageviews API - https://phabricator.wikimedia.org/T299731 (10BPirkle) a:05FGoodwin→03codebug [13:13:19] 10Data-Engineering, 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10BTullis) [13:15:20] 10Data-Engineering, 10API Platform, 10Platform Engineering Roadmap, 10User-Eevans: AQS 2.0: Implement mediarequests endpoints - https://phabricator.wikimedia.org/T288303 (10DAbad) 05Open→03In progress a:03FGoodwin [13:15:25] 10Analytics, 10API Platform (Product Roadmap), 10Code-Health-Objective, 10Epic, and 3 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (10DAbad) [13:20:07] 10Data-Engineering, 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10BTullis) a:05Cmjohnson→03BTullis I'm just claiming this ticket and putting it on our team's workboard to reflect the fact that I'm working on it... [13:20:51] 10Data-Engineering, 10API Platform, 10Platform Engineering Roadmap, 10User-Eevans: AQS 2.0: Implement geoeditors endpoints - https://phabricator.wikimedia.org/T288305 (10DAbad) 05Open→03In progress a:03FGoodwin [13:20:54] 10Analytics, 10API Platform (Product Roadmap), 10Code-Health-Objective, 10Epic, and 3 others: AQS 2.0 - https://phabricator.wikimedia.org/T263489 (10DAbad) [13:32:59] Arf I didn't notice I got disconnected :S [13:33:12] ottomata: heya - would you have a minute to sync on progress? [14:46:05] ottomata, further to your response on the analytics mailing list on 21st June [14:46:13] Do you know if EventStreamConfig can be tested on test2wiki (so I can check the API is working)? [14:46:42] I also notice that EventBus has some API tests already. Can those be run against test2wiki? 
[14:50:32] 10Data-Engineering, 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10BTullis) Well the partitioning recipe didn't do what we wanted to anyway. * `/dev/sda` is the big RAID10 drive (as we suspected) * `/dev/sdb` is the... [14:52:31] 10Data-Engineering, 10Data-Engineering-Kanban: Gobblin dataloss during namenode failure - https://phabricator.wikimedia.org/T311263 (10Ottomata) List of datanodes that were excluded during attempt_1655808530211_10727_m_000000_0: ` grep 'Excluding datanode' application_1655808530211_10727_failed_gobblin.T31... [14:59:35] 10Data-Engineering, 10Data-Engineering-Kanban: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10Ottomata) [15:00:37] 10Data-Engineering, 10Data-Engineering-Kanban: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10Ottomata) FYI: We will not be doing {T266640} as part of this upgrade. [15:30:18] realized we do have some docs on search platform airflow, although mostly around how deployments and CI work rather than what to actually do in airflow: https://wikitech.wikimedia.org/wiki/Discovery/Analytics [15:33:42] 10Data-Engineering, 10Data-Engineering-Kanban, 10DC-Ops, 10SRE, 10ops-eqiad: Q4: rack/setup/install stat1010 - https://phabricator.wikimedia.org/T307399 (10fgiunchedi) Thank you for the investigation @BTullis ! In case you haven't come across it yet: the `partman/custom/kafka-jumbo.cfg` configuration wou... [15:49:42] (03CR) 10Kosta Harlan: image-suggestions-feedback: Drop duplicative wiki field, adjust docs (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [15:54:44] (03CR) 10Ottomata: "Naw it's okay. in this case let's just keep the 2.0.0 version and we'll drop the hive table and do a manual migration." 
[schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [16:04:53] 10Data-Engineering, 10Data-Engineering-Kanban, 10MediaWiki-extensions-EventLogging, 10Patch-For-Review: Generate $wgEventLoggingSchemas from $wgEventStreams - https://phabricator.wikimedia.org/T303602 (10Ottomata) a:03phuedx [16:12:44] 10Data-Engineering, 10Data-Engineering-Kanban: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10JAllemandou) Open question: Presto latest is 0.273.3 - Would we bump more than the minimal one for Iceberg? [16:12:52] (03CR) 10TChin: "Would this also require an update to the Cassandra table so that origin_wiki and rejection_reason become set or something?" [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [16:14:00] 10Data-Engineering, 10Data-Engineering-Kanban: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10Ottomata) y not? [16:15:09] 10Data-Engineering, 10Data-Engineering-Kanban, 10Product-Analytics, 10Superset, 10Patch-For-Review: Upgrade Superset to 1.4.2 - https://phabricator.wikimedia.org/T304972 (10BTullis) 05Open→03Resolved a:03BTullis [16:26:56] 10Data-Engineering, 10Data-Engineering-Kanban, 10Cassandra, 10User-Eevans: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600 (10JArguello-WMF) > Now we (Data-Engineering) need to adapt when the aqsloader user comes with a password: @JAllemandou Do you know if Data Engine... [16:34:43] 10Data-Engineering, 10Data-Engineering-Kanban, 10Cassandra, 10User-Eevans: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600 (10JAllemandou) >>! In T305600#8033861, @JArguello-WMF wrote: >> Now we (Data-Engineering) need to adapt when the aqsloader user comes with a passw... 
[16:37:45] (03CR) 10Eevans: [C: 04-1] "I'm confused here, how is wiki duplicative? As originally specified, `wiki` is a qualifier for the article the suggestion is for (since p" [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [16:55:21] ottomata - would you come back to batcave to talk refine quickly? [17:01:32] (03CR) 10Kosta Harlan: image-suggestions-feedback: Drop duplicative wiki field, adjust docs (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [17:12:00] 10Data-Engineering, 10Foundational Technology Requests, 10Product-Analytics: "Source of truth" dataset for pageviews - https://phabricator.wikimedia.org/T310732 (10mpopov) **Note to future self**: not moving to Tracking because this will be a collaboration [17:12:23] 10Data-Engineering, 10Foundational Technology Requests, 10Product-Analytics: "Source of truth" dataset for pageviews - https://phabricator.wikimedia.org/T310732 (10mpopov) p:05Triage→03High [17:25:15] !log installing presto 0.273.3 on an-test-coord1001 and an-test-presto1001 [17:25:17] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:25:41] joal: apparently my IRC wasn't connected [17:25:44] sorry I missed! [17:25:45] am here! [17:44:08] random question, on what basis do you decide between adding more tables to an existing hive database, vs creating a database for a specific use case? 
We have a new use case that creates 14 tables and it seems like it could be its own database, but only because 14 seems like a big number [17:44:39] i don't think we have such a basis :p [17:44:49] ok, so same as us :) [17:44:59] i'm not aware of any reason to limit the number of tables in a database [17:45:07] the event database has hundreds [17:46:42] Yea i suppose there are no technical reasons to prefer a particular naming, it perhaps ends up being about discoverability / how confused people get looking for specific tables [17:46:48] ya [17:48:19] Heya ottomata - wanna sync for 10 mins ? I'm in meeting after [17:49:38] 10Analytics, 10API Platform (Product Roadmap), 10Code-Health-Objective, 10Epic, and 3 others: AQS 2.0: Create repository for shared functions - https://phabricator.wikimedia.org/T311541 (10FGoodwin) [17:50:24] AH yes [17:50:28] i have 10 mins joal too [17:50:36] In da cave! [18:21:55] 10Data-Engineering, 10Data-Engineering-Kanban, 10Airflow: [Airflow] Proof of concept of Cassandra loading - https://phabricator.wikimedia.org/T307935 (10mforns) [18:24:32] kinit [18:24:36] woops :) [18:29:39] :v ^ that was fun [18:30:52] hehe :) [18:57:21] (03CR) 10Eevans: [C: 04-1] image-suggestions-feedback: Drop duplicative wiki field, adjust docs (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/809150 (https://phabricator.wikimedia.org/T302925) (owner: 10Kosta Harlan) [19:04:50] 10Data-Engineering, 10Data-Engineering-Kanban, 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10Ottomata) @XCollazo-WMF, @milimetric, @JAllemandou analytics test presto is upgraded to 0.273.3 with an `analytics_test_iceberg` catalog conf... [19:06:54] ottomata: is it OK if we deploy refinery-source and refinery? We need it to deploy our Airflow job. 
[19:09:43] (03PS1) 10Mforns: Update changelog.md for v0.2.2 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/809242 [19:10:13] (03CR) 10Mforns: [V: 03+2 C: 03+2] "Merging for deployment train." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/809242 (owner: 10Mforns) [19:10:21] mforns: please! [19:10:24] sorry i didn't do that! [19:10:28] oh let me merge one thing [19:10:39] (03CR) 10Ottomata: [C: 03+2] Suppress useless GeocodeDatabaseReader log warn messages about 127.0.0.1 not found in database [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/808905 (owner: 10Ottomata) [19:10:49] ok merged, proceed mforns [19:10:54] i need to run an errand rn, be back soon! [19:13:17] Starting build #107 for job analytics-refinery-maven-release-docker [19:15:25] 10Data-Engineering, 10Data-Engineering-Kanban, 10Patch-For-Review: Upgrade to latest PrestoDB and enable iceberg support - https://phabricator.wikimedia.org/T311525 (10XCollazo-WMF) > I'm worried that the Presto Iceberg connector might not have kerberos support? Typically, you can pass these details down wi... 
[19:27:18] ottomata: for when you're back - the iceberg catalog is failing with some very limited info: Query 20220628_192541_00006_3rk4b failed: analytics-test-hive.eqiad.wmnet:9083: null [19:28:04] Project analytics-refinery-maven-release-docker build #107: 09SUCCESS in 14 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/107/ [19:29:56] ottomata: if you wish to test, you have an iceberg table on the test cluster: joal.navigationtiming_iceberg [19:32:50] ottomata: presto in test works with hive catalog (tested with webrequest) [19:36:50] ok - gone for tonight - see you tomorrow folks [19:50:29] 10Data-Engineering: Public Druid cluster leftovers from old mw history snapshots - https://phabricator.wikimedia.org/T311547 (10Milimetric) [20:19:24] !log starting refinery deploymenty [20:19:26] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:19:38] !log starting refinery deployment for refinery-source v0.2.2 [20:19:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:23:37] (03PS3) 10NOkafor: removed hive.mapred.mode = nonstrict Updated the usage from hive to spark2-sql Added repair_partitions.hql file to hql path [analytics/refinery] - 10https://gerrit.wikimedia.org/r/808888 (https://phabricator.wikimedia.org/T309718) [20:27:25] (03PS7) 10Snwachukwu: Add projectview hql scripts to analytics/refinery/hql path. [analytics/refinery] - 10https://gerrit.wikimedia.org/r/797240 (https://phabricator.wikimedia.org/T309023) [20:27:28] (03PS4) 10NOkafor: Removed hive.mapred.mode = nonstrict Updated the usage from hive to spark2-sql Added repair_partitions.hql file to hql path [analytics/refinery] - 10https://gerrit.wikimedia.org/r/808888 (https://phabricator.wikimedia.org/T309718) [20:28:31] (03CR) 10Mforns: [V: 03+2 C: 03+2] "LGTM!" 
[analytics/refinery] - 10https://gerrit.wikimedia.org/r/808888 (https://phabricator.wikimedia.org/T309718) (owner: 10NOkafor) [20:36:00] 10Analytics, 10API Platform (Product Roadmap), 10Code-Health-Objective, 10Epic, and 3 others: AQS 2.0: Create repository for shared functions - https://phabricator.wikimedia.org/T311541 (10Eevans) Something to keep in mind: If the functions in question are (or can be made) something general to any service... [20:57:18] !log refinery deploy failed and I rolled back successfully, will try and repeat tomorrow when other people are present :] [20:57:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [22:10:15] 10Data-Engineering, 10Foundational Technology Requests, 10Product-Analytics: "Source of truth" dataset for pageviews - https://phabricator.wikimedia.org/T310732 (10Mayakp.wiki)