[08:33:57] !log btullis@aqs1007:~$ sudo nodetool-b clearsnapshot
[08:34:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:42:43] Analytics, WMDE-Analytics-Engineering, WMDE-FUN-Team, WMDE-Fundraising-Tech, User-GoranSMilovanovic: Sync with https://analytics.wikimedia.org/published/datasets/ from stat1008 - https://phabricator.wikimedia.org/T293112 (GoranSMilovanovic) Open→Invalid Ok, this seems to be in sync no...
[09:15:27] Analytics-Radar, Event-Platform, WMF-JobQueue, Wikibase change dispatching scripts to jobs, and 2 others: Queuing jobs is extremely slow - https://phabricator.wikimedia.org/T292048 (TheDJ) >>! In T292048#7409006, @Ladsgroup wrote: > Increasing the replicas also reduced the save time p75: https://...
[09:56:57] (PS1) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[09:57:31] (CR) jerkins-bot: [V: -1] Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[10:08:08] (PS2) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[10:08:43] (CR) jerkins-bot: [V: -1] Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[10:11:46] (PS3) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[10:12:33] (CR) jerkins-bot: [V: -1] Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[10:12:41] meh...
[10:21:06] Analytics-Radar, Data-Engineering, Event-Platform: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (elukey) To recap the next steps: * Create the new kafka intermediate config in cfssl * Add some code like Ben's presto config to `profile::ka...
[10:22:59] Analytics-Radar, Data-Engineering, Event-Platform, SRE: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (elukey)
[10:28:23] Analytics-Radar, Data-Engineering, Event-Platform, SRE: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (Joe) For the record, we've created a `wmf-certificates` debian package that includes the puppet CA and the internal PKI created by @j...
[10:28:53] (PS4) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[10:31:20] (CR) DCausse: "Will send the corresponding MW patch to assess the feasibility of this" [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[11:47:32] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) The fifth snapshot has finished loading. ` progress: [/10.64.32.128]0:2230/2230 100% [/10.64.32.145]0:2804/2804 100...
[11:53:49] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) Reclaiming the space from `aqs1014:/srv/cassandra-a/tmp` ` root@aqs1014:/srv/cassandra-a/tmp# df -h /srv/cassandra...
[13:17:28] Analytics-Radar, Event-Platform, WMF-JobQueue, Wikibase change dispatching scripts to jobs, and 2 others: Queuing jobs is extremely slow - https://phabricator.wikimedia.org/T292048 (Ladsgroup) Timing doesn't match but can these be related {T267061}?
[13:48:18] Analytics-Radar, Event-Platform, WMF-JobQueue, Wikibase change dispatching scripts to jobs, and 2 others: Queuing jobs is extremely slow - https://phabricator.wikimedia.org/T292048 (Ladsgroup) >>! In T292048#7429740, @Ottomata wrote: >> Increasing the replicas also reduced the save time p75 > Wow...
[13:55:16] (PS5) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[13:55:59] (CR) jerkins-bot: [V: -1] Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195) (owner: DCausse)
[13:58:13] (PS6) DCausse: Add fragment/mediawiki/revision/slot [schemas/event/primary] - https://gerrit.wikimedia.org/r/731006 (https://phabricator.wikimedia.org/T293195)
[14:03:58] Analytics, Data-Engineering, Event-Platform, Wikidata, and 3 others: Add MCR slot information to revision-create events - https://phabricator.wikimedia.org/T293195 (dcausse)
[14:05:35] Analytics, Data-Engineering, Event-Platform, Wikidata, and 3 others: Add MCR slot information to revision-create events - https://phabricator.wikimedia.org/T293195 (dcausse) a:dcausse
[14:12:26] Analytics, Data-Engineering, Event-Platform, Wikidata, and 3 others: Add MCR slot information to revision-create events - https://phabricator.wikimedia.org/T293195 (dcausse) The two patches are up for discussions and add a new array field (not a big fan of this but could not find a better way to...
[14:22:57] Analytics, Analytics-Kanban, Data-Engineering: Snapshot and Reload cassandra2 pageview_per_article data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) ` ### Moving table data in keyspace local_group_default_T_pageviews_per_article_flat for instance b on aqs1012 to /...
[15:05:17] Analytics, Analytics-Kanban, Data-Engineering, SRE Observability (FY2021/2022-Q2), User-fgiunchedi: Migrate analytics cluster alerts from Icinga to AlertManager - https://phabricator.wikimedia.org/T293399 (BTullis)
[15:21:33] Analytics, Analytics-Kanban, Data-Engineering, SRE Observability (FY2021/2022-Q2), User-fgiunchedi: Migrate analytics cluster alerts from Icinga to AlertManager - https://phabricator.wikimedia.org/T293399 (BTullis) Doing a quick search of the puppet repo for these alerts reveals the following...
[15:30:18] (CR) Mforns: [C: +1] "Code looks good to me! Will +1." [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/730389 (owner: Tim Starling)
[16:07:35] (CR) Cicalese: [C: +1] Update PHP and version queries (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/730389 (owner: Tim Starling)
[17:20:33] (PS1) Clare Ming: Add new scroll schema. [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586)
[17:22:07] Analytics-Clusters: Cronjob on stat1005 failed at library importing - https://phabricator.wikimedia.org/T293506 (jwang)
[17:22:09] (CR) jerkins-bot: [V: -1] Add new scroll schema. [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[19:21:49] Analytics-Clusters: Cronjob on stat1005 failed at library importing - https://phabricator.wikimedia.org/T293506 (jwang) Additional info for the failure log I got while reinstalling matplotlib ` jiawang@stat1005:~/venv/bin$~/venv/bin/pip3 install https://github.com/matplotlib/matplotlib.git Collecting https:...
[19:30:49] Analytics-Radar, Anti-Harassment, CheckUser, Privacy Engineering, and 2 others: Deal with Google Chrome User-Agent deprecation - https://phabricator.wikimedia.org/T242825 (dr0ptp4kt) @ovasileva @SCherukuwada @Jdlrobson @SWakiyama @CBogen @MarkTraceur @DVrandecic @CBlanton @Jdforrester-WMF probabl...
[19:31:04] (CR) Clare Ming: "I seem to be getting caught in a validation loop:" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[19:31:56] Analytics-Radar, Anti-Harassment, CheckUser, Privacy Engineering, and 2 others: Deal with Google Chrome User-Agent deprecation - https://phabricator.wikimedia.org/T242825 (dr0ptp4kt) @jlinehan heads up.
[19:38:06] Analytics-Radar, Anti-Harassment, CheckUser, Privacy Engineering, and 3 others: Deal with Google Chrome User-Agent deprecation - https://phabricator.wikimedia.org/T242825 (LGoto)
[19:54:22] Quarry, cloud-services-team (FY2021/2022-Q1): Develop Quarry tests - https://phabricator.wikimedia.org/T210359 (Andrew) Open→Resolved Up to 80% unit test coverage; setting this aside for now.
[20:45:40] (CR) Jdlrobson: "Andrew, could you advise here? Should we be putting this in the legacy folder or doing something differently?" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/731156 (https://phabricator.wikimedia.org/T292586) (owner: Clare Ming)
[21:06:31] Analytics-Clusters: Cronjob on stat1005 failed at library importing - https://phabricator.wikimedia.org/T293506 (jwang) Open→Resolved a:jwang Have fixed issue.