[02:02:47] !log soda@tools-bastion-15 tools.yapping-sodium soda built and uploaded a new version
[02:02:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.yapping-sodium/SAL
[06:29:33] quarry.wmcloud.org is extremely slow. The query itself finished, but the results page is still not finished (query at 06:12:38 UTC, now it is 6:29)
[06:36:58] Aha! Reload helped. The server seems to have restarted between finishing the query and showing the results
[08:13:15] Indeed quarry has been a bit flaky this weekend :/, https://lists.wikimedia.org/hyperkitty/list/cloud-admin-feed@lists.wikimedia.org/, will open a task
[08:46:25] fyi. opened T430486
[08:46:26] T430486: [quarry] flaky during the weekend - https://phabricator.wikimedia.org/T430486
[09:05:28] dcaro: thanks for opening the task, I saw some flakiness last month as well, we should have a look
[13:15:51] !log paws rm -rf /srv/paws/files-to-remove/* on paws-nfs-2
[13:15:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[13:37:18] High rate of production errors from Italy https://www.wikimediastatus.net/
[13:39:58] (Error 500 occurred while visiting the homepage of mediawiki.org, and for other users too - https://t.me/itwikipedia/134031 )
[13:44:36] @bozzy, I think a fix is underway. Also: this isn't really the right place to report production errors; I'm not sure where the perfect place is but you might try #wikimedia-tech next time.
[13:44:44] whopsie
[13:45:24] no worries!
[13:47:09] (still no Matrix bridge with IRC wikimedia-tech, right? https://www.mediawiki.org/wiki/MediaWiki_on_IRC )
[13:47:22] (and no Telegram bridge, etc)
[13:55:57] oh, no idea, maybe not
[14:51:34] Hi andrewbogott: Do you have time today for another round of testing of https://gerrit.wikimedia.org/r/c/operations/puppet/+/1302978 ?
[14:56:50] dancy: yes, probably. I read it last week but need to read it again.
[14:57:04] I think you'll find it a lot simpler now
[14:57:08] Easy to understand
[15:27:11] dancy: I cherry-picked your patch onto the puppetserver so you can test
[15:27:22] ok. Trying now
[15:50:53] andrewbogott: Please drop the cherry-pick and reapply with patchset 15.
[15:51:04] ok!
[15:52:23] done
[15:56:04] !log gergesshamon@tools-bastion-15 tools.wikimonitor-beta [DEPLOY] Starting deployment | ref=refactor/migrate-to-mediawiki4j
[15:56:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor-beta/SAL
[15:56:34] oops dancy, done
[15:56:45] thx
[15:57:34] !log gergesshamon@tools-bastion-15 tools.wikimonitor-beta [DEPLOY] Build triggered successfully | ref=refactor/migrate-to-mediawiki4j
[15:57:34] !log gergesshamon@tools-bastion-15 tools.wikimonitor-beta [DEPLOY] Restarting buildservice | ref=refactor/migrate-to-mediawiki4j
[15:57:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor-beta/SAL
[15:57:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor-beta/SAL
[15:57:45] !log gergesshamon@tools-bastion-15 tools.wikimonitor-beta [DEPLOY] Deployment completed successfully | ref=refactor/migrate-to-mediawiki4j
[15:57:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor-beta/SAL
[16:02:17] andrewbogott: Success!
[16:03:05] Great!
[16:04:37] !log gergesshamon@tools-bastion-15 tools.wikimonitor [DEPLOY] Starting deployment | ref=v1.5.6
[16:04:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor/SAL
[16:06:10] !log gergesshamon@tools-bastion-15 tools.wikimonitor [DEPLOY] Build triggered successfully | ref=v1.5.6
[16:06:10] !log gergesshamon@tools-bastion-15 tools.wikimonitor [DEPLOY] Restarting buildservice | ref=v1.5.6
[16:06:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor/SAL
[16:06:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor/SAL
[16:06:22] !log gergesshamon@tools-bastion-15 tools.wikimonitor [DEPLOY] Deployment completed successfully | ref=v1.5.6
[16:06:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikimonitor/SAL
[16:11:15] andrewbogott: I'm ready for the change to be merged when you are.
[16:13:38] dancy: will you document this feature on https://wikitech.wikimedia.org/wiki/Help:Project_puppetserver ? Hopefully that page doesn't need a total rewrite.
[16:14:04] andrewbogott: Will do
[16:14:09] also: I turn out to need to run for a bit so I'm going to revert the cherry-pick and will plan to merge later on in the day if I get free.
[16:14:25] Sounds good. lemme know when it's done and I'll start working on docs then.
[16:14:47] will do
[16:14:53] or, at least, gerrit will
[18:17:18] dancy: merged
[18:17:38] Thanks for helping get that through. I'll work on the docs.
[18:18:16] thanks!
[20:20:05] Krinkle: do you know about the cvn project to know if/when I can safely break things? I need to build a new NFS server there, and the switchover will cause some reboots and a few minutes of downtime. T429793
[20:20:06] T429793: Update all cloud-vps Bullseye NFS servers - https://phabricator.wikimedia.org/T429793
[20:20:37] Nemo_bis, same question about the 'dumps' project.
Do I need to synchronize downtime or can I just reboot things whenever?
[20:39:59] * andrewbogott opens phab tickets about both of the above
[20:46:48] andrewbogott: I'm not sure I understand the impact. We do actively read-write from there, from cvn-app and cvn-apache servers, if that's what you're asking.
[20:47:21] https://phabricator.wikimedia.org/T430589
[20:48:23] during the switchover the nfs clients will likely hang, then I'll reboot them and they'll re-attach to the new nfs server
[20:48:37] If things go poorly there might be 10-15 minutes of downtime but hopefully less than that.
[20:48:54] Can the servers shut down first, and wake up in the new world unaware of any difference?
[20:49:10] I don't mind 5min downtime
[20:49:27] yeah, can shut them down first.
[20:49:40] I'd prefer not having to think about what the processes are going to do with nfs in a confused state.
[20:49:44] But of course I can never be 100% sure that they will work perfectly the first time when they come back.
[20:50:07] It wouldn't so much be 'confused state' as 'not there at all'. But it's easy for me to stop things and restart them later.
[20:50:27] do you care when it happens?
[20:50:39] OK. well, give me a few days then to check how we're using it. If I can rule out CVNBot's running code then the rest I can reason about
[20:50:53] maybe a live change and then a reboot would be fine.
[20:51:22] sure. Want to just follow up on the task once you know when is a good time for things to be shut off?
[20:51:26] ack
[20:51:38] thanks!