[08:04:12] rzl: which workaround? have you followed https://wikitech.wikimedia.org/wiki/Ipmi to fix the remove ipmi?
[08:33:05] really nice article https://tenthousandmeters.com/blog/python-behind-the-scenes-13-the-gil-and-its-effects-on-python-multithreading/
[11:03:38] hi :) I'm trying to add a new secret to pwstore, and getting "Warning: the following recipients are invalid: B72D8E5552F08C6220BDCC857EECD29261650579"
[11:06:30] it matches the ID for Lukasz's key, which looks valid and not expired...
[11:08:25] did you run "pws update-keyring?" It will rebuild the keyring used by pws with the keys stored in the keys/ sub dir of the repo
[11:09:30] I didn't, let me try that
[11:10:50] that fixed it! thanks moritzm :)
[11:15:13] excellent :-)
[11:17:00] No sirenbot come back
[11:17:05] Aww
[11:17:07] :)
[11:24:11] <_joe_> claime: still no autoop, sigh
[11:25:18] Yeah
[11:28:47] It's ircservserv's job right? I think it's not setup for the other channel, and I lack the rights to check sync and config updates
[11:29:23] <_joe_> claime: it was synced
[11:30:02] <_joe_> btullis: do you mind if I merge https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/860518 and let you handle the deployment at your convenience?
[11:30:10] <_joe_> it's blocking some other work of mine
[11:33:27] _joe_: ircservserv hasn't rejoined since the last netsplit
[11:33:32] 11:33 -- ircservserv-wm: No such nick/channel
[11:33:33] 11:33 -- [ircservserv-wm] End of /WHOIS list.
[11:33:45] May explain why it's crapping out
[11:34:58] <_joe_> claime: no it's chanserv that gives auto op
[11:35:15] <_joe_> ircservserv just verifies the permissions are set with chanserv
[11:37:02] sirenbot has the +o chanserv flag, which makes it possible to use `/msg ChanServ OP `, you need +O to auto op on join
[11:37:43] <_joe_> taavi: oh we didn't add +O ?
[11:37:49] <_joe_> I must have misread the docs
[11:42:23] Yeah I just landed on the same
[17:12:18] https://github.com/aaronjanse/dns-over-wikipedia is highly amusing or amazing or sad depending how you look at it: this hacks your local dns (via dnsmasq) or via browser extension for rediects, so that if you do hostname looks on "something.idk", it hits our API to search for "something" and extracts the official URL for whatever it is from our infobox and redirects to that hostname...
[17:12:27] *redirects
[17:31:31] volans: that was re the orange box, "The fix is to manually run 'puppet node deactivate ' on the puppetmaster followed by running puppet agent on the Icinga server"
[17:32:28] I don't follow rzl
[17:32:44] the decom cookbook does *a lot* of steps, AFAICT your host had an issue with remote IPMI
[17:32:59] what have that to do with the puppetmaster and icinga?
[17:33:24] I'm looking at the box under https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Reclaim_to_Spares_OR_Decommission that begins "Every once in a while the remote IPMI command fails"
[17:35:11] mmmh the decom cookbook does check the remote IPMI connection before starting
[17:35:15] and warn the user if it doesn't
[17:35:48] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/cookbooks/+/refs/heads/master/cookbooks/sre/hosts/decommission.py#256
[17:35:53] right, I got past that with no trouble, it only failed at the power-off step
[17:42:33] mmh that's weird
[17:42:40] afaict yesterday and today there's no trace of those hosts left anywhere, is there other followup I should be doing?
[17:44:27] was the host powered off?
[17:44:46] the post action is to poweroff the host
[17:45:10] not certain, but if that's the only item I can just add a note on the dcops ticket
[17:47:20] I just powered it off
[17:49:09] oh, thanks -- it was both mw1312 and mw1320
[17:49:11] how did you get it done?
[17:49:47] https://phabricator.wikimedia.org/T306162#8439638
[17:49:57] or you can just retry running the decom cookbook
[17:49:59] it's idempotent
[17:50:43] huh okay, I thought I had tried that command (with the FQDN, before removing from DNS) but I must have had something wrong, I'll recheck in my history
[17:50:56] thanks for the cleanup, I'll do the same thing on 1320 now
[17:52:00] that notice on the wikitech page was for the manual steps
[17:52:13] for the cookbook ones it poweroff and re-run the cookbook
[17:52:30] if you can ssh in the host even with just shutdown :D
[18:01:23] volans: am I doing this right, for mw1320? https://www.irccloud.com/pastebin/VJSw97nA/
[18:01:30] address via https://netbox.wikimedia.org/dcim/interfaces/2751/
[18:02:28] that might be broken, see the Ipmi page on wikitech
[18:02:30] for troubleshooting
[18:02:39] or if you an ssh into the host
[18:02:43] shutdown from there
[18:02:48] or ask dcops
[18:12:09] okay, thanks - haven't had any luck but I'll ask dcops
[20:35:17] volans: clarified at https://wikitech.wikimedia.org/w/index.php?title=Server_Lifecycle&diff=2038018&oldid=2027917
[21:26:53] so when I look at a superset dashboard, the "Webrequest Sampled 128", I have to multiply the request numbers by 128 to get the real numbers?
[21:28:10] by the way, it's the second time within days that it's been useful in separate occasions. that was a nice talk in the last SRE meeting
[22:44:44] mutante: I believe that is correct
[22:49:46] jhathaway: ACK, thx
[22:51:28] I was looking at traffic for host "git.wikimedia.org" and it's (if multiplied by 128) about 100 requests per day or 42k in the last year. if not multiplied that's 329 total in a full year
[22:51:32] https://phab.wmfusercontent.org/file/data/bxdtcdwgi6ejkw2aiyep/PHID-FILE-q4fsafm33plprl4ry2hb/Screenshot_from_2022-12-02_13-23-53.png
[22:52:11] the question was "how much traffic does this even get", since that's not gerrit and not gitlab
[23:43:15] claime: please feel free to merge that eventgate CR. Apologies for having missed the ping earlier.
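
Note on the 21:26 to 22:51 exchange: the scaling of sampled counts discussed there can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming "Sampled 128" means roughly 1 in 128 requests is kept; the function and variable names are illustrative and not part of Superset or any Wikimedia tooling.

```python
# Assumption: the "Webrequest Sampled 128" dataset keeps about 1 in 128 requests,
# so an estimate of real traffic is the sampled count times 128.
SAMPLING_FACTOR = 128


def estimate_total(sampled_count: int, factor: int = SAMPLING_FACTOR) -> int:
    """Scale a sampled request count up to an estimated real count."""
    return sampled_count * factor


sampled_per_year = 329  # figure quoted in the log for git.wikimedia.org
total_per_year = estimate_total(sampled_per_year)
per_day = total_per_year / 365

print(total_per_year)   # 42112, i.e. the "42k in the last year"
print(round(per_day))   # 115, i.e. "about 100 requests per day"
```

The result is an estimate, not an exact count: with so few sampled hits, the real yearly total could differ noticeably from 42k.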