[08:51:35] looks like we lost a PDU in codfw
[09:08:36] Amir1: can you give mailman a reload, so it picks up the new dbproxy for m5?
[09:18:19] marostegui: on it
[09:18:23] thanks!
[09:19:31] marostegui: done now. Is it fixed?
[09:19:43] yes!
[09:19:49] going to proceed then!
[09:19:50] Thanks
[11:19:16] folks qq - I am following https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Manual_installation after reimaging a VM, and I'd need to update netbox
[11:19:20] but the related link seems old
[11:19:39] so before running any scripts in https://netbox.wikimedia.org/extras/scripts/ I'll wait for a confirmation :D
[11:20:12] (the old script was ImportPuppetDB)
[11:20:32] I guess https://netbox.wikimedia.org/extras/scripts/interface_automation.ImportPuppetDB/ then
[11:20:58] elukey: indeed!
[11:21:49] yep all good, it worked! Will fix the link in the wiki page as well
[11:22:05] thanks!
[11:22:44] <3
[12:06:34] thanks indeed!
[13:16:11] is anyone else having trouble reaching https://lists.wikimedia.org ?
[13:17:00] won't load for me
[13:17:01] yes
[13:17:48] Looks like it finally did load for me, did anyone do anything?
[13:43:55] XioNoX: I think top.ranks is still limited availability; would you mind being gently around while I bring ms-fe1012 into LVS service, please? There was a suggestion in T294137 that this would be a good idea since it's in a new cage, and we did see a slightly odd behaviour with it last week.
[13:43:56] T294137: Q2:(Need By: TBD) rack/setup/install ms-fe1009-1012 - https://phabricator.wikimedia.org/T294137
[13:44:50] [if suitable, I was aiming for any time in the next few hours]
[13:47:27] Emporer: hey I'm back fully now so I should be able to assist
[13:48:14] Hopefully all goes according to plan; probably the best approach is you just proceed as normal but give me a heads-up when starting
[13:48:36] I'll check things look as expected and hopefully we hit no issues; if we do we can deal with it quickly
[13:49:16] topranks: ah, cool, thanks. I'm good to go now if that works for you?
[13:49:16] Emperor: even :)
[13:49:29] Yep
[13:49:40] Gimme 2 mins to log onto the relevant devices
[13:51:55] btw I'm not sure when I'll have time to work on T303725 myself but I'm happy to review patches for it -- this flavor of issue has come up several times just in the past month
[13:51:56] T303725: Extend NEL headers to sites not fronted by CDN - https://phabricator.wikimedia.org/T303725
[13:52:04] topranks: Ah, I missed that; have pooled and it looks good from the swift metrics side
[13:52:26] you know things are bad when pa.ravoid is worrying about things like the DON'T FRAGMENT bit
[13:54:16] :)
[13:54:21] cdanis: lol yeah. RFC 8900 springs to mind :)
[13:55:15] topranks: ms-fe1012 looking good so far from my end
[13:55:28] Emperor: no probs, yes, just looking at it here
[13:55:33] don't see any signs of issue :)
[13:56:40] topranks: grand, thanks. I'll bring the rest of the new proxies online now, and give them a day or two to settle before removing the old ones. Thanks for your help :)
[13:57:17] cool, cheers for the heads up, I'll check a few more things here but yeah all looking ok.
[13:57:29] \o/
[14:34:17] o/ do we have a ticket for tracking updating our CAS integration with yubikeys and chrome and the webauthn api?
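[ed. note: for readers following the ms-fe1012 pooling above, a minimal sketch of the two usual ways to pool/depool a host, assuming the host is conftool-managed and ships the wrapper scripts (not all services do); the hostname is taken from the conversation but the exact invocations are illustrative:

  # on the host itself, if it ships the wrapper scripts
  $ sudo depool   # drop this host from its LVS pools
  $ sudo pool     # add it back once it looks healthy

  # or remotely, from a cumin host, with the confctl CLI
  $ sudo confctl select 'name=ms-fe1012.eqiad.wmnet' set/pooled=yes
]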
(https://www.yubico.com/blog/google-chrome-u2f-api-decommission-what-the-change-means-for-your-users-and-how-to-prepare/)
[14:34:35] wondering what the timeline is, and if I should switch my default browser (if the timeline is long)
[14:35:50] https://phabricator.wikimedia.org/T296629
[14:57:43] moritzm: ^
[15:13:11] ottomata: some time next month more or less
[15:24:53] okay thank you, I will not change default browsers then :)
[15:26:04] doh, seems the secret master plan to switch all WMF staff to Firefox doesn't work out in the end :-)
[15:26:49] it's going like the last plan to switch everyone to the correct OS :-P
[18:56:09] If I want to pool/depool nodes from LVS, that's confctl, right? I don't need to touch the puppet repo?
[18:56:22] correct
[18:57:25] some services ship a pool/depool script on the host itself
[18:57:53] or you can manage multiple ones from the cumin hosts with the confctl CLI, inflatador
[18:58:18] Thanks volans! Not planning on changing anything ATM, just wanted to make sure I'm in the right place ;)
[18:58:35] what service are you looking at?
[18:59:30] all the wdqs services, been looking at https://config-master.wikimedia.org/pybal/eqiad/wdqs etc for pools
[19:00:16] some time soon, we'll fail them over to CODFW so we can do work related to https://phabricator.wikimedia.org/T302494
[19:01:30] does it have a discovery record? to depool/pool an entire DC for a service it might be done via confctl or not, depending how that's configured
[19:01:33] * volans checking
[19:03:05] inflatador: so to get some ideas about the data, from a cumin host you can run:
[19:03:10] "These clusters are in active/active mode (traffic is sent to both), but due to how we route traffic with GeoDNS, the primary cluster (usually eqiad) sees most of the traffic."
[19:03:18] per https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service#Deployment
[19:03:21] $ confctl select 'cluster=wdqs' get | sort
[19:03:39] that gives you an idea of how each host is pooled/depooled in which service/DC
[19:03:47] while
[19:03:50] $ confctl --object-type discovery select 'dnsdisc=wdqs.*' get
[19:04:22] will give you how discovery services matching wdqs.* are pooled/depooled in each DC
[19:05:25] and it looks like it's pooled only in eqiad to me right now
[19:06:53] Ah, OK, so it looks like I might have to change that dnsdisc (object? whatever it's called) as opposed to messing w/ nodes in pools
[19:07:48] if you want to route the traffic between one DC and the other, yes
[19:08:34] the pools are local to each LVS endpoint (.svc.eqiad.wmnet / .svc.codfw.wmnet)
[19:08:49] see https://wikitech.wikimedia.org/wiki/DNS/Discovery for more info
[19:08:54] on how that works
[19:10:58] for active/active vs active/passive you have to check hieradata/common/service.yaml in puppet
[19:11:16] the services with active_active: True are configured to be a/a
[19:11:52] of course, pool the passive one first before depooling the active one ;)
[19:12:31] volans got it, looks like the puppet config shows active/active and both "sites" used, but the running config is EQIAD only
[19:12:42] and yes, always good to make that very clear ;P
[19:13:38] see also https://wikitech.wikimedia.org/wiki/DNS/Discovery#Failure_scenario ;)
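[ed. note: a sketch of the DC failover flow volans describes above, assuming wdqs has an active/active discovery record named "wdqs"; the object names are illustrative, and per the advice above the passive DC is pooled and verified before the active one is depooled:

  # from a cumin host: check the per-DC pooled state of the discovery record
  $ sudo confctl --object-type discovery select 'dnsdisc=wdqs' get

  # pool codfw (the currently passive side) first
  $ sudo confctl --object-type discovery select 'name=codfw,dnsdisc=wdqs' set/pooled=true

  # once codfw is serving traffic cleanly, depool eqiad
  $ sudo confctl --object-type discovery select 'name=eqiad,dnsdisc=wdqs' set/pooled=false
]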