[10:02:29] puppet-compiler, Infrastructure-Foundations, Patch-For-Review: PCC group changes/hosts by whether they change resource parameters only or not - https://phabricator.wikimedia.org/T330484 (jbond) Open→Resolved a:jbond I have added notify to the list of core resources exclusions. I'll close...
[11:24:31] puppet-compiler, Infrastructure-Foundations: puppet-compiler: support diffing file resource source uris - https://phabricator.wikimedia.org/T330732 (Peachey88)
[13:09:32] can I reimage sretest1002 or is it currently used for any other reimage tests?
[13:10:02] go ahead for me, but maybe was used by john to test firmware upgrade changes
[13:20:57] all good for me
[13:24:51] ack, thx
[15:02:42] netbox, Infrastructure-Foundations, Patch-For-Review: Upgrade Netbox to 3.2 - https://phabricator.wikimedia.org/T296452 (jbond) during the recent DC switch over netbox got moved to codfw and it was super slow. this means that in the current set up: * active/active may not be the best idea * we nee...
[15:11:29] topranks: hey, you around by any chance?
[15:14:32] volans: hey yep what’s up?
[15:15:31] I'm debugging a weird failure of a dns patch that failed because of weird autogenerated data:
[15:15:34] https://integration.wikimedia.org/ci/job/operations-dns-lint-docker/4437/console
[15:16:20] AFAICT those direct and PTR records were removed in commit cacfe24968b8137f783e4e28872290c2bc516bb4 in the generated repo
[15:16:36] so I was wondering if by chance this could ring a bell to you
[15:18:13] but that's back from Jan 26th...
[15:18:39] yeah we removed that GRE and I tidied things up
[15:18:45] just refreshing my memory
[15:20:44] as far as I remember we removed them completely from Netbox
[15:21:11] I merged this patch to remove the 'include' for the autogenerated file for that /64 then:
[15:21:12] https://gerrit.wikimedia.org/r/c/operations/dns/+/883942/
[15:21:51] yeah
[15:25:28] topranks: slightly related, akosiaris found https://netbox.wikimedia.org/ipam/ip-addresses/2937/
[15:25:47] assignment is gr-3/3/0.2, dns is gr-4-3-0-2
[15:26:55] ah good spot!
[15:27:08] I'll change that, whenever this mess is cleared up
[15:27:12] basically I can't understand why CI got the wrong data
[15:27:18] and why
[15:27:24] unless...
[15:27:32] * volans has an idea
[15:27:36] yeah I'm trying to find some skeletons here but haven't yet
[15:28:09] give me 5 to try something
[15:31:39] no, failed attempt
[15:37:36] 208.80.154.220/31 was definitely the v4 prefix used on the removed tunnel
[15:37:47] matching what's in the CI output, but it really doesn't make sense to me
[15:43:59] jbond: what's our level of confidence of running netbox in codfw?
[15:44:05] mine is pretty low at this stage
[15:44:16] XioNoX too ofc if you're around
[15:45:23] I'd probably revert it to eqiad as this setup has never been tested (at least in recent times)
[15:46:34] volans: +1
[15:46:58] i have not used it in codfw, XioNoX may have more experience if not i agree it's an unknown
[15:47:18] volans: also in case you missed it https://phabricator.wikimedia.org/T296452#8653039
[15:47:34] no I didn't miss it, I agree :D thanks for adding it there
[15:48:58] jbond: ack, then I'll ask clement to revert back netbox to eqiad
[15:49:54] volans: we could also do it, would be the same https://phabricator.wikimedia.org/P44902 | sed 's/apt/netbox/'
[15:50:12] or the cookbook
[15:50:37] the cookbook that checks all the records
[15:51:31] sre.discovery.service-route i think but it's quite new and i have not used it
[16:03:51] volans: I didn't know it was live in codfw :)
[16:04:27] but +1 on keeping it in eqiad for now
[16:04:44] XioNoX: it was switched today as part of the services switchover
[16:05:15] I see
[16:05:32] and it's super slow
[16:05:35] try changelog
[16:08:13] https://grafana.wikimedia.org/d/DvXT6LCnk/netbox?orgId=1&from=now-3h&to=now&viewPanel=17
[16:09:03] ideally we would be able to separate read/write and have them read from DC-local, but afaik it's not possible
[16:09:37] 2nd option would be to switchover netbox DB as well, but that's another can of worms
[16:10:22] yeah netbox doesn't have RO/RW separation IIRC at the db level
[16:10:28] I've commented in https://phabricator.wikimedia.org/T330651#8653355 and asked in -sre
[16:17:40] netbox back to eqiad and in normal speed
[16:17:54] TODO: find the proper way to make netbox a/a or a/p with switchover procedures
[21:03:46] CAS-SSO, Infrastructure-Foundations, SRE, serviceops-collab, and 3 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (demon) From the looks of it, we can add OIDC as a second [omniauth provider](https://docs.gitlab.com/ee/integration/omniauth.html). We...