[05:30:02] 10Traffic, 10Analytics: Review use of realloc in varnishkafka - https://phabricator.wikimedia.org/T287561 (10odimitrijevic) 05Open→03Declined Thanks @BBlack I'll go ahead and close the issue as declined for now and we can reopen as new information comes in. [06:24:07] 10Traffic, 10SRE, 10observability, 10Discovery-Search (Current work): flapping icinga Letsencrypt TLS cert alerts around renewal time - https://phabricator.wikimedia.org/T293826 (10elukey) Another option could be to use httpd from buster-backports, but https://packages.debian.org/buster-backports/apache2 s... [06:39:44] hello folks, varnish-frontend on cp5011 refused to reload due to an ABI mismatch for libvmod_std (ABI mismatch, expected , got ), I have depooled + restarted and it works now [06:40:22] (basically followed https://phabricator.wikimedia.org/T294116) [06:40:38] I haven't repooled yet, so you can check that everything is fine [06:40:45] (otherwise lemme know and I'll repool) [06:46:31] there is also [06:46:32] elukey@cp5011:~$ /usr/local/lib/nagios/plugins/check_vcl_reload [06:46:32] reload-vcl failed to run since 0h, 9 minutes. [06:46:49] but it seems to be due to an old state file [07:25:03] thanks elukey [07:26:28] I seem to have missed it during the rolling restarts! [07:33:12] ack thanks! qq - what should somebody do to clear the reload-vcl alert? [07:33:19] (just to know it for the next time) [07:35:14] so the alert IIRC looks at the string KO vs OK in some file which is written by puppet [07:35:34] probably forcing puppet to issue a vcl reload is the easiest way [07:35:38] let me try [07:36:39] or rather, no need to try: I repooled the host, which triggered a confd-based vcl reload, which cleared the alert [07:37:58] super :) [08:07:41] 10Traffic, 10Observability-Logging, 10SRE, 10Patch-For-Review, 10User-ema: varnishmtail metric loss due to mtail not reading from pipe fast enough - https://phabricator.wikimedia.org/T293879 (10ema) The optimizations to varnishxcache.mtail and varnishreqstats.mtail paid off, time spent in `tryBacktrack`... [10:28:24] volans: No drmrs.yaml / dir inside 'hieradata' it seems anyway. [10:28:48] mmandere, bblack: we were chatting with topranks / XioNoX for trying a first reimage in drmrs [10:28:53] and making a small plan for it [10:29:38] technically we could try the reimage even without the host in puppet's site.pp to test the connectivity/ipmi/pxe/tftp part and knowing that it will fail [10:29:59] or we can start adding all the bits to puppet's hiera and site.pp so that it should actually work [10:30:31] at least as spare host [10:30:42] we can use insetup or insetup_noferm [10:31:04] as roles [10:32:32] oh, didn't know, how do they compare with spare? [10:32:39] from the cookbook code I can't spot anything that could obviously fail, IPMI connection is checked before starting so that would be caught quickly [10:32:57] XioNoX: is what https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Preparation says [10:34:01] ipmi is like the dell idrac interface right? [10:35:03] basically yes, it connects to the mgmt FQDN and allows to issue commands [10:35:15] doesn't have the full idrac capability [10:36:43] ok... and this connection? it's some sort of connection provided by iDrac that relays the serial vty output back over TCP? [10:37:33] LMWTFY :-P https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface [10:37:36] and we do some sort of "expect" style parsing of that? 
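For context on the reload-vcl alert above: the check is described as comparing an OK/KO marker in a state file that puppet- or confd-triggered VCL reloads rewrite, which is why repooling the host cleared it. A minimal sketch of that style of check follows; the state-file path and format are assumptions for illustration, not the real check_vcl_reload plugin.

    #!/bin/sh
    # Hedged sketch of an OK/KO state-file check like the one described above.
    # The path and format are hypothetical; the real plugin may differ.
    STATE=/var/tmp/reload-vcl.state

    [ -r "$STATE" ] || { echo "UNKNOWN: no reload-vcl state file"; exit 3; }

    status=$(awk '{print $1; exit}' "$STATE")                 # assumed "OK" or "KO"
    age_min=$(( ($(date +%s) - $(stat -c %Y "$STATE")) / 60 ))

    if [ "$status" = "OK" ]; then
        echo "OK: last reload-vcl succeeded, ${age_min} minutes ago"
        exit 0
    else
        echo "CRITICAL: reload-vcl failed to run since ${age_min} minutes"
        exit 2
    fi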
I know it's all likely on wikitech so sorry for the questions :) [10:38:34] no, not at all [10:38:41] it's an "api" if you want [10:38:46] we issue commands like [10:38:53] ipmitool -I lanplus -H "$HOST.mgmt.$DC.wmnet" -U root -E chassis power status [10:40:26] ok very good. well that's a lot better. [10:40:59] I was looking at the instructions to add the server to the "linux-host-entries.ttyS1-xxxxxx" files, which I think led me wrong. [10:41:14] where? those should be all gone [10:41:19] doesn't apply anymore [10:41:35] I'm guessing the choice of file / serial speed there is just to set the grub config [10:41:38] oh ok ! [10:41:54] Referenced here: https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Preparation [10:42:15] thanks, let me remove that [10:42:31] XioNoX: ok for me to run the sre.hosts.ipmi-password-reset cookbook for drmrs hosts? [10:46:05] volans: I think, yeah :) [10:46:14] the other option is to install first [10:46:18] and change it later [10:46:19] not sure what it entails [10:46:20] so that in case of issues [10:46:27] we can change it from the host locally [10:46:44] I see [10:47:05] should be fine to run it now [10:47:21] either it will be the real password or the default one [10:47:29] let me try with one host first [10:47:33] dunno what the failure scenarios are [10:47:48] volans: on the "or we can start adding all the bits to puppet's hiera and site.pp" - had a consultation yesterday with bblack and he indicated some bits we could start working on... I'll share the tracking google sheet with you all in a few :) [10:48:50] mmandere: ack, thanks [10:50:56] 👍 shared [10:53:07] mmandere: we should look at the New sheet? [10:53:12] and not OLD_IGNORE, correct? [10:53:51] That's right... sorry didn't mention that [10:54:09] no prob, just to avoid confusion :) [10:54:18] Hidden it to avoid confusion [10:55:33] We got all instances of `eqsin` in our production code and used that as reference/guide to help add drmrs [10:55:43] nice [10:55:56] mmandere: in theory all the "ip block" should be good to go [10:55:59] learning loads just going through the items on that sheet thanks :) [10:56:38] We also used reviews from the initial bulk patch to help add safe, non-dependent patches to the new sheet [10:57:30] XioNoX: that's right, "ip block" should be good to go [10:57:57] "First pass" should be a priority according to bblack: [10:59:32] volans: do you know if the insetup role has dependencies on those files? [11:02:38] they need just base production and base firewall [11:02:44] so most of them will not be needed [11:03:07] I guess we need to check global hiera for the dc and install_server and similar [11:06:29] * volans patching the sre.hosts.ipmi-password-reset cookbook to support this use case [11:20:15] ^^^ https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/735360 [11:24:28] volans: thx :) [11:28:18] volans: spicerack.netbox_server(host.split('.')[0]).mgmt_fqdn is a "ready" to use function I'd guess? [11:28:30] I mean netbox_server().mgmt_fqdn [11:32:55] XioNoX: https://doc.wikimedia.org/spicerack/master/api/spicerack.netbox.html#spicerack.netbox.NetboxServer.mgmt_fqdn [11:33:04] not sure what you mean by ready to use D [11:33:04] :D [11:34:21] yep, that :) [11:40:54] the patch works, the IPMI is not working I'll troubleshoot after lunch [11:43:18] ok thanks [12:38:50] morning :) [12:43:16] mmandere: let me know when you're ready, and we can start working through puppet stuff to get the first hosts up [12:46:22] bblack: morning :) ack! 
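On the spicerack question just above: per the linked documentation, netbox_server() returns a NetboxServer object whose mgmt_fqdn property carries the management FQDN, so it is indeed ready to use from a cookbook. A small hedged sketch (the cookbook boilerplate around it is assumed, not shown in the log):

    # Hedged sketch: resolve a host's mgmt FQDN via Spicerack, following
    # spicerack.netbox.NetboxServer.mgmt_fqdn from the docs linked above.
    def mgmt_fqdn_for(spicerack, fqdn):
        """Return e.g. 'cp5011.mgmt.eqsin.wmnet' given 'cp5011.eqsin.wmnet'."""
        # netbox_server() takes the short hostname, hence the split
        server = spicerack.netbox_server(fqdn.split('.')[0])
        return server.mgmt_fqdn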
[12:47:01] ok, so: let's run down the first-pass parts of the sheet one by one and build up some commits? [12:48:50] Understood... currently working on `common.yaml` [12:49:05] ok [12:49:16] I'm just gonna run down the list from the order it's in a bit [12:49:31] I removed the "first pass" comment from the first line btw. I realized we don't really need that part yet. [12:49:33] Wasn't sure on how we wanted to categorize instances in `conftool-data/node` [12:49:42] so yeah hieradata/common.yaml is next [12:50:13] Got it... just seen it :P [12:50:54] main things there are adding to the "datacenters:" list, [12:51:18] there's a strange comment above the wikimedia_clusters part that I'm not sure what it means, about not adding new clusters per-site [12:51:27] but I think, probably, we want to add our new site to the ones that have eqsin [12:51:55] we should skip the prometheus node list for now, we don't have a node yet [12:52:38] and ntp_peers is there as well, we can add to that list [12:53:17] unfortunately there's a lot in this one file, and not all of it can go in the first pass :) [12:53:53] maybe we should have some convention for bits we have to intentionally skip over, in files we already looked at like this one [12:53:54] Working on on the long list of cache hosts [12:54:11] maybe leave a comment in those parts like "# DRMRS-TODO" so we can find them later [12:54:28] mmandere: ok [12:55:26] the bastion_hosts list is an example: we don't have a bast yet (we have to have ganeti clusters first), so we can put a todo mark there for later [12:55:48] bblack: understood, we can still have this tracked in our sheet as unfinished for a later visit [12:56:14] yeah, I just don't want things we've seen within this file to get forgotten later. the comment will make it easy to grep for them later and remember. [12:57:44] I see your point... understood 👍 [12:58:05] [aside, the way the commits end up looking for setting up a new site really makes my brain scream to refactor things to be more datacenter-centric where a lot of these disparate changes would all be in per-dc hieradata. In practice, that's probably not the best choice so long as new datacenters are so few and far between though, other practical factoring angles matter more most of the time) [12:59:38] anyways, when you get to a good stopping point, push up a patch for the common.yaml and we can refine from there [13:00:00] this is probably the most-complicated single file, a lot of the other line items in the list will go quicker :) [13:15:28] (in the meantime, I'm looking at some generic cleanup commits where they simplify our upcoming drmrs bits) [13:20:56] XioNoX: so I picked a random host, and remote IPMI was disabled, I've enabled it via racadm [13:21:05] XioNoX: on the bird BGP-to-router stuff (for various edge dns hosts that use that stack), do the switches have the config/capability to point bird at them from our hosts and advertise service IPs? [13:21:26] bblack: not right now but we will get to it [13:22:00] ok. it's on our list to configure the puppet end of that, I guess I'll put in the primary IPs of the switches rather than the non-existent routers :) [13:22:03] fwiw on that point I tested the "dynamic neighbors" thing. It works fine, but won't for us I think as we have different peer-ASNs in the same subnet [13:22:07] well, the loopback IP or something [13:22:13] i.e. for LVS, Kubernetes, DNS etc. [13:22:32] The gateway IP on each subnet is what you can peer with - so generally first usable IP. 
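As an aside on the # DRMRS-TODO convention agreed above: the point of a fixed marker string is that the deferred spots stay easy to sweep for later, for example from the root of the puppet repo:

    # list every deferred drmrs item left in the tree
    grep -rn 'DRMRS-TODO' hieradata/ manifests/ modules/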
[13:22:50] yeah, better to use the gateway IP than loopback to not have to enable multihop [13:22:55] ok [13:22:59] got it [13:24:05] we're probably going to bring up dns600[12] as our first examples, so focusing on the puppetization bits that actually affect those first. [13:29:00] [if the BGP doesn't work on the switch end at first that's fine of course] [13:35:09] np.... the BGP config can be added quickly enough. [13:35:11] Note that we aren't announcing any of the public aggregates from drmrs yet either, so even if BGP is up to swtich IP won't be routable from the internet. [13:35:26] again that's not a massive piece of work to put in place when we need to [13:37:58] yeah mostly this is about the internal recdns part, the bgp anycast for 10.3.0.1 [13:38:11] (for these first cases) [13:38:29] bblack: we have our first patch for review here https://gerrit.wikimedia.org/r/c/operations/puppet/+/735389 [13:38:40] ok [13:39:23] checking it out, take me a couple mins! [13:39:53] exciting! [13:39:55] :) [13:40:31] yey [13:41:13] yayay! [13:44:57] mmandere: nice work! I'll just drop commentary here on fixups/additions, as it will be simpler than cycling through notes in gerrit! [13:45:40] bblack: ack! [13:46:01] 1) The #DRMRS-TODO you have on the wikimedia_clusters list for bastion and prometheus sections - I think those don't need TODO, we can just define them now and not break anything. [13:47:19] [for now just work down these in your local copy, and after a few we can re-upload and re-check again!] [13:47:57] 👍 on it [13:48:36] 2) there's an authdns_servers key in this file. we should add dns600[12].wikimedia.org there, but in this case, keep them commented out with DRMRS-TODO for now (as they will affect the live authdns-update script, etc) [13:49:24] 3) Somewhere further down there's a bastion_hosts list of IPs, that starts: [13:49:27] bastion_hosts: [13:49:30] - 208.80.155.110 # bast1003.wikimedia.org [13:49:32] - 2620:0:861:4:208:80:155:110 # bast1003.wikimedia.org [13:49:49] we don't even have the IP assigned for ours yet, but maybe just add a generic # DRMRS-TODO at the bottom so we don't forget this later [13:50:32] XioNoX: or topranks: I assume the internal asn for drmrs is 65006 ? [13:50:54] yep [13:51:15] 4) near the bottom of common.yaml, there's an "asns:" key, we need to add drmrs there with 65006 [13:51:39] actually, the ToR switches might use something different, but you can set that for now, we will update it if needed [13:51:47] ok [13:52:18] mmandere: I think, with those 4 things, the overall patch should be good to go, for the phase of the process we're at. [13:52:34] bblack: ack [13:57:33] and then, up to you: we could add some other other smaller changes from the other checklists and keep growing this commit, or we can move on and merge this one and keep the next one separate [14:02:21] 🙈 I'll take the second option; merging this first [14:02:44] mmandere: ps2 looks good, go ahead and merge up this one and let it start rolling out naturally while we look at the next stuff [14:02:53] I see that we have a entry for cache_canary in hieradata/common.yaml - that can probably be removed (not necessarily now!) 
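Putting the review points above together with the datacenters list discussed earlier, the drmrs-related additions to hieradata/common.yaml look roughly like the sketch below. This is only an illustration assembled from the discussion, not the merged diff; the exact shape of keys such as authdns_servers and asns is assumed, and the commented IPs are placeholders since none were assigned yet.

    # sketch only, assembled from the review comments above
    datacenters: [eqiad, codfw, esams, ulsfo, eqsin, drmrs]

    authdns_servers:
      # ...existing entries...
      # DRMRS-TODO: enable once dns600[12] exist (this list feeds authdns-update)
      # dns6001.wikimedia.org: <no IP assigned yet>
      # dns6002.wikimedia.org: <no IP assigned yet>

    bastion_hosts:
      - 208.80.155.110               # bast1003.wikimedia.org
      - 2620:0:861:4:208:80:155:110  # bast1003.wikimedia.org
      # DRMRS-TODO: add the drmrs bastion once the ganeti cluster exists

    asns:
      # ...existing sites...
      drmrs: 65006                   # internal ASN per the discussion above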
[14:03:15] ema: ack thanks, I'll kill it (I've been trying to find things like this and remove them as we go) [14:04:57] ema: found this one earlier too :) https://gerrit.wikimedia.org/r/c/operations/puppet/+/735382 [14:05:03] bblack: should I go ahead and remove the cache_canary entry ema has pointed out? [14:05:27] mmandere: no, I'll do that in a separate cleanup commit, it doesn't directly affect drmrs [14:05:31] very nice, this feels like spring cleaning [14:06:28] we need a system that randomly drops hiera settings and checks if there are changes at all in the catalog :) [14:07:00] yeah I've been pre-gaming some of the other files we're touching soon, and there's some questionable entries here and there heh [14:07:57] one confusing one to sort out was that some sites define mail_smarthost and wikimail_smarthost in hieradata/eqsin.yaml (or whatever per-site), and some don't, and there is a default, and they all have the same two entries regardless. [14:08:35] the trick is the default has them ordered as eqiad,codfw, and the overridden ones have them in the opposite order, based on which dcs are eqiad-side or codfw-side. So it's actually as-it-should-be, just not obvious at first glance! [14:08:41] bblack, ema: we have the changes merged :) on to the next one :P [14:09:17] mmandere++ [14:09:18] mmandere: the next few are pretty small, so we can do a few in one patch I think [14:09:39] for the monitoring.yaml one: [14:09:49] bblack: understood [14:10:26] under the Network section, we need entries for the two switches: asw1-b12-drmrs and asw1-b13-drmrs (basically just like eqsin, but there's two and the naming is slightly different) [14:10:59] and at the bottom of the file, under mgmt_parents section, one for drmrs using mr1-drmrs [14:11:04] I think that's it in that file [14:12:17] and then the next is hieradata/drmrs.yaml : [14:12:43] FYI I'm going to run racadm on all of them to set remote IPMI enabled, it was not set apparently [14:12:48] for this one, copy from esams instead of eqsin as a starting point [14:12:51] volans: ack [14:13:08] I was planning to also update the ipmi/mgmt password [14:13:18] but let me know if you think it's easier with the default one for now [14:13:44] volans: probably easier to have the standard one, so that once puppet stuff is ready we can image with the normal process [14:14:32] the mgmt password is asked of the user interactively, so no need to adapt that anyway [14:14:57] but yeah I'll do that too [14:15:07] yeah but I don't even know what the default one is, so "normal process" is using the one out of the pws repo for me :) [14:15:16] ack [14:16:19] mmandere: for hieradata/drmrs.yaml , the one thing that's tricky there is we need to change that prometheus_node value to the right one. And I'm guessing that will break something on the first nodes we bring up since it doesn't even exist in netbox/dns yet. For now I guess just set it correctly (prometheus6001), and we'll deal with it a little later. [14:17:16] (there's a lot of strange chicken-and-egg issues like this when bringing up a new site. In all likelihood it will take a few passes until even the first set of hosts are all working correctly, especially with infra hosts like these being within the local ganeti cluster that doesn't exist yet) [14:17:59] we might just have to go ahead and define the DNS entries in netbox for that node, actually, since I think the puppetmaster will look up the IP [14:18:09] bblack: ack... working on the `hieradata/drmrs.yaml` [14:21:15] Done with the two [14:21:26] ok [14:21:46] above ... 
now the next step looks like we are going to have to add a drmrs directory in hieradata [14:21:47] then there's the next first-pass which is the hieradata/drmrs/profile/bird.yaml [14:21:57] (yes, have to add two levels of directory there) [14:22:14] the two entries there need to be the gateway IPs of our two switches in drmrs for now [14:22:16] ok.. replicating `eqsin` dir [14:22:21] so basically: [14:22:46] 10.136.0.1 # asw1-b12-drmrs gateway [14:22:52] 10.136.1.1 # asw1-b13-drmrs gateway [14:24:08] got it [14:24:39] and then lastly we can stuff the manifests/realm.pp change in here [14:24:53] basically we need to add regexes to that set near the top, for the public+private drmrs networks [14:25:49] so the public network is 185.15.58 [14:26:29] and the privates are all within 10.136 [14:26:49] so it's basically just like the regexes for eqsin, but with those sets of numbers instead of the eqsin sets of numbers [14:28:44] oooh got lost a little ... was comparing the IPs with netbox. Working on it now [14:29:54] yeah these entries for realm.pp don't have to map to a particular real vlan prefix. it's more like just the broad addressing plan. Those entries are used by puppet to say "if the host's IP address matches this regex, then it exists in site X", they're not used as ACLs or anything like that. [14:31:36] anyways, this is a good stopping point to cut and review a patch, and then next we can move on to the site.pp/dhcp/subnets/macaddrs/etc stuff [14:32:33] and the other chunk is the NTP-related bits. Maybe we can knock those out first actually, they're a little simpler and separate as well. [14:35:50] inside eqsin there was another directory 'hieradata/eqsin/profile/systemd/timesyncd.yaml' [14:35:58] changed that as well [14:37:10] now working on `manifests/realm.pp` [14:37:50] yeah the timesyncd part is NTP-related, but it can go here as well, either way [14:37:53] bblack: or should I first push these changes [14:41:03] doesn't matter, either way is fine [14:41:51] just seen your suggestion... committing and will push shortly [14:43:42] in the meantime, I think something from our first patch temporarily broke icinga config. it's kind of normal to run into things like this during this process :) [14:44:05] more chickens and eggs [14:47:05] remote IPMI done on all, running the sre.hosts.ipmi-password-reset cookbook [14:47:54] bblack: here's the new add https://gerrit.wikimedia.org/r/c/operations/puppet/+/735399 [14:48:39] just a sec, pushing up a fixup for the broken icinga [14:49:28] mmandere: https://gerrit.wikimedia.org/r/c/operations/puppet/+/735401/1 [14:49:28] no problem [14:50:16] ^ this is what I ended up debugging it as. Even though that 'datacenters' key seems like a very low-level abstract thing... in practice it seems to be used to define a bunch of icinga checks for things that don't exist yet (like monitoring text-lb.drmrs.wikimedia.org), so it breaks our monitoring config, so we'll have to comment it back out for now [14:51:12] pushing that now [14:51:49] ok [14:52:44] mmandere: on your new patch [14:52:46] lurking here is a really great way to learn how all this stuff fits together I gotta say :) [14:54:03] 1) For the drmrs.yaml file - delete the bottom half, everything after the public_tls_unified_cert_vendor line. Some datacenters need these mail host definitions, and some use the default. Basically eqsin+ulsfo+codfw need that config, but eqiad, esams, and drmrs use the default. 
[14:54:30] (the default uses eqiad before codfw in the order of the list, and the non-default ones prefer codfw because they're on that side of the network) [14:55:12] topranks: I see you peeping :P ... indeed lots to learn [14:55:25] bblack: got it making the changes [14:55:50] 2) for the bird.yaml - we don't have a cr[12]-drmrs, because of supply chain shortages/whatever. The two entries here have to be: [14:55:59] 14:22 < bblack> 10.136.0.1 # asw1-b12-drmrs gateway [14:55:59] 14:22 < bblack> 10.136.1.1 # asw1-b13-drmrs gateway [14:57:04] ^^ that will remain the case even when the CRs are delivered actually. But yep. [14:57:33] 3) for the timesyncd.yaml part, the bottom two servers should be dns1001 + dns1002, rather than the 2001/2002 that you have there now. (This is an other example of configuration that depends on which "side of the world" you're on - drmrs is more like esams, not eqsin, in that it's closer to eqiad than to codfw) [14:57:58] rest looks fine! [14:58:36] topranks: oh right, I guess we'll still have bgp per ToR switch with this new layout, even when we do have routers, right? [14:58:54] Yeah exactly it's layer-3 all the way. [15:06:41] mgmt passowrd set to our standard one [15:07:25] mmandere: and I guess the other part is, we need to set up prometheus6001.drmrs.wmnet in netbox and get it an IP in DNS, even though we can't bring it up yet (ganeti dependency). [15:07:37] so that the puppet lookups of that hostname don't fail [15:07:57] bblack: prometheus6001.drmrs.wmnet will be a VM? [15:08:27] can't we set prometheus in eqiad for now? [15:09:28] can we? [15:09:51] ask the experts :-P [15:10:07] I wasn't sure if that would screw things up, if we send some of the first prom metrics from drmrs straight to eqiad or whatever [15:10:10] not sure about side effects [15:10:20] creating metrics on the wrong prometheus [15:10:25] yeah [15:10:27] maybe we could send them to failoid [15:10:33] true [15:10:38] also [15:10:42] I don't want to deal with defining the ganeti cluster at the netbox level yet anyways [15:10:50] prometheus is polling [15:10:54] so what should fail? [15:10:58] I was going to just define an IP, incorrectly, for the name just to make DNS lookup succeed [15:11:18] ok, go ahead if you think that will help, hacky but easy to revert [15:11:22] volans: part of the puppetiation wants to do a runtime lookup (for catalog compile) on the prometheus hostname [15:12:06] and how it uses it? [15:12:08] but yeah, maybe we can just mark it TODO and put some other junk hostname there that doesn't do prometheus [15:12:21] volans: honestly, I have no idea [15:12:32] in the end it's just a CNAME but you could put a static IP in the dns repo [15:12:41] even without touching netbox if you want to trick it [15:13:22] hmmm true [15:13:35] the prometheus_nodes hiera is more concerning to me [15:13:43] we left that one out for now [15:13:53] ah ok must be looking at previous PS [15:13:58] my bad [15:17:22] bblack: I have the new changes here https://gerrit.wikimedia.org/r/c/operations/puppet/+/735399 [15:18:03] mmandere: I think that looks good. Based on the conversation above, I'm gonna push a temporary hack for the prometheus6001 hostname to exist (but not as anything useful) before we merge that. [15:18:28] ok [15:18:38] since this new patch references it [15:18:44] https://gerrit.wikimedia.org/r/c/operations/dns/+/735405 [15:18:52] one thing strikes me - and I could be completely wrong - but how are the profile::bird::neighbors_list IPs used? 
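For reference, the site-level file being described would look roughly like this sketch; the key name comes from the profile::bird::neighbors_list lookup mentioned in the question above, and the values are the two switch gateway IPs quoted earlier:

    # hieradata/drmrs/profile/bird.yaml (sketch, not the merged change)
    profile::bird::neighbors_list:
      - 10.136.0.1   # asw1-b12-drmrs gateway
      - 10.136.1.1   # asw1-b13-drmrs gateway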
[15:19:24] We have a situation that a server/VM in rack b12 should peer with 10.136.0.1, and a server in b13 should peer with 10.136.1.1 [15:19:41] yeah, maybe [15:19:52] I think, for single-hop, that might be how it *should* be [15:19:58] as previously everything at a given site peered with the same CRs I'm not sure the templates consuming them will know what to do. [15:20:07] yeah I'm not sure either [15:20:22] we could have a per server definition, but it's not ideal [15:20:22] but since it's heiradata, we can probably fix it with per-host hieradata files for the affected hosts [15:20:28] We won't configure "multi-hop" for the BGP sessions on the switches anyway. [15:20:46] So if an end-server ends up trying to peer with both it should fail to the non-adjacent one. [15:20:55] right [15:21:06] for now this will "work", it will just generate some noise from the failing session on each [15:21:20] Yes I expect so. [15:21:41] the affected cases (dns600[12] and wikidough), we can switch to per-host hieradata for now I guess [15:22:33] In a world where everywhere was like this we could also just change Bird template to do some arithmetic on it's own configured IP address/netmask [15:22:48] Setting the peering always to the first IP in the subnet. [15:23:23] yeah, or for the switch/vlan/etc info to be part of the data nodes know about themselves [15:23:35] (in hieradata terms) [15:23:40] like we do for sites now [15:23:46] yeah exactly. [15:24:13] in any case though, the bird-based usecases are pretty limited and not expected to grow. possibly shrink over time, once we have a new-L4LB design in place later. [15:25:51] I think wikidough will eventually move to new-L4LB anyways. The anycasts for authdns, recdns, and syslog, either will remain as the only low-level-enough things to use the bird-based mechanism, or depending on how design issues work out, maybe one or more eventually moves to new-L4LB as well. [15:26:15] but we'll probably have to solve the same issues for the new-L4LB, differently [15:26:21] syslog could go to L4LB as well [15:26:29] and L4LB could handle public anycast as well [15:26:44] yeah we have the same kind of calculation for the L4LB probably, assuming it does BGP. [15:26:47] I think only DNS is a special case [15:27:48] topranks: yeah it will, it will just be a different kind of deployment model than the existing pybal/lvs [15:28:00] but ultimately, the same L3 to ToR issues will apply [15:28:31] yeah, shouldn't be difficult to come up with a good solution I think. We'll cross that bridge when we get to it. [15:28:47] XioNoX: technically, I think DNS can eventually do it too. The main reasons we don't now are because of dependency loops and weird things like that. I think we can eventually eliminate those problems. [15:29:21] but it will take time. bird will exist for at least some of the current usecases for a couple more years at least :) [15:29:35] it does the job :) [15:29:38] mmandere: ok I pushed the DNS hack in https://gerrit.wikimedia.org/r/c/operations/dns/+/735405 [15:30:37] bblack: great [15:30:40] mmandere: https://gerrit.wikimedia.org/r/c/operations/puppet/+/735399 is good to go if you want to merge up! [15:31:06] mmandere: and I imagine once again our lack of timezone overlap is getting in the way, it's getting pretty late for you by now? [15:31:28] whenever you're ready to log off for the day just say so, we can pick up again tomorrow or whatever. 
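A sketch of the per-host hieradata workaround discussed above, so each host only configures the session to its adjacent switch; which host sits in which rack is an assumption here, as is the usual hieradata/hosts/ layout:

    # hieradata/hosts/dns6001.yaml (sketch; assumes dns6001 is in rack B12)
    profile::bird::neighbors_list:
      - 10.136.0.1   # asw1-b12-drmrs gateway
    ---
    # hieradata/hosts/dns6002.yaml (sketch; assumes dns6002 is in rack B13)
    profile::bird::neighbors_list:
      - 10.136.1.1   # asw1-b13-drmrs gateway

The longer-term idea floated below, defaulting to the host's own gateway when no neighbor is defined, would remove the need for per-host overrides like these.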
[15:31:41] :D :D I am going to merge it now [15:31:44] ok [15:31:59] bblack: no problem I will let you know [15:37:45] bblack: done merging [15:37:57] looking at realm.pp [15:37:58] great job team! [15:38:30] :) thank you sukhe [15:40:01] mmandere: looking good :) [15:40:10] realm.pp, I think I had some notes on it above: [15:40:30] 14:24 < bblack> basically we need to add regexes to that set near the top, for the public+private drmrs networks [15:40:33] 14:25 < bblack> so the public network is 185.15.58 [15:40:35] 14:26 < bblack> and the privates are all within 10.136 [15:40:38] 14:26 < bblack> so it's basically just like the regexes for eqsin, but with those sets of numbers instead of the eqsin sets of numbers [15:40:42] ^ basically that - copy the eqsin lines, change the numbers in the two regexes [15:41:14] bblack: got it [15:44:33] topranks: not that I really love our network regex in realm.pp, but assuming we keep that mechanism, we could do the same thing for ToR BGP too (have per-rack/vlan regexes like that, which deduce the ToR gateway IP for puppet in the general case) [15:44:46] or do something fancier so we're not duplicating information netbox has access to, of course :) [15:45:40] we're probably due for some kind of push towards more netbox-driven inputs able to be consumed by puppet [15:46:42] 100% with you on that last point [15:47:04] But that said, regexes work; it might be a straightforward enough way to do it, at least short term. [15:48:18] yeah, or "if BGP peer not defined but BGP enabled, peer with the default gateway" kind of thing [15:49:35] bblack: missed your note on `realm.pp` :( could have added to the last merge... [15:49:38] "default_routes" is already a facter fact [15:52:13] mmandere: it's ok, we can just push that as a one-off [15:52:52] XioNoX: oh good point! we could definitely change the bird puppetization to default to the def gw if the puppet hieradata is undefined. [15:52:57] ok [15:53:10] that seems like a sane path [15:55:16] +1 [15:57:13] bblack: here's the small change https://gerrit.wikimedia.org/r/c/operations/puppet/+/735409 :P [16:02:30] mmandere: sorry had a phone call I had to take, looks good, ship it! :) [16:03:29] mmandere: if you're ready to cut out, we can stop there. Or if you want to go a little further, the next-simplest bit is to re-do the NTP patch you did last week that I reverted (with one small change, and the dns stuff that exists now, it should work now) [16:04:18] bblack: it's ok.. 
let me merge it [16:05:07] Nice one, we can hit the NTP as well, I must say it's been troubling me 🙈 [16:08:24] ok [16:09:05] mmandere: as a starting point, you can re-do the same NTP patch you had before: https://gerrit.wikimedia.org/r/c/operations/puppet/+/732954/2/modules/profile/manifests/ntp.pp#6 [16:09:31] the only new change we need on top of that, is on these lines further down in the file: [16:09:34] eqiad => 'tos maxclock 14 minsane 2 orphan 12', [16:09:36] bblack: https://gerrit.wikimedia.org/r/c/operations/puppet/+/735410 [16:09:39] codfw => 'tos maxclock 14 minsane 2 orphan 12', [16:09:55] (sorry, bad paste - but those two lines): change "maxclock 14" to "maxclock 16" [16:10:37] ok understood [16:16:45] the only other issue with that patch the first time, was we didn't have the ntp_peers hostnames for the site define, but we already fixed that earlier in your common.yaml commit at: https://gerrit.wikimedia.org/r/c/operations/puppet/+/735389/3/hieradata/common.yaml#1189 [16:17:04] (and they weren't in DNS back then either, but now they are via netbox) [16:17:40] correct... was checking your mention here as well https://phabricator.wikimedia.org/P17587 [16:18:09] the maxclock part isn't worth trying to understand. Our NTP network design is not great, and we'll fix it all globally at a later time and not need crazy parameters like that :) [16:19:55] XioNoX: yeah that's kinda what I was thinking too. re: ipv6 - do we establish separate v6 bgp sessions to advertise v6 service IPs, or do we just send v6 service IPs over the ipv4 bgp session? If the latter, we don't have to worry about v6 at this level I don't think. [16:20:22] but I think I remember something about v6 needing to be over v6, I don't remember for sure though [16:21:18] mmandere: yeah looks mergeable, let's deploy it! [16:22:01] bblack: on it [16:22:55] what that leaves now, from the "first pass" bits we haven't touched, is basically site.pp (defining which nodes puppetize as which roles, probably initially just a couple of hosts like the dns600[12] having the dnsbox role), and all of the install_server bits (which define how DHCP works for installs, and the subnets it operates on, and the DHCP meta-parameters that are sent, etc) - basically [16:23:01] the parts that directly facilitate imaging new hosts via dhcp/tftp. [16:23:33] but that's another complex chunk of work with a few new files and tricky bits, probably best left for another day given how late it is over there :) [16:26:07] ok we can work on them again tomorrow if that's ok with you [16:26:34] yeah sounds good! [16:26:54] I'll keep an eye out for now in case there are any other minor issues resulting from our merges today and see if there's any small fixups, etc [16:27:05] see you tomorrow! :) [16:28:09] Great, I have updated our sheet with the progress as well :) [16:28:17] see you tomorrow :) [16:28:51] nice! see you! [16:52:40] bblack: my preference usually is for separate v6 peering for v6 addressing. [16:52:51] trying to be ready for "v4 switchoff day" [16:53:02] but I'm not 100% sure what we do now let me check. [16:53:06] :) [16:53:17] I like your optimism! :) [16:53:54] haha closer to sarcasm unfortunately :( [16:54:13] I imagine maybe there's a realistic interpretation, that we could aim for an internal v4 switchoff at some point in the foreseeable future. 
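On the NTP change above: the strings being edited are ntpd tos directives, where maxclock caps how many upstream sources ntpd will keep, so raising it from 14 to 16 on the eqiad/codfw lines presumably makes room for the two new drmrs peers. The resulting lines would read roughly like this (a sketch of the two quoted selector entries in modules/profile/manifests/ntp.pp, not the full file):

    # sketch of the edited lines discussed above; surrounding Puppet code omitted
    eqiad => 'tos maxclock 16 minsane 2 orphan 12',   # was maxclock 14
    codfw => 'tos maxclock 16 minsane 2 orphan 12',   # was maxclock 14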
[16:54:27] (for all our private vlans, basically, and all infrastructure-y things) [16:54:41] yes for machine-to-machine traffic internally [16:54:57] where v4 only exists for public subnets for public traffic, and public LVS adverts for v4 public service Ips [16:54:59] and there is a good argument it's easier than managing two network stacks [16:55:06] yeah [16:55:10] but yeah need dual stack for the internet [16:55:28] our dns is fairly good, so in theory nobody notices :) [16:55:52] I imagine by the time v4 can fully sunset on the public Internet, the term "sunset" won't make sense anymore because our sun will have already exploded :P [16:56:07] sounds about right :( [16:57:20] still, we could develop a staged roadmap to phase our way towards mostly-v6 on the inside, and start with just giving v6 primacy in our dual-stack environment. [16:57:37] e.g. having imaging and dhcp/tftp and the "primary" interface/IP, etc... all be v6 [16:57:46] Happy Eyeballs is probably doing the latter for us anyway. [16:57:57] I don't think so, for some of that early imaging stuff, but not sure [16:58:13] and in a lot of places, the ipv4 of a host is what defines it or relationally-keys it to something else. [16:58:33] sorry no not for that internal stuff, I was thinking about internet based read you wrong [16:58:34] the dhcp stuff is definitely all ipv4 based for our netboot installs [16:59:33] risking to make a dumb question and going a bit on the opposite direction, what would be the net gain of a v6-only internal traffic? As opposed to a v4-only internal and public-facing dual stack. [16:59:35] I haven't thought it all through, but I think that's where we'd start: making v6 the primary/necessary thing (netboot, hieradata/database keys, etc) even though we're dual-stack. [16:59:35] yeah 100%. DHCPv6 is a different thing altogether. [17:00:00] volans: because on the real "v4 switchoff day" we don't want to still be running IPv4 because our whole internal network is v4 [17:00:11] volans: it's not a dumb question. in my mind, if we're not going to eventually target v6-only on the inside, we might as well stay v4-only on the inside. [17:00:15] this inbetween state is ewww [17:00:15] but it's a fair point. we're not going to run out of addresses internally. [17:00:24] and v4 is still probably better supported [17:00:26] because of the complexity of the v6 stack and also not so well tested like the v4 one on some cases too [17:00:35] indeed [17:00:49] volans: let's configure IPv6 on the idrac! what could go wrong? [17:00:52] but it is The Future, and we can't just stop doing ipv6 altogether [17:01:18] at $JOB-N they were running out of IPv4 internal space, but we're so far from that [17:01:22] poor idrac yeah first thing that came into my mind too [17:01:27] so it makes sense that we should at least aim in the general direction of giving ipv6 primacy and eventually deprecating/removing internal ipv4 cases for simplicity [17:01:35] don't ask me to automate idrac v6 :D [17:01:36] 10/8 should give us 255 sites. [17:01:45] and tbh no need at all for a /16 per site really [17:01:56] there's other spaces we could eat up, too [17:02:48] someone would have to map out what that all looks like (the internal ipv6 transition phases), which parts are easy, what challenges need real projects to tackle them in what order, etc. [17:03:08] attacking it ad-hoc going after one random thing at a time would be chaos [17:03:17] But to bblack's point, like it or not IPv6 is "the future". Only realistic one. 
We don't want internal IPv4 forever. [19:16:36] Heyas i just powered off ganeti4003 accidentially and powering it back up [19:16:41] so i just caused a large outage [19:16:44] of ganeti ulsfo serivces [19:22:16] Ok, its coming back but now some bgp alarms on ulsfo routers [19:22:24] and im in both racks but im not near the routers in top of racks [19:22:26] so not sure whats up [19:35:03] the VMs all came back and Icinga is now also happy again [19:35:11] that included ncredir4 for a short time [19:35:18] they rebooted [19:35:28] -operations [20:06:00] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10RobH) [20:06:10] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10RobH) a:03RobH [20:24:57] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10RobH) [21:52:54] 10netops, 10DC-Ops, 10Infrastructure-Foundations, 10ops-drmrs: (Need By: TBD) setup/config PDU in drmrs ( ps1-b12 and ps1-b13) - https://phabricator.wikimedia.org/T294597 (10BBlack) I think what we're missing here is the necessary network hardware entries in `modules/netops/manifests/monitoring.pp` to crea... [23:33:16] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10RobH) [23:43:20] hrmm, can you not batch with the new install script? [23:43:28] seems its singular [23:43:59] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo, 10Patch-For-Review: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by robh@cumin1001 for host cp4033.ulsfo.wmnet with OS buster [23:46:45] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10RobH) [23:46:51] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by robh@cumin1001 for host cp4034.ulsfo.wmnet with OS buster [23:50:10] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by robh@cumin1001 for host cp4034.ulsfo.wmnet with OS buster executed with errors: - cp4034 (*... [23:50:21] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by robh@cumin1001 for host cp4033.ulsfo.wmnet with OS buster executed with errors: - cp4033 (*... [23:51:32] 10Traffic, 10DC-Ops, 10SRE, 10ops-ulsfo: Q1:(Need By: TBD) rack/setup/install cp403[3-6].ulsfo.wmnet - https://phabricator.wikimedia.org/T290694 (10RobH) Not sure why these are failing, but I'm out of mental bandwidth for them today. They are remotely accessible via idrac and will accept script commands....
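On the batching question at the end: each bot entry above corresponds to a single sre.hosts.reimage run, so the cookbook is invoked per host and a batch is just a loop around it. A sketch; the exact arguments are assumptions based on the log entries (host and OS), not a verified invocation:

    # hedged sketch, run from a cumin host: reimage the remaining cp hosts one at a time
    for host in cp4033 cp4034 cp4035 cp4036; do
        sudo cookbook sre.hosts.reimage --os buster "$host"
    done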