[07:06:39] reminder, I'm going to reboot cumin1001 shortly [08:57:54] jynus: hey, there's a pending puppet patch you own, can I merge it? [08:58:10] let me see, I may have forgotten to merge [08:58:19] Jcrespo: dbbackups: Copy database check password in anticipation of role move (a2e9a82) [08:58:42] ah yes, it is a labs one, so I forgot to deploy the noop to production [08:58:48] 👍 [08:58:51] can you cancel your process so I can merge or merge? [08:59:08] I'll merge, I have a patch too :), np [08:59:29] done! [09:53:21] <_joe_> jeena: is "gitaly" something related to gitlab? [09:53:28] <_joe_> err I meant jelto [09:53:43] <_joe_> I see it regularly failing prometheus probes [09:54:20] I guess so from the docs: Gitaly provides high-level RPC access to Git repositories. It is used by GitLab to read and write Git data. [09:54:32] also it's quite a funny name for Italians :D [09:59:43] <_joe_> yeah [10:00:14] I was creating a vm, and the dns diff has a weird extra entry on it, regarding openstack [10:00:58] jynus: check with dcaro, he's running the netbox cookbook too [10:01:03] ah! [10:01:24] I think it's a change related to something I pointed out in a CR fwiw [10:01:46] dcaro, I got this: https://phabricator.wikimedia.org/P27916 (in addition to the expected change on my side) [10:02:25] yes it's that one, go ahead, it's related to https://gerrit.wikimedia.org/r/c/operations/dns/+/793003 [10:02:49] _joe_: yes gitaly is a gitlab component.
It's failing on gitlab1003, which is being migrated currently [10:03:11] ok, I was worried it was unexpected and it could cause a networking issue on cloud [10:03:14] will merge now [10:04:47] volans: that script is very nice, BTW [10:05:23] thanks :) [10:05:54] volans: sorry, just saw [10:06:07] jynus: thanks for merging :) [10:06:26] dcaro: no issue happened, just worried something was bad regarding cloud networking or that I could break something [10:06:44] thanks for double checking too ;) [10:06:45] good that I asked [10:08:00] ah, I notice now the "1dev" which I guess means it wasn't the cloud production dns [10:08:47] because I was confused about yoloing a recursor change :-) [10:09:34] it was a typo in netbox's data [10:10:25] ok [11:30:58] one last question- how can I tell a ganeti instance to install bullseye, given that the os option is no longer on a static file? [11:32:23] ah, it seems the old way still works, right? [11:41:30] yeah, these still need the old school DHCP settings at this point [11:42:08] make sure to add the pathprefix setting for bullseye, the DHCP file still defaults to buster [11:42:25] moritzm: too late :-D [11:42:32] but nothing that a reimage cannot fix :-) [12:20:31] dcaro: sorry to bother you again for the same records, but something is not adding up. If you look at https://netbox.wikimedia.org/ipam/ip-addresses/?q=openstack.codfw1dev (ignore the first one), the 2 IPv4 are in public1-b-codfw, but the 3 IPv6 are in public1-c-codfw. Also, why is the :118 one managed differently from the others? [12:23:34] and then in templates/0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa (manual repo) there are 2 manual PTRs for the equivalent IPs in the public1-b-codfw (2620:0:860:2::/64) vlan [13:03:52] volans: looking, though I have little knowledge (especially historical) in that regard [13:06:06] dcaro: where are those IPs attached? they should match the same row.
So at first sight it seems like a mistake made when they got assigned in netbox (2620:0:860:3 vs 2620:0:860:2) [13:06:25] but I'm not sure, hence not yet proposing a fix :) [13:08:19] volans: they are used by cloudservices2004/2005-dev, both in B1 so I'm guessing the public1-b-codfw ones are the correct ones [13:08:38] we recently had to renumber those when moving servers around, there's a task to move them to row-independent addressing [13:09:13] ^ there you go :), thanks taavi! [13:10:41] taavi: that would mean that the IPv6 ones are wrong and should be re-assigned with the correct ones [13:11:32] also I'm not sure about the 3rd IPv6, which doesn't have a matching IPv4, at least in Netbox [13:11:43] volans: definitely possible! though hard to say without being able to see what's actually in netbox [13:12:03] let me screenshot that for you :) [13:13:49] taavi: https://phabricator.wikimedia.org/F35151365 (ignore the first IP, unrelated to the above) [13:14:47] volans: the entry is pretty old (2020-09), and was created by you according to the logs (maybe there's a way to dig it out from there?)
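As an aside, the nibble arithmetic behind that manual-repo zone file name is easy to check with Python's stdlib `ipaddress` (the `reverse_zone` helper below is just an illustration, not an existing tool; both /64s and the addresses are the ones from this discussion):

```python
import ipaddress

def reverse_zone(prefix: str, zone_bits: int) -> str:
    """ip6.arpa zone name covering `prefix`, one nibble per label.
    `zone_bits` must be a multiple of 4."""
    net = ipaddress.ip_network(prefix)
    nibbles = net.network_address.exploded.replace(":", "")
    return ".".join(reversed(nibbles[: zone_bits // 4])) + ".ip6.arpa"

# The manual-repo file covers the whole /48, so both candidate /64s
# live in the same zone file:
print(reverse_zone("2620:0:860:2::/64", 48))  # 0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa

# ...but the PTR names differ in the nibble for the fourth hextet
# (row B's 2 vs the suspect 3), which is why the records looked off:
print(ipaddress.ip_address("2620:0:860:2:208:80:153:118").reverse_pointer)
print(ipaddress.ip_address("2620:0:860:3:208:80:153:118").reverse_pointer)
```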
[13:15:14] it seems it was recursor1.openstack.codfw1dev.wikimediacloud.org for some time [13:15:43] then crusnov changed it to what it currently is [13:16:26] I don't have a memory of it, it might have been part of some automated way to align netbox to reality at some point [13:16:44] maybe [13:17:25] so the v4 addresses are correct [13:19:51] I don't see the IPv6 allocated on the hosts, according to puppetdb [13:19:57] our puppet code doesn't currently assign the v6 service addresses at all [13:20:11] (this seems like a separate bug) [13:20:14] probably why it was not caught [13:20:18] *that's why [13:20:29] yeah, plus no v6 connectivity on the clients (wmcs vms) anyways [13:20:37] so I think we can just fix the subnets on netbox [13:21:08] ok, I'll re-assign the IPv6s on the row B public vlan and fix the DNS repo accordingly [13:21:12] sounds good [13:21:16] perfect, thanks [13:21:28] slightly related: was 208.80.153.118 used as a recursor IP in the past? I see that assigned to the interface on 2005 [13:21:30] as for the 2620:0:860:3:208:80:153:118 ? [13:22:07] ah, the IPv4 [13:22:12] I see, but not assigned on netbox [13:22:22] that's worrying [13:22:32] all used IPs should be tracked in netbox [13:22:50] I suspect that was the previous service ip (before we renumbered them due to the row change), that was just never cleaned up [13:23:35] so I'm tempted to just remove that and call it that [13:23:47] 208.80.153.118 is row D...
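The stale v6 entry is easy to spot once you notice it literally spells the old v4 service IP's decimal octets as hextets. A small sketch of that check (the `v4_mirrored_v6` helper is hypothetical, stdlib `ipaddress` only; the prefixes and addresses are the ones discussed above):

```python
import ipaddress

def v4_mirrored_v6(prefix64: str, v4: str) -> ipaddress.IPv6Address:
    """Spell an IPv4 address's decimal octets as the last four hextets of
    an address under `prefix64` (e.g. 208.80.153.118 -> ...:208:80:153:118).
    This mimics the naming convention seen here, not ::ffff/NAT64 mapping."""
    top = ipaddress.ip_network(prefix64).network_address.exploded.split(":")[:4]
    return ipaddress.IPv6Address(":".join(top + v4.split(".")))

row_b = ipaddress.ip_network("2620:0:860:2::/64")  # public1-b-codfw
addr = v4_mirrored_v6("2620:0:860:2::/64", "208.80.153.118")
print(addr)  # 2620:0:860:2:208:80:153:118

# The netbox entry used 2620:0:860:3:... instead, which falls outside row B:
print(ipaddress.ip_address("2620:0:860:3:208:80:153:118") in row_b)  # False
```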
[13:23:53] to simplify things [13:24:22] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/dns/+/7bd402ab386553e8b57b117f185eee125d18b102%5E%21/#F2 [13:25:06] I think we can nuke from netbox the IPv6 mapping of 208.80.153.118, and probably the IPv4 on the host too [13:25:21] agreed [13:25:27] I'll remove it from the host [13:25:45] ack, thx [13:31:09] patch at https://gerrit.wikimedia.org/r/c/operations/dns/+/793050 [13:31:19] I'm running the cookbook to propagate the changes in netbox [13:54:31] taavi: ^^ if you want to have a look at the CR [13:55:52] volans: lgtm! [13:56:13] I just updated it to fix the AAAA manual records too [13:56:19] not sure which PS you saw :) [13:57:01] last one, great [14:05:14] all done, thanks all [14:06:08] BTW v-olans, looking at that IPv6 ticket again, should have a list for you by end of the week. [14:07:12] inflatador: ack, no hurry, it's not urgent, but if there is no blocker let's just get rid of a bit of tech debt ;) [14:07:31] just make sure v6 works fine :) [14:09:38] Yeah, I'm all for it if it helps. I was the IPv6 champion at one of my former jobs (sadly, didn't convince too many others) ;P [14:10:15] that's sad [14:13:09] slightly related: is there a reason our internal lvs services are still mostly ipv4 only? [14:13:28] yeah, it was just the type of customer we had, they never wanted to do anything (including getting off EOL) [14:14:39] ^^ kind of interested in this question too, I guess LVS can do its magic with NDP on v6?
[14:16:18] <_joe_> taavi: yes [14:16:29] <_joe_> k8s and ipv6 don't love each other [14:16:41] <_joe_> also, personal opinion, we have no reason to do ipv6 internally [14:16:54] <_joe_> inflatador: we do have ipv6 services load-balanced [14:17:27] yeah I don't see much point in trying to move internal LB stuff to ipv6, unless/until we think we actually have a viable and supportable plan to go ipv6-only on the inside [14:17:27] <_joe_> the opinion comes from - I don't want a mess of ipv4/v6 mixed [14:17:35] <_joe_> ^^ [14:18:33] yeah, as long as v6 only clients can use our stuff, it doesn't seem urgent for the backend to use IPv6 [14:19:29] there's a good chance that, in practice, the public internet has to be very nearly fully ipv6 before we start wanting to make the change on the inside [14:20:09] otherwise we're dealing with strange issues at the border with refragmenting mtu=1500 ipv4 traffic into mtu=1280 internal ipv6 services and other such messes. [14:20:22] it's just a lot of pain for little gain at this stage [14:20:43] <_joe_> I think the transition to ipv6 is moving fast for consumer internet [14:20:57] <_joe_> one year ago there was no ISP supporting ipv6 in italy [14:21:04] https://www.google.com/intl/en/ipv6/statistics.html [14:21:07] <_joe_> (always at the forefront of innovation!) [14:21:23] there is progress, but it's still kinda slow [14:21:44] <_joe_> now there are two offering it by default [14:21:55] supposedly (some? all?) of the mobile networks in the US are v6 [14:22:02] <_joe_> bblack: yeah well is that about consumer clients or does it include everything else? 
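The "refragmenting" cost bblack mentions can be put in numbers with a back-of-envelope sketch (header sizes are the standard IPv4/IPv6 ones; the 1500-to-1280 scenario is the hypothetical from the discussion, not a measured production case):

```python
def ipv6_fragments(payload: int, mtu: int = 1280) -> int:
    """Fragments needed when each fragment carries a 40-byte IPv6 header
    plus an 8-byte fragment header, and non-final fragment payloads must
    be multiples of 8 bytes."""
    per_frag = (mtu - 40 - 8) // 8 * 8  # 1232 usable bytes at MTU 1280
    return -(-payload // per_frag)      # ceiling division

# A full 1500-byte IPv4 packet minus its 20-byte header leaves 1480 bytes
# of payload, which no longer fits in one 1280-MTU IPv6 packet:
print(ipv6_fragments(1500 - 20))  # → 2
```

So every border-crossing full-size packet roughly doubles, which is the "lot of pain for little gain" being described.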
[14:22:46] _joe_: it says "google users", I assume it's derived from chrome data or gmail/gsearch consumers' data, or some combination of such things [14:22:59] <_joe_> yeah probably [14:23:09] <_joe_> and the 5% figure in italy seems realistic [14:23:15] <_joe_> although it should increase quickly [14:23:35] I've had ipv6 support from my ISP (AT&T) for like 10-12 years now, and while it has improved in steps occasionally, it's still not good enough that I actually turn it on consistently and leave it on. [14:24:12] every couple of years I try it again, and still run into some weird issues that turn out to be AT&T's router not handling MTU stuff or NAT stuff correctly in a way I can't fix that affects some IOT device and/or my laptop :/ [14:24:32] the Canadian numbers for IPv6 are probably the mobile networks (the graph doesn't seem to make this distinction) as no major Canadian ISP I have been on has supported IPv6. it's pretty bad [14:24:43] I'm overdue to try again soon, it's been ~3 years since last attempt! [14:25:06] that reminds me, I need to hound my ISP, last time I asked they didn't support it [14:25:12] s/NAT/firewall/, whatever [14:26:01] what I really wish is that AT&T would better-support me hooking up my own router straight to their ONT, where I could probably work out getting ipv6 working. 
[14:26:17] <_joe_> ah as we speak ipv6 to eqiad from my isp doesn't work great [14:26:30] but they play games with keyed authorization to get the IP assignment to try to lock out 3rd party routers [14:26:43] yeah I've heard some of the "fun" replay attack-type stuff you have to do for those [14:26:46] and I'm not willing to go deal with that breaking again and working around it again over and over for my whole house [14:27:18] plus at the end of the day, that would just be a geek solution for me, it doesn't help with ipv6 working for everyone on AT&T :P [14:37:52] well, then you illegally resell their services with working IPv6, of course ;) [14:38:30] My ISP has full IPv6, but it's DS-lite, i.e. you lose your public IPv4 if you switch to it. [14:38:40] Which is kind of a shitty choice to have to make [14:39:44] It's DOCSIS so you need their modem, similar lock out to what bblack described. [14:39:53] But they do at least have a 'bridge mode' option so it's not too bad [14:41:15] yeah I use a bridging type of option on my AT&T router too, but it's not really raw L2 or whatever, they're still interfering somewhat. [14:41:19] I think they call it "IP passthrough" [14:41:34] but it never completely disables their interference in the traffic [14:42:11] (their rationale is they also offer TV service over the same fiber, so therefore everyone must use their modem, so that they can always offer the possibility of using a MOCA coax out of it to run your TV set top boxes, which I don't :P) [14:43:44] that reminds me, I need to kick my MOCA stuff, it randomly decided to drop below fast ethernet speeds [14:44:27] but yeah, on the inside, I have an L3 switch and my own wifi routers, and everything about their router/modem that I can disable is disabled. [14:49:40] yeah.. the L3 switch is really useful to power up the APs over PoE [14:53:25] yeah that's what I do.
I didn't buy the HW, it came with the house, but it's all Ruckus-branded stuff [14:54:05] a 12 port gig-E switch with L3 firmware (that I could add 2x10G SFPs to if I wanted), and a pair of their wifi APs mounted in the ceiling and PoE-powered from the switch. [14:55:03] and then for my office (the only wired-ethernet-heavy room really), I bought a little in-wall PoE-powered PoE switch, to give me 4x ports in this room from a single cable run. [14:55:46] this thing: https://www.amazon.com/PoE-Texas-GBT-4-IW-Gigabit-Extender/dp/B07Z59SG17?th=1 [14:56:12] to be clear: it's PoE-powered from the upstream switch, and offers 4x PoE ports of its own [14:56:54] How is Ruckus' stuff? I've been playing around with MikroTik, but always fun to check out random net device vendors ;) [14:57:16] well, I've never been completely happy with *any* wifi or router vendor, ever, so let's start there :) [14:57:44] but all things considered, I've probably had fewer random headaches and BS with this rucks setup than with many others I've tried over the years. It works well, relatively-speaking! :) [14:57:52] s/rucks/ruckus/ [14:58:41] but there are occasionally issues. I've had a few firmware updates I had to live with for a month or two that had bugs, where the APs would crash after ~1-2 weeks uptime for unknown reasons, and I'd have to powercycle them from the poe switch to get them back. [14:58:53] blech [14:59:24] and their wifi config is pretty complex, there's a lot of options you can easily get wrong that result in poor performance around the house, until you find the right magic setup of tunables.
[15:00:01] but overall, really not bad [15:01:44] the witch I have is Ruckus ICX 7150-C12P, and the 2x APs are (now deprecated/replaced by newer models I think) the R510 models [15:01:50] *switch :) [15:03:34] the house came with it all wired up and ready, and it wasn't worth the cost to me to rip it out and try to find my own way, so I just kept it and worked with it [15:03:42] my wifi is pretty vanilla. I used to be a small-time cable operator, still obsessed with running wires, particularly coax. One of my many unhealthy obsessions [15:04:03] it was nice that they pre-ran wiring in the walls to all the main rooms, for ports to the ethernet switch [15:04:57] one runs to that wall-switch in my office, most are unused. but we have a couple TVs that I wired up as well, so that when people are streaming TV it's not chewing up wireless capacity on those bulky streams. [15:07:59] Yeah, I wish I had that...I have a weird old house with a big extension in back. Easy to run cables in the extension, not so much in the old part [15:30:23] I am going to test database backup checks in production to make sure they keep working as expected, expect icinga IRC complaints soon