[07:36:23] netops, Infrastructure-Foundations, procurement, SRE, Patch-For-Review: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10321324 (ayounsi)
[08:12:17] SRE-tools, Infrastructure-Foundations, SRE, IPv6: Enable ipv6 on ganeti2019-ganeti2024 - https://phabricator.wikimedia.org/T379890 (MoritzMuehlenhoff) NEW
[08:13:59] SRE-tools, Infrastructure-Foundations, SRE, IPv6: Enable ipv6 on ganeti2019-ganeti2024 - https://phabricator.wikimedia.org/T379890#10321458 (MoritzMuehlenhoff) p:Triage→Medium
[09:04:47] Netbox users, let me know what you think of https://gerrit.wikimedia.org/r/c/operations/puppet/+/1091187 :)
[09:14:16] Good idea, +1d!
[09:15:36] lgtm, I didn't even know we had that
[09:17:08] it even slightly breaks the UI, given German words like "Konsolenserver-Anschlüsse" tend to be too long :-)
[09:17:21] ahahahha
[09:21:20] SRE-tools, Data-Persistence-SRE, DBA, Infrastructure-Foundations, and 2 others: spicerack mysql_legacy: support fetch metrics for instance - https://phabricator.wikimedia.org/T376596#10321649 (ABran-WMF) >>! In T376596#10205946, @Volans wrote: > Spicerack has support for prometheus, why not getti...
[09:22:38] the French version was slowly driving me crazy
[09:25:28] cool, tested in -next and deploying it now
[09:54:36] SRE-tools, Data-Persistence-SRE, DBA, Infrastructure-Foundations, and 2 others: spicerack mysql_legacy: support fetch metrics for instance - https://phabricator.wikimedia.org/T376596#10321785 (ABran-WMF) >>>! In T376596#10205946, @Volans wrote: >> why not getting the metrics directly from there i...
[10:43:38] I find myself in the uncomfortable situation where Safari yields a better error than Firefox
[10:48:51] lol
[11:54:36] How do we feel about supporting FIDO2 on Safari? I say no, but only because I can't make it work :-)
[12:00:35] netbox, netops, Infrastructure-Foundations: Netbox: librenms report errors - https://phabricator.wikimedia.org/T379907 (ayounsi) NEW
[12:01:32] let's stick with webauthn for now
[12:03:24] Same same
[12:03:43] CAS calls it FIDO2 webauthn
[12:04:01] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Q1:eqiad:frack network upgrade tracking task - https://phabricator.wikimedia.org/T371435#10322139 (cmooney) >>! In T371435#10318507, @RobH wrote: > I'd hand this over to either John or Valerie as ops-eqiad for them to...
[12:04:30] It works in Firefox.
[12:10:48] Never mind, it's a Safari bug
[12:11:19] It's fixed in the Safari Technology Preview and possibly in the latest macOS
[13:29:20] netops, Infrastructure-Foundations, procurement, SRE, Patch-For-Review: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10322386 (RobH)
[14:42:38] elukey: I tested the WIP efi patch last night, and it at least resolves the issue on my test host
[14:50:16] jhathaway: o/ saw it, I commented in the ms-be task and in the patch, IIRC I added that bit originally because of the double d-i :(
[14:50:28] it wasn't there originally
[14:51:18] netops, Infrastructure-Foundations, serviceops, Kubernetes: Reimage one of the wikikube-worker1240 to wikikube-worker1304 node in eqiad as a replacement for wikikube-ctrl1001 - https://phabricator.wikimedia.org/T379790#10322697 (akosiaris) Cool, thanks. In that case, I randomly picked `wikikube-w...
[14:51:30] hmm, well that means my reproduction is flawed, or there was a different bug when you hit the double d-i
[14:52:14] netops, Infrastructure-Foundations, serviceops, Kubernetes: Reimage one of the wikikube-worker1240 to wikikube-worker1304 node in eqiad as a replacement for wikikube-ctrl1001 - https://phabricator.wikimedia.org/T379790#10322699 (akosiaris)
[14:52:18] I am confident that calling redfish to override the boot order during the d-i causes grub's just installed boot settings to be lost
[14:52:32] okok then we can definitely discard that bit of code
[14:53:01] at this point let's merge it and test it with the new thanos-be nodes, same as ms-be2xxx
[14:53:16] if we see double d-i again, then we can investigate
[14:53:33] sounds good, it's been a wild ride :D
[14:54:58] I'll update the patch to add some more context, then ask for a review
[14:55:06] ack!
[14:55:16] lemme ask dcops if the thanos-be nodes are ready
[15:21:13] jhathaway: ahhhh now I get it, the issue is patching the EFI settings while d-i happens
[15:21:23] okok this bit wasn't clear to me, now it makes perfect sense
[15:21:29] nod
[15:22:04] now we could think about using the Hdd setting, if still needed, right before line 554 (self.host_actions.success(f'Host up (new fresh {distro} OS)'))
[15:23:21] yup, then it should work, though future grub-installs will not have any effect, though maybe that doesn't matter?
[15:23:41] s/will not/may not/
[15:28:14] Puppet, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: Puppet removed "nameserver" line from /etc/resolv.conf - https://phabricator.wikimedia.org/T379927 (fnegri) NEW
[15:29:05] Puppet, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: Puppet removed "nameserver" line from /etc/resolv.conf - https://phabricator.wikimedia.org/T379927#10322866 (fnegri) Open→Resolved a:fnegri The issue is resolved, I created this task to track it in case it happens...
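Editor's sketch of the ordering discussed between 14:52 and 15:23: a boot-order override should only be applied once the host is up on the new OS, never while d-i is still running. This is a toy model, not the reimage cookbook. The step names and both functions are hypothetical, and it checks only the ordering of steps, not any real Redfish or EFI behaviour.

```python
# Hypothetical step names; none of these come from the real reimage cookbook.
INSTALL_START = "reboot into d-i"
INSTALL_DONE = "host up on new OS"
BOOT_OVERRIDE = "override boot order (Hdd)"


def plan_reimage(apply_boot_override: bool) -> list[str]:
    """Return the ordered steps, placing any boot override after the install."""
    steps = [
        "set one-time PXE boot",
        INSTALL_START,
        "wait for d-i to finish",
        INSTALL_DONE,
    ]
    if apply_boot_override:
        # Applied only after the installer (and its grub-install) is done.
        steps.append(BOOT_OVERRIDE)
    steps.append("report success")
    return steps


def override_is_safe(steps: list[str]) -> bool:
    """True if the boot override, when present, comes after the host is up."""
    if BOOT_OVERRIDE not in steps:
        return True
    return steps.index(BOOT_OVERRIDE) > steps.index(INSTALL_DONE)
```

A plan that places the override between the installer start and the host-up step fails the check, which corresponds to the ordering that was reported to lose grub's freshly installed boot settings.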
[16:27:51] elukey: happy to try with one of the thanos hosts, let me know your plans
[16:28:45] jhathaway: I am fighting with thanos-be2005 for an unrelated BMC network setting that causes the provision cookbook to fail, sigh
[16:29:03] I already configured the disks etc., so once ready you'll be able to reimage
[16:29:13] no joy with Supermicro
[16:34:49] I don't think Joy and Supermicro are used often together, :D
[16:35:09] but I do like the supermicro logo in ascii on the boot screen :P
[17:00:35] jhathaway: I asked Jenn if it was possible to flip a setting off via mgmt cart, so I guess it will be done later. I'll kick off reimage tomorrow morning EU time and report back
[17:01:01] sounds good, thanks
[17:47:25] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: cr1-eqiad: disk failure - https://phabricator.wikimedia.org/T372781#10323538 (Papaul) Open→Resolved This is done, re0 is now the master. Closing this task ` re0.cr1-eqiad> show chassis routing-engine Routing Engine statu...
[17:48:06] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 23.4R2 - https://phabricator.wikimedia.org/T364092#10323542 (Papaul)
[19:39:39] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 23.4R2 - https://phabricator.wikimedia.org/T364092#10324084 (Papaul)
[20:53:29] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 3 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10324366 (RobH)
[20:58:00] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 3 others: Decom prod infra side of the ulsfo-office link - https://phabricator.wikimedia.org/T379778#10324353 (RobH)
[22:17:25] FIRING: SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:32:25] FIRING: [2x] SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed