[01:28:24] (SystemdUnitFailed) firing: httpbb_kubernetes_mw-api-ext_hourly.service Failed on cumin1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:28:24] (SystemdUnitFailed) resolved: httpbb_kubernetes_mw-api-ext_hourly.service Failed on cumin1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:48:59] AI for Netbox :) https://aolabs-netbox.streamlit.app/
[07:04:25] homer, Infrastructure-Foundations: Replace Capirca with Aerleon - https://phabricator.wikimedia.org/T337082 (ayounsi) Found a few "regressions": ` WARNING:absl:Term allow_ok_icmp6 will not be rendered, as it has icmpv6 match specified but the ACL is of inet address family. WARNING:absl:Term allow_ok_icmp...
[09:31:29] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (BTullis) a:BTullis→Stevemunene @Stevemunene we're no longer going to be the early adopters of OIDC now within the foundation. There are now two other proj...
[09:32:35] SRE-tools, Infrastructure-Foundations, Spicerack: Unrelated DNS diffs shown if decommission and makevm cookbooks run at the same time - https://phabricator.wikimedia.org/T342130 (jbond) >>! In T342130#9024276, @bking wrote: > Was thinking a bit more about this...would it work to do some minimal sanit...
[09:34:46] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (jbond) cc @SLyngshede-WMF who worked on both the netbox and gitlab integrations, as well as the initial idm implementation.
[10:29:20] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 2 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (jbond)
[10:29:32] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 2 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (jbond) Open→In progress p:Triage→Medium
[14:01:54] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 3 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (jbond)
[14:40:42] XioNoX could i get a quick review on https://gerrit.wikimedia.org/r/c/operations/puppet/+/939710/
[14:41:36] jbond: what's the impact? I had a look at the parent change; is -next the next one or the current one? like puppetboard-next
[14:42:04] XioNoX: next is the puppet7 infrastructure
[14:42:19] however once things are tested i'll do the same swap as i did with puppetboard
[14:42:25] ok!
[14:42:29] i.e. move the old one to next just to make sure nothing is missed
[14:43:04] I don't know enough to have a real opinion here, but looks like it's easy to revert and it's not a critical part so +1
[14:43:15] thanks
[14:43:23] ah and it's netbox-next too
[14:43:24] so yeah
[14:43:38] yes planned to test all the reports there first
[15:01:14] XioNoX: are these errors expected https://netbox-next.wikimedia.org/extras/reports/results/4568726/
[15:01:57] jbond: it's netbox-next so data is stale there
[15:02:16] that's what i thought but just wanted to make sure, thanks
[15:10:14] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 3 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (jbond) for puppetdb-api. i have updated netbox-next and tested the following: === Reports * [[ https://netbox-next.wikime...
[15:11:48] XioNoX: topranks: can you check https://phabricator.wikimedia.org/T342214#9028553 and let me know if there is anything else i should test on netbox
[15:13:33] jbond: from the top of my mind and a quick look at the doc that seems all
[15:13:42] cheers
[15:23:31] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 3 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jbond@cumin1001 for host sretest1002.eqiad.wmnet wit...
[15:44:57] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 3 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jbond@cumin1001 for host sretest1002.eqiad.wmnet with OS...
[16:57:54] jbond or slyngs, I've got a VM that's stuck from a failed makevm cookbook (flink-zk1003)...should I just run the decom cookbook and try again?
[17:01:05] I tried to re-run makevm, but it wants to add more DNS records instead of replacing the ones from the failed VM
[17:34:55] hi folks
[17:38:04] (pinged in the other channel)
[19:35:47] running decom cookbook and will try to run makevm again afterwards
[20:39:04] netops, Infrastructure-Foundations, SRE, ops-eqiad: Cabling for Eqiad racke E5-7 and F5-7 - https://phabricator.wikimedia.org/T334231 (cmooney) @Jclark-ctr my apologies for some reason I thought these links had been cabled but seems from T338789 I didn't update the optic type so we need got them...