[00:03:40] (SystemdUnitFailed) firing: (2) ferm.service Failed on aux-k8s-ctrl1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:03:40] (SystemdUnitFailed) firing: (2) ferm.service Failed on aux-k8s-ctrl1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:48:23] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (ayounsi) On the resiliency side, this protects us from a double failure: the cr1-cr2 link to fail as well as a transport link. Low risk but still a risk. I agree...
[08:03:40] (SystemdUnitFailed) firing: (2) ferm.service Failed on aux-k8s-ctrl1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:14:32] (SystemdUnitFailed) firing: (2) ferm.service Failed on aux-k8s-ctrl1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:31:42] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations, Patch-For-Review: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (Gehel)
[08:33:40] (SystemdUnitFailed) resolved: ferm.service Failed on aux-k8s-ctrl1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:22:24] topranks: yay new /24 : 195.200.68.0/24 !
[10:22:56] woot!
[11:17:45] Might as well signup for the next one right away :-)
[12:03:35] netops, Infrastructure-Foundations, Patch-For-Review: Adjust routing policy to increase SSH session speed from East Asia to toolforge - https://phabricator.wikimedia.org/T334530 (ayounsi) @cmooney mentioned me that the previous syntax didn't work, this is because the `as-path-calc-length` term ignore...
[13:07:01] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (Stevemunene) Got some errors from the first test, but they're mostly related to the current setup. Looking into this ` ERROR: exit status 1 EXIT STATUS 1 S...
[14:40:27] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (Stevemunene) Was able to get the deployment to staging done, login redirected to the right SSO page and I was able to enter my login details, however authenticati...
[15:37:56] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations, Patch-For-Review: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (Stevemunene)
[16:02:44] I am getting a duplicate IP from a previous cookbook run
[16:02:44] netbox/wikimedia.org-esams:57 doh3003.wikimedia.org. A 185.15.59.35
[16:02:47] netbox/wikimedia.org-esams:58 doh3003.wikimedia.org. A 185.15.59.37
[16:02:50] netbox/wikimedia.org-esams:59 doh3003.wikimedia.org. AAAA 2a02:ec80:300:2:185:15:59:35
[16:02:53] netbox/wikimedia.org-esams:60 doh3003.wikimedia.org. AAAA 2a02:ec80:300:2:185:15:59:37
[16:02:56] is it just fine to delete the ones on netbox and remove the DNS name?
[16:08:04] removed the IPs from netbox, ran the cookbook, all good