[08:24:21] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[08:34:21] (ProbeDown) resolved: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[08:39:36] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[08:49:36] (ProbeDown) resolved: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[08:54:51] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[08:59:36] (ProbeDown) resolved: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[09:13:11] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (ayounsi)
[09:13:30] netops, Infrastructure-Foundations, SRE: Create automation to move servers in Netbox from old to new switch - https://phabricator.wikimedia.org/T348129 (ayounsi) Open→Resolved a:cmooney→ayounsi https://netbox.wikimedia.org/extras/scripts/move_server.MoveServersUplinks/ is live!
[09:29:26] volans, topranks, can I get a quick +1 https://gerrit.wikimedia.org/r/c/operations/dns/+/972332 ?
[09:29:47] * topranks looking
[09:29:59] ci disagree
[09:30:27] volans: I've just deleted them from Netbox
[09:30:43] so run the cookbook first :D
[09:31:03] volans: it looks safer though to remove the include first, no?
[09:31:31] https://wikitech.wikimedia.org/wiki/DNS/Netbox#Atomically_deploy_auto-generated_records_and_a_manual_change
[09:33:36] yeah I know, I just wanted the +1 :)
[09:34:11] otherwise if people are not around I have to either self merge, or block all of dns
[09:35:01] :D
[11:11:08] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[11:15:46] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[11:32:30] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[11:37:27] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: Connect IC-374549 - https://phabricator.wikimedia.org/T350504 (cmooney) I added the config to set the port to 100G and bounced the PIC (the other, asw facing, ports on it were already VRRP backup). Light levels inbound look g...
[12:00:39] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[12:23:10] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[12:32:02] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[12:43:53] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[13:09:21] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[13:24:21] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[13:29:21] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[13:29:50] netops, Infrastructure-Foundations, SRE: Netbox PuppetDB Import Script Failing for cloudnet2006 - https://phabricator.wikimedia.org/T350479 (Volans) The code is not checking if the autoselection of the parent is None or not. That said re-running the script now works fine. What was changed in the Netbo...
[13:33:35] netops, Infrastructure-Foundations, SRE: Do we need to generate aggregates for LVS service IP ranges? - https://phabricator.wikimedia.org/T350354 (ayounsi) That predates me so the real reason might be lost or not valid anymore. However I see that they're redistributed in OSPF: `set policy-options po...
[13:44:21] (ProbeDown) resolved: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:11:22] netops, Infrastructure-Foundations, SRE: Do we need to generate aggregates for LVS service IP ranges? - https://phabricator.wikimedia.org/T350354 (BBlack) I don't suspect it serves any real purpose at present, unless it was to avoid some filtering that exists elsewhere to avoid cross-site sharing of...
[14:26:29] CFSSL-PKI, Ganeti, Infrastructure-Foundations: Migrate Ganeti-rapi to use pki - https://phabricator.wikimedia.org/T350686 (jbond) p:Triage→Medium
[14:31:18] CFSSL-PKI, Puppet, Infrastructure-Foundations, SRE, User-jbond: PKI server don't reimage cleanly - https://phabricator.wikimedia.org/T270269 (jbond)
[14:35:46] CFSSL-PKI, Infrastructure-Foundations: cfssl: cfssl signeres should correctly inject default values to profiles - https://phabricator.wikimedia.org/T299562 (jbond) Open→Resolved This work was completed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/702343
[14:39:10] CFSSL-PKI, Infrastructure-Foundations, Observability-Metrics: cfssl: export metricts from sql database - https://phabricator.wikimedia.org/T327768 (jbond)
[14:42:10] CFSSL-PKI, Infrastructure-Foundations: PKI: configure a check for ocsp - https://phabricator.wikimedia.org/T350688 (jbond) p:Triage→Medium
[14:46:46] CFSSL-PKI, Infrastructure-Foundations, Observability-Metrics: PKI: create check for renewal - https://phabricator.wikimedia.org/T350690 (jbond) p:Triage→Medium
[15:34:21] (ProbeDown) firing: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[15:42:52] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[15:49:21] (ProbeDown) resolved: (2) Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[16:11:06] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[16:16:23] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[17:24:33] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: Connect IC-374549 - https://phabricator.wikimedia.org/T350504 (RobH) Open→Resolved >>! In T350504#9312052, @cmooney wrote: > I added the config to set the port to 100G and bounced the PIC (the asw facing ports on it we...
[17:39:41] (SystemdUnitFailed) firing: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:44:41] (SystemdUnitFailed) resolved: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:04:32] * jbond wonders if dcops might be the better target for this^ alert?
[18:04:53] or perhaps both us and them (not sure that's possible)
[18:11:58] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (jbond)
[18:55:11] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (jbond)
[19:01:59] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (jbond)
[19:16:25] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (jbond)
[19:22:19] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 2 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (jbond)
[19:25:09] CAS-SSO, Infrastructure-Foundations: Create OpenID Connect client - https://phabricator.wikimedia.org/T350725 (CCicalese_WMF)
[19:28:13] CAS-SSO, Infrastructure-Foundations: Upgrade Apereo CAS to include PKCE functionality when it becomes available - https://phabricator.wikimedia.org/T350727 (CCicalese_WMF)
[20:37:46] SRE-tools, Data-Persistence, Infrastructure-Foundations, Spicerack, and 2 others: Switch conftool to use the version 3 etcd datastore - https://phabricator.wikimedia.org/T350565 (KOfori)