[00:13:39] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[02:53:39] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[04:13:39] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[05:26:52] https://github.com/netbox-community/netbox/releases/tag/v4.3.0-beta1
[05:28:35] No big features for us, but some nice ones, like "Adopt advanced query filtering in GraphQL API to support filtering by custom fields" or "Hierarchical Device Roles"
[06:09:00] netops, Infrastructure-Foundations: Junos: investigate BGP rib sharding - https://phabricator.wikimedia.org/T320264#10742262 (ayounsi) More vulns : https://supportportal.juniper.net/s/article/2025-04-Security-Bulletin-Junos-OS-and-Junos-OS-Evolved-A-specific-CLI-command-will-cause-a-RPD-crash-when-rib-sh...
[06:53:39] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[08:13:39] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[10:04:48] refreshing to see the homer email without 100 lines of capirca expired :D
[10:05:00] sorry for not having fixed this earlier
[10:05:03] yup :)
[10:05:11] haha not your fault!
[10:07:28] XioNoX: not sure if committable, but the diff for 'cr*-ulsfo.wikimedia.org' is the same for 2 devices, might be a good test for the 'all' option :)
[10:08:01] removes metric 100; from ospf and ospf3
[10:08:53] volans: yeah but we want to keep that until https://phabricator.wikimedia.org/T390731#10734723 is fixed
[10:09:11] but we can test the `none` option :)
[10:09:23] true :D
[10:09:26] if you're brave enough
[10:53:39] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[12:13:39] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[12:41:32] netops, Infrastructure-Foundations, Observability-Alerting: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#10743419 (ayounsi) For OSPF it looks like the interface states are there, but not the neighbor states: {P75019} That's subscribing to `/network-ins...
[13:04:42] netops, Infrastructure-Foundations, Observability-Alerting, Patch-For-Review: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#10743565 (ayounsi)
[14:53:39] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[16:13:39] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[16:20:40] netops, Infrastructure-Foundations, SRE: Create alerting for saturation on sub-rated interfaces - https://phabricator.wikimedia.org/T374614#10744550 (cmooney) >>! In T374614#10707267, @cmooney wrote: >>>! In T374614#10147994, @ayounsi wrote: >> Short term I think if you add `[4Gbps]` to the interface...
[18:02:40] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, ops-eqiad: eqiad: second frack parent tracking task - https://phabricator.wikimedia.org/T392006 (RobH) NEW p:Triage→High
[18:04:52] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, ops-eqiad: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007 (RobH) NEW
[18:11:51] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, ops-eqiad: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10744966 (RobH)
[18:14:10] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, ops-eqiad: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10744975 (RobH) @ayounsi & @cmooney: Per our conversation today in our codfw/eqiad buildout meetings, this was brought up and I've created th...
[18:14:35] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, ops-eqiad: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10744979 (RobH) @Jclark-ctr & @VRiley-WMF Per today's meeting, one of the action items was to have an eqiad onsite detrmine how many free cro...
[18:28:48] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, ops-eqiad: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745024 (cmooney) >>! In T392007#10744966, @RobH wrote: > Please detail via comment specifically how using D6 would cause a network imbalance...
[18:30:36] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745029 (RobH)
[18:31:26] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745033 (RobH)
[18:49:57] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: second frack parent tracking task - https://phabricator.wikimedia.org/T392006#10745097 (RobH)
[18:49:59] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Migrate non-fundraising hosts out of eqiad D6 - https://phabricator.wikimedia.org/T390240#10745098 (RobH)
[18:50:28] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: second frack parent tracking task - https://phabricator.wikimedia.org/T392006#10745104 (RobH) Please note I've tied original task T390240 to this for ease of tracking. If rack D6 is not selected (likely wont b...
[18:53:39] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[19:12:26] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745165 (Jclark-ctr) @RobH we have 1 free cross connect circuit id 21996480
[19:13:12] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745171 (Jclark-ctr)
[19:27:13] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745240 (RobH)
[20:03:08] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745411 (Jclark-ctr)
[20:09:26] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10745517 (Jclark-ctr)
[20:13:39] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[20:23:39] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[20:23:39] RESOLVED: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[20:24:34] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[20:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[22:41:25] FIRING: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:46:25] RESOLVED: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed