[06:25:51] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Homer trying to delete BGP peerings for VMs on new Eqiad ganeti nodes - https://phabricator.wikimedia.org/T381175#11885209 (ayounsi) Open→Resolved I think we're all good here, the issue has been tackled in 2 different ways and...
[06:28:22] netops, Infrastructure-Foundations, SRE, Epic: [tracking] Don't keep on the public vlans hosts that don't require it - https://phabricator.wikimedia.org/T317177#11885212 (ayounsi)
[08:02:25] FIRING: SystemdUnitFailed: netbox_ganeti_codfw_test_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:13:45] netbox, Infrastructure-Foundations: Upgrade Netbox to 4.6.x - https://phabricator.wikimedia.org/T371889#11885475 (ayounsi)
[08:17:25] RESOLVED: SystemdUnitFailed: netbox_ganeti_codfw_test_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:47:30] netops, Infrastructure-Foundations, Observability-Metrics, Patch-For-Review: gNMIc: investigate new "collector" command - https://phabricator.wikimedia.org/T416360#11885702 (ayounsi) Open→Resolved a:ayounsi All done.
[09:06:22] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Network telemetry - collect device sub-interface statistics with gnmic - https://phabricator.wikimedia.org/T424683#11885878 (ayounsi) Nice! We can also filter out the `.16386`, `.16384`, `.16385`, `.16383`, `.32769` - weird juniper... a...
[09:12:08] Mail, Infrastructure-Foundations, MediaWiki-Email, MediaWiki-extensions-EmailAuth, and 5 others: Could not send confirmation email: Unknown error in PHP's mail() function. - https://phabricator.wikimedia.org/T383047#11885927 (Elitre82) >>! In T383047#11834654, @TAndic wrote: > Hi all, I just expe...
[10:45:25] SRE-tools, Infrastructure-Foundations, Patch-For-Review: Cookbook for rack depool - https://phabricator.wikimedia.org/T327300#11886199 (ayounsi)
[11:13:03] Packaging, Abstract Wikipedia team, function-evaluator, Infrastructure-Foundations, Abstract Wikipedia Fix-It tasks: Package rustc from forky for wikimedia-bookworm so we can use it in an image like abstractwiki-rust - https://phabricator.wikimedia.org/T425341 (Jdforrester-WMF) NEW
[11:13:40] Packaging, Abstract Wikipedia team, function-evaluator, Infrastructure-Foundations, Abstract Wikipedia Fix-It tasks: Package rustc from forky for wikimedia-bookworm so we can use it in an image like abstractwiki-rust - https://phabricator.wikimedia.org/T425341#11886291 (Jdforrester-WMF) a:...
[12:59:10] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11886556 (ayounsi) > suggest corebgp-- for it I suggest `core1` instead of `corebgp` but that lgtm! For v4 I'd have thought a /31 for a vlan used only between 2 CRs. So if we add anoth...
[14:01:37] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886785 (SLyngshede-WMF) Minor error in command, should have been: ` $ ssh cumin1003.eqiad.wmnet $ sudo cookbook sre.dns.admin depool ulsfo -t T408892 -r...
[14:03:14] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886801 (SLyngshede-WMF) Depooling command output, for the records: ` slyngshede@cumin1003:~$ sudo cookbook sre.dns.admin depool ulsfo -t T408892 -r "New...
[14:25:45] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886863 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=6733bed9-572f-4b81-9a71-76b2217ca3b5) set by pt1979@cumin1003 for 4:00:00 on 4 hos...
[14:30:44] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886897 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=ea06e422-63a1-4feb-89ac-13f0b89b4956) set by pt1979@cumin1003 for 4:00:00 on 5 hos...
[14:37:21] Packaging, Abstract Wikipedia team, function-evaluator, Infrastructure-Foundations, Abstract Wikipedia Fix-It tasks: Package rustc from forky for wikimedia-bookworm so we can use it in an image like abstractwiki-rust - https://phabricator.wikimedia.org/T425341#11886919 (LSobanski) p:Triage...
[15:33:39] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11887131 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=241a7848-479d-48b2-8824-9a08c17249ab) set by ayounsi@cumin1003 for 20:00:00 on 39...
[15:35:25] FIRING: SystemdUnitFailed: netbox_ganeti_ulsfo02_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:32:55] SRE-tools, Infrastructure-Foundations, Spicerack, Patch-Needs-Improvement: Some SAL log entries (e.g. switchdc, scap backport) are getting cut off because long lines are being split over IRC - https://phabricator.wikimedia.org/T285709#11887352 (A_smart_kitten)
[19:35:40] FIRING: SystemdUnitFailed: netbox_ganeti_ulsfo02_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:52:36] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11888002 (Papaul)
[23:10:31] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11888032 (Papaul)
[23:35:40] FIRING: SystemdUnitFailed: netbox_ganeti_ulsfo02_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed