[00:24:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [00:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [02:05:25] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [03:15:25] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [04:24:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [04:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [04:57:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [06:19:14] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: Migrate non-fundraising hosts out of eqiad D6 - https://phabricator.wikimedia.org/T390240#10754475 (10ayounsi) a:05ayounsi→03None [06:31:44] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10754484 (10ayounsi) @robh, thanks for that task, well summed up ! One more point is that we need to account for expected overall growth of the... [06:36:03] elukey: thanks a lot for https://phabricator.wikimedia.org/T391925 !!! [06:52:13] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10754495 (10wiki_willy) Hey @ayounsi - after some feedback from my staff meeting earlier today, I reached out to Equinix to see if there's any wa... [06:52:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [07:15:25] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [07:27:13] 10netops, 06Infrastructure-Foundations, 06SRE: Update network SSH keys to ssh-ed25519 - https://phabricator.wikimedia.org/T336769#10754513 (10ayounsi) @BBlack ping ? :) [08:24:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [08:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [08:30:11] 10netops, 06Infrastructure-Foundations, 06SRE: Netbox: Use LAG interface MAC address field to store LACP system-id for MC-LAG - https://phabricator.wikimedia.org/T392056#10754547 (10ayounsi) Some other thoughts : * There is also a netbox plugin, but that doesn't seen like a great way : https://github.com/pv... [08:36:22] 10netops, 06Infrastructure-Foundations, 06SRE: Create alerting for saturation on sub-rated interfaces - https://phabricator.wikimedia.org/T374614#10754568 (10ayounsi) For the record, as mentioned on IRC, if the description parsing doesn't work, we can also set it in the UI. For example in https://librenms.wi... [09:10:25] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [09:40:41] 10netbox, 10ChangeProp, 06cloud-services-team, 06collaboration-services, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10754632 (10taavi) Untagging #Quarry which has an active subtask. [11:50:19] 10homer, 10SRE-tools, 06Infrastructure-Foundations: Homer: add parallelization support - https://phabricator.wikimedia.org/T250415#10754744 (10ayounsi) Indeed ! Very useful ! One risk is that the "all-and-skip" doesn't incite us to maintain a "no outstanding changes" policy. Otherwise maybe the "all-and-sk... [12:24:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [12:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [13:10:25] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [14:32:42] 10Mail, 06Infrastructure-Foundations: Trouble reaching Microsoft email domains - https://phabricator.wikimedia.org/T390307#10754870 (10nisrael) Sounds good, thank you Jesse! [14:49:45] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: eqiad: determine second frack - https://phabricator.wikimedia.org/T392007#10754936 (10RobH) a:05Jclark-ctr→03wiki_willy Assigning from John to Willy pending EQ update on pricing and build out of an additional rack i... [14:50:57] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: Migrate non-fundraising hosts out of eqiad D6 - https://phabricator.wikimedia.org/T390240#10754938 (10RobH) 05Open→03Invalid Please note that after further deliberation and investigation, it has been determined... [14:51:59] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: eqiad: second frack parent tracking task - https://phabricator.wikimedia.org/T392006#10754971 (10RobH) [14:52:01] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: Migrate non-fundraising hosts out of eqiad D6 - https://phabricator.wikimedia.org/T390240#10754972 (10RobH) [14:52:32] 10netops, 06DC-Ops, 10fundraising-tech-ops, 06Infrastructure-Foundations, and 2 others: eqiad: second frack parent tracking task - https://phabricator.wikimedia.org/T392006#10754975 (10RobH) >>! In T392006#10745097, @RobH wrote: > Please note I've tied original task T390240 to this for ease of tracking. I... [16:24:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [16:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [17:10:25] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [20:24:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/12/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [20:24:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/extras/scripts/18/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [21:10:25] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed