[00:05:26] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:20:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:50:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:00:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:15:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:20:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:10:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:15:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:25:26] FIRING: [3x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:30:26] FIRING: [3x] SystemdUnitFailed: requestctl-credential-refresh.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:40:26] FIRING: [5x] SystemdUnitFailed: requestctl-credential-refresh.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:55:26] FIRING: [5x] SystemdUnitFailed: requestctl-credential-refresh.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:25:26] FIRING: [4x] SystemdUnitFailed: requestctl-credential-refresh.service on puppetserver1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:40:26] FIRING: [4x] SystemdUnitFailed: requestctl-credential-refresh.service on puppetserver1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:55:26] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:56:21] jelto: it says Gitlab so I ping you, but any idea what this alert is?
[07:56:28] or who might know
[07:57:18] Hello! Yes apt-staging is pulling deb package builds from GitLab. GitLab suffers a bit from scraping at the moment unfortunately. I can silence the alert until this issue is fixed
[07:57:32] cool, thx!
[07:58:37] slyngs: hello, and maybe you have pointers for https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed&q=%40state%3Dactive&q=name%3Ddump_cloud_ip_ranges.service ? As it's a bit traffic and a bit I/F :)
[07:58:51] I shall take a look
[08:03:14] Hmm, I wonder if this is one of those where there are no changes and it then assumes that something has gone wrong
[08:13:14] ERROR:requestctl:Error occurred: Error fetching ipblock_source known-clients/googlebot: 400 - {"detail":"Error parsing ipblock data: 'prefixes'"}
[08:16:29] XioNoX: GoogleBot is temporarily broken, according to Google
[08:16:55] should the script gracefully handle that use case?
[08:17:06] I'd argue: Yes
[08:18:30] The error message is a bit weird. The error message is that it's temporarily broken, but that we should move to another feed.
[08:18:57] https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests
[08:23:07] so https://developers.google.com/static/crawling/ipranges/common-crawlers.json ?
[08:29:02] Yes, the link in the error message and on the page from volans is the same. I'll just change it
[08:32:40] Special crawlers had also changed
[08:34:26] if you're doing some cleanup there, FYI there are also some stale known-clients, dunno why. For example amazonbot last updated 2025-11-24
[08:35:03] not sure if that was intended for some $reasons
[08:35:16] Maybe they are just very stable :-)
[08:35:19] I'll check
[08:36:05] the first IP I tested is not there :D
[08:37:07] It's probably just reserved for future use :-)
[08:37:29] XioNoX: Alerts have cleared. I'll take a look at the AWS one.
[08:37:45] <3
[08:40:26] RESOLVED: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:53:33] SRE-tools, DBA, Infrastructure-Foundations, Spicerack, Patch-For-Review: Provide downtime duration information in sre.mysql cookbooks - https://phabricator.wikimedia.org/T427780#11992668 (CWilliams-WMF) Open→Resolved
[09:13:48] FIRING: PuppetFailure: Puppet has failed on puppetserver1002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[09:28:48] RESOLVED: PuppetFailure: Puppet has failed on puppetserver1002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[10:38:01] netops, Discovery-Search, Infrastructure-Foundations, Machine-Learning-Team, and 3 others: codfw: rack A4 maintenance - https://phabricator.wikimedia.org/T427357#11993251 (jijiki)
[10:40:32] netops, Discovery-Search, Infrastructure-Foundations, Machine-Learning-Team, and 3 others: codfw: rack A4 maintenance - https://phabricator.wikimedia.org/T427357#11993265 (jijiki) >>! In T427357#11958744, @jijiki wrote: > @ayounsi `mc2055` and `mc-gp2004` are on A4, and that is by accident. `mc-g...
[11:54:30] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11993668 (ayounsi)
[12:02:30] volans: The AWS list is manually entered it seems, but the differences are minor. AWS added six IPs
[12:06:08] slyngs: do you know why it's manual if they publish it? I see upstream last update was on 2026-04-30 for amazonbot, then in the page [1] there are also other ranges: https://developer.amazon.com/amazonbot
[12:06:35] I think it's manual because they publish it in a stupid format
[12:07:11] Or because it's from before some automation of these things in requestctl
[12:07:43] actually the format is standard, but it's embedded in HTML.
[12:11:54] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11993710 (ayounsi)
[12:12:32] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11993711 (ayounsi) Open→Resolved a:ayounsi All possible vlans have been created. We will add eqiad A3 and C1 once the new switches are ready.
[12:31:17] In any case the Amazonbot list is now up to date.
[12:39:00] <3
[20:12:38] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Moving switches to make space for the refreshed switches. - https://phabricator.wikimedia.org/T428195#11996138 (VRiley-WMF) Open→Resolved All of the switches in row A and most of row B have been shifted in order to make s...