[02:15:13] (DiskSpace) firing: Disk space krb1001:9100:/ 4.71% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[06:15:13] (DiskSpace) firing: Disk space krb1001:9100:/ 3.439% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[06:40:13] (DiskSpace) resolved: Disk space krb1001:9100:/ 3.52% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[08:28:42] (SystemdUnitFailed) firing: update-tails-mirror.service Failed on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:28:42] (SystemdUnitFailed) resolved: update-tails-mirror.service Failed on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:37:01] * jbond taking a look at tails
[09:37:19] nevermind missed the resolved
[09:50:37] these probably shouldn't alert unless they fail for longer than a day or so, they are not really actionable anyway (typically the remote server has an issue)
[09:50:55] ack
[11:04:44] Mail, Infrastructure-Foundations: Look into behaviour of /etc/exim4/update-exim4.conf.conf related to updates - https://phabricator.wikimedia.org/T154665 (MoritzMuehlenhoff) Open→Declined Yes, I think we can close this. This didn't cause any other issue AFAICT (in fact I don't remember the issue t...
[11:07:06] Packaging, Infrastructure-Foundations, User-MoritzMuehlenhoff: Sort out which RAID packages are still needed - https://phabricator.wikimedia.org/T216043 (MoritzMuehlenhoff) Open→Resolved a:MoritzMuehlenhoff I think this is resolved. Since this task was opened we obsoleted some controllers...
[12:13:32] SRE-tools, Ganeti, Infrastructure-Foundations, SRE, and 2 others: Create a spicerack cookbook to empty a ganeti node from VMs - https://phabricator.wikimedia.org/T203964 (MoritzMuehlenhoff) a:MoritzMuehlenhoff
[14:58:39] slyngs: fyi i have added some secrets for idm-test. they are all just random strings apart from the oidc secret which is fine to share between the two (i wasn't sure about the others so just made them random)
[14:59:06] this means that puppet at least will run and we remove the debmonitor alert email
[16:30:42] (SystemdUnitFailed) firing: httpbb_kubernetes_mw-api-ext_hourly.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:30:42] (SystemdUnitFailed) resolved: httpbb_kubernetes_mw-api-ext_hourly.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed