[08:15:11] Despite the ongoing alert, it seems that swift originals slowed down in February? https://grafana.wikimedia.org/goto/YxgeSXhIk?orgId=1
[09:18:34] (SystemdUnitFailed) firing: prometheus-mysqld-exporter.service on db2194:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:58:56] arnaudb: you working on db2194? If it's not going to be finished today, maybe extend the downtime 'til Monday?
[10:02:36] Emperor: downtiming it, indeed :-) sorry for the noise
[10:03:14] NP, thanks
[13:58:35] (SystemdUnitFailed) firing: (3) ferm.service on ms-be1044:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:59:49] (PuppetDisabled) firing: (6) Puppet disabled on ms-be1044:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=swift&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[14:03:08] ^-- those are to be decom; I'll either actually decom them today (depending on some other testing I'm using them for) or extend the silence
[14:11:33] will reboot dbproxy2*
[16:32:20] arnaudb: did you reload the proxies? They seem down
[16:46:01] I may have missed one
[16:46:07] will check in 10
[16:47:32] decom done
[16:56:00] I did a pass on all 4 just to be safe
[16:59:06] looking good: check_alive recent restart 160s
[16:59:36] I saw no alert on this, did I miss one?
[16:59:49] good catch anyway jynus! thanks <3
[16:59:53] they were red but for some reason notifications were disabled
[17:00:07] oh indeed that makes sense
[17:01:21] Not something to do often, but at least once a week I like to check silenced notifications, as it is easy to forget about mistaken ones
[17:01:40] specially as I week was oncall and wanted to be aware of ongoing maintenance