[01:02:24] FIRING: SystemdUnitFailed: ceph-3f38ada2-2d88-11ef-8c7c-bc97e1bb7c18@osd.10.service on moss-be1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:02:24] FIRING: SystemdUnitFailed: ceph-3f38ada2-2d88-11ef-8c7c-bc97e1bb7c18@osd.10.service on moss-be1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:19:38] that's going to be a bad disk
[07:32:25] RESOLVED: SystemdUnitFailed: ceph-3f38ada2-2d88-11ef-8c7c-bc97e1bb7c18@osd.10.service on moss-be1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:36:21] T395103 opened
[07:36:22] T395103: Disk (sde) failed in moss-be1002 - https://phabricator.wikimedia.org/T395103
[08:15:05] OK, the new thanos backend nodes look to be unhappy because the disks are not in fact set up to be JBOD :-/
[09:43:02] Having fixed that on thanos-be1006 I discover I missed one step in setting new-style storage for these nodes - could I get a +1 to https://gerrit.wikimedia.org/r/c/operations/puppet/+/1149619 please? It contains a link to the docs (and the previous CR that missed one step)
[09:49:34] (should be a reasonably easy review)
[09:53:11] Emperor: looking
[09:58:02] thanks :)