[10:17:06] FIRING: SystemdUnitFailed: swift_ring_manager.service on ms-fe2009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:51:57] ms-fe2009 is not happy
[10:57:03] everything swift in codfw was unhappy for a bit just now
[11:12:06] RESOLVED: SystemdUnitFailed: swift_ring_manager.service on ms-fe2009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:33:15] hnowlan: thank you for taking care of it
[15:15:00] > Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 171970606-171970606-33444,
[15:15:07] s3 wikireplicas broken
[15:15:32] oof
[15:15:46] :(
[15:15:49] checkin
[15:15:55] arnaudb: I'm on it
[15:16:14] thanks Amir1
[15:16:27] I think the schema change broke it
[15:16:46] very likely it ran the schema change with replication and that wasn't needed or something like that
[15:17:17] so in that case the course of action would be to resync the replica from a "proper" source?
[15:17:38] I need to check
[15:17:55] I don't think we can resync the replica from a proper source