[08:26:20] Morning folks, can I get a +1 to do a similar change to codfw as I did to eqiad yesterday, please? https://gerrit.wikimedia.org/r/c/operations/puppet/+/1176432
[08:26:42] That's adding one storage node with the new controller, and draining 3 (2 for controller swaps, 1 to get moved to the newer per-rack networking)
[09:35:22] looking
[10:05:45] TY
[12:16:25] FIRING: SystemdUnitFailed: swift_ring_manager.service on ms-fe1009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:31:25] RESOLVED: SystemdUnitFailed: swift_ring_manager.service on ms-fe1009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:32:11] Ah, houston we have a problem
[12:32:21] sudo swift-ring-builder /tmp/object.builder set_weight 10.64.48.21/objects_exp0_0 640.0
[12:32:29] => Search value matched 0 devices.
[12:33:11] so device creation worked, but then it treats 10.64.48.21/objects_exp0_0 as being a device with options
[12:33:28] search looks like: drz-:/_
[12:35:39] So we're going to have to re-do these names to be hyphens not underscores and then do some manual cleanup :(
[12:50:32] This is T401387
[12:50:33] T401387: Swift device names should not contain underscores - https://phabricator.wikimedia.org/T401387
[13:17:25] FIRING: SystemdUnitFailed: swift_ring_manager.service on ms-fe1009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:27:25] RESOLVED: SystemdUnitFailed: swift_ring_manager.service on ms-fe1009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:28:01] eqiad cleaned up, codfw will need ~9h before the rings can be changed again. I'll reimage ms-be1091 now, ms-be2088 will have to wait until next week.
[14:02:17] I'm kindof grumpy swift-ring-builder will let you make devices with names it then can't search for :(
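
Note on the failed search above: a rough sketch of why `10.64.48.21/objects_exp0_0` matched 0 devices. This is not swift-ring-builder's actual code. It is a simplified, hypothetical model that covers only the `<ip>/<device>_<meta>` part of a search value and assumes the first underscore after the device name starts the "meta" (options) field, as the 12:33:11 message describes. The function names, the sample device record, and the hyphenated name `objects-exp0-0` are illustrative assumptions.

```python
import re

# Hypothetical, simplified model of a ring-builder search value.
# Only the "<ip>/<device>_<meta>" subset is handled here.
_SEARCH_RE = re.compile(
    r"^(?P<ip>\d[\d.]*)?"        # optional IPv4 address
    r"(?:/(?P<device>[^_]*))?"   # device name stops at the first underscore
    r"(?:_(?P<meta>.*))?$"       # everything after that underscore is meta
)


def parse_search_value_sketch(value):
    """Split a search value into the fields it would be matched on."""
    m = _SEARCH_RE.match(value)
    if m is None:
        raise ValueError("Invalid search value: %r" % value)
    return {k: v for k, v in m.groupdict().items() if v is not None}


def matches_sketch(dev, query):
    """Return True if a device record satisfies every parsed field."""
    if "ip" in query and dev["ip"] != query["ip"]:
        return False
    if "device" in query and dev["device"] != query["device"]:
        return False
    # Assumption: meta is compared as a substring of the device's meta.
    if "meta" in query and query["meta"] not in dev.get("meta", ""):
        return False
    return True


# The device as it was created (name from the log), with empty meta.
underscored = {"ip": "10.64.48.21", "device": "objects_exp0_0", "meta": ""}
# Hypothetical renamed device using hyphens instead of underscores.
hyphenated = {"ip": "10.64.48.21", "device": "objects-exp0-0", "meta": ""}

bad_query = parse_search_value_sketch("10.64.48.21/objects_exp0_0")
good_query = parse_search_value_sketch("10.64.48.21/objects-exp0-0")

print(bad_query)    # device is "objects", meta is "exp0_0"
print(matches_sketch(underscored, bad_query))   # False: no device named "objects"
print(matches_sketch(hyphenated, good_query))   # True
```

Under this model the device can be created with an underscore in its name, but a later search splits the name at the first underscore and looks for a device called `objects`, which does not exist. With hyphens the whole name stays in the device field.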