[00:32:17] FIRING: JobUnavailable: Reduced availability for job openstack in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [02:07:56] FIRING: [2x] SystemdUnitDown: The service unit remove_dangling_cinder_snapshots.service is in failed status on host cloudbackup1001-dev. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [03:17:56] FIRING: [3x] SystemdUnitDown: The service unit remove_dangling_cinder_snapshots.service is in failed status on host cloudbackup1001-dev. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [04:02:56] FIRING: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [04:32:56] RESOLVED: SystemdUnitDown: The service unit remove_dangling_cinder_snapshots.service is in failed status on host cloudbackup1002-dev. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1002-dev - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [04:33:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services [04:33:49] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment codfw1dev for all services [04:43:26] RESOLVED: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [04:46:41] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services [04:47:51] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment codfw1dev for all services [04:48:50] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services [04:48:54] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment codfw1dev for all services [04:50:11] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services [04:53:47] RESOLVED: JobUnavailable: Reduced availability for job openstack in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [04:54:21] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment codfw1dev for all services [05:10:56] FIRING: SystemdUnitDown: The systemd unit opentofu-infra-diff.service on node cloudcontrol1007 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [05:40:53] 10Cloud-VPS (Project-requests): Request creation of Gutendex VPS project - https://phabricator.wikimedia.org/T411158 (10Ijon) 03NEW [06:10:44] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [06:15:44] RESOLVED: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [06:26:39] 06cloud-services-team (FY2025/26-Q1-Q2), 10Data-Services: [wikireplicas] Create views for new wiki tokwiki - https://phabricator.wikimedia.org/T404570#11411843 (10Marostegui) 05Stalled→03Open This can happen now. [06:41:45] 10Tool-wsindex, 10Wikisource Reader App: Books with title having virama characters are being joined in the API - https://phabricator.wikimedia.org/T411159 (10Bodhisattwa) 03NEW [07:51:07] (03CR) 10Muehlenhoff: [V:03+2 C:03+2] Remove obsolete stub secrets [labs/private] - 10https://gerrit.wikimedia.org/r/1211684 (https://phabricator.wikimedia.org/T381565) (owner: 10Muehlenhoff) [07:53:48] FIRING: PuppetConstantChange: Puppet performing a change on every puppet run on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange