[00:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[08:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:26:28] cloud-services-team, Cloud-VPS, Patch-For-Review: openstack: consider removing labs-ip-aliaser - https://phabricator.wikimedia.org/T374129#10429285 (Andrew)
[16:15:36] cloud-services-team, Cloud-VPS: Clean up horizon/deploy branches - https://phabricator.wikimedia.org/T382957 (Andrew) NEW
[16:16:13] cloud-services-team, Horizon: Clean up horizon/deploy branches - https://phabricator.wikimedia.org/T382957#10429416 (taavi)
[16:24:16] (PS1) Krinkle: write_config: index labs/countervandalism repos [labs/codesearch] - https://gerrit.wikimedia.org/r/1108099
[18:16:00] cloud-services-team, Cloud-VPS: Kernel error metrics have overlapping definitions - https://phabricator.wikimedia.org/T382961 (fnegri) NEW
[18:17:33] cloud-services-team, Cloud-VPS, Patch-For-Review: Kernel error metrics have overlapping definitions - https://phabricator.wikimedia.org/T382961#10429587 (fnegri) p:Triage→Low
[18:20:45] cloud-services-team, Toolforge: Missing replica.my.cnf for freshly created Toolforge account vehicle-keeper-markings - https://phabricator.wikimedia.org/T382962 (Sascha) NEW
[18:25:03] cloud-services-team (FY2024/2025-Q1-Q2), Patch-For-Review: Kernel alerts disappear too quickly - https://phabricator.wikimedia.org/T379378#10429609 (fnegri) In progress→Resolved I'm closing this as Resolved, as the problem of alerts disappearing too quickly was solved by https://gerrit.wikime...
[18:27:13] cloud-services-team, Toolforge: Missing replica.my.cnf for freshly created Toolforge account vehicle-keeper-markings - https://phabricator.wikimedia.org/T382962#10429617 (Sascha)
[18:33:01] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS, Patch-For-Review: Kernel error metrics have overlapping definitions - https://phabricator.wikimedia.org/T382961#10429619 (fnegri)
[18:33:09] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS, Patch-For-Review: Kernel error metrics have overlapping definitions - https://phabricator.wikimedia.org/T382961#10429621 (fnegri) Open→In progress
[18:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:04:21] cloud-services-team, Toolforge: Missing replica.my.cnf for freshly created Toolforge account vehicle-keeper-markings - https://phabricator.wikimedia.org/T382962#10429947 (bd808) Open→Resolved a:bd808 The credentials provisioning service had hung somehow. Restarting it seems to have fixed...
[21:14:30] cloud-services-team, Data-Services: maintain-dbusers failing to create user for 'u4692'@'%' on instance-tools-db-4.tools.wmcloud.org - https://phabricator.wikimedia.org/T382974 (bd808) NEW
[21:20:53] cloud-services-team, Data-Services: maintain-dbusers failing to create user for 'u4692'@'%' on instance-tools-db-4.tools.wmcloud.org - https://phabricator.wikimedia.org/T382974#10429985 (bd808)
[21:23:29] (CR) Krinkle: [C:+2] write_config: index labs/countervandalism repos [labs/codesearch] - https://gerrit.wikimedia.org/r/1108099 (owner: Krinkle)
[21:24:29] (Merged) jenkins-bot: write_config: index labs/countervandalism repos [labs/codesearch] - https://gerrit.wikimedia.org/r/1108099 (owner: Krinkle)
[21:31:37] cloud-services-team, Toolforge: cfdw-28928147-9qtjx stuck in Terminating state - https://phabricator.wikimedia.org/T382863#10429998 (LucasWerkmeister) Based on [Toolforge Admin docs](https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes#Drain_and_undrain_a_node), [runbooks](https://wikite...
[21:35:07] !log bd808@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-69
[21:40:24] !log bd808@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=99) for node tools-k8s-worker-nfs-69
[21:41:09] !log bd808@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-69
[21:46:19] !log bd808@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-69
[21:54:05] cloud-services-team, Toolforge: cfdw-28928147-9qtjx stuck in Terminating state - https://phabricator.wikimedia.org/T382863#10430012 (bd808) Open→Resolved a:bd808 `lang=shell-session $ ssh cloudcumin1001.eqiad.wmnet $ sudo cookbook wmcs.toolforge.k8s.worker.drain --cluster-name tools --hos...
[22:02:06] cloud-services-team, Toolforge: [jobs-emailer] duplicate failure emails - https://phabricator.wikimedia.org/T382866#10430028 (JJMC89)
[22:06:23] cloud-services-team, Cloud-VPS, Toolforge, Kubernetes: Allow Toolforge roots to reboot k8s worker nodes (without wmcs-root) - https://phabricator.wikimedia.org/T382977 (LucasWerkmeister) NEW
[22:15:24] cloud-services-team, Cloud-VPS, Toolforge, Kubernetes: Allow Toolforge roots to reboot k8s worker nodes (without wmcs-root) - https://phabricator.wikimedia.org/T382977#10430067 (Andrew) So either we need to get toolforge roots automatic access to cloudcuminxxxx hosts, or we need to split out a su...