[00:05:55] FIRING: MaxConntrack: Max conntrack at 82.37% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:10:55] RESOLVED: MaxConntrack: Max conntrack at 83% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:11:25] FIRING: MaxConntrack: Max conntrack at 81.98% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:16:25] RESOLVED: MaxConntrack: Max conntrack at 80.42% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:21:10] FIRING: MaxConntrack: Max conntrack at 80.25% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:25:25] RESOLVED: MaxConntrack: Max conntrack at 80.63% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:38:55] FIRING: MaxConntrack: Max conntrack at 81.1% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:41:10] RESOLVED: MaxConntrack: Max conntrack at 81.19%
on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:43:55] FIRING: MaxConntrack: Max conntrack at 81.41% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[00:56:10] RESOLVED: MaxConntrack: Max conntrack at 83.54% on cloudvirt1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[01:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:52:45] (CR) Andrew Bogott: [C:+1] "Now that I understand what this is for... I'm not in great shape to test this locally but it looks ok to me!"
[labs/striker] - https://gerrit.wikimedia.org/r/1035718 (https://phabricator.wikimedia.org/T362318) (owner: Slyngshede)
[03:04:17] cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Replace all codfw1dev Buster VMs - https://phabricator.wikimedia.org/T368341 (Andrew) NEW
[03:06:27] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[03:12:07] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[03:12:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[03:17:33] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[03:39:21] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[03:42:38] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[04:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:57:37] Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07): Modify db-mysql to connect to
an-redacteddb1001 from cumin hosts - https://phabricator.wikimedia.org/T368354 (Marostegui) NEW
[06:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:28:45] cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Cloud-vps Buster deprecation - https://phabricator.wikimedia.org/T331738#9920203 (Ahecht) Anyone know why I got an email saying that I own a Cloud VPS project called "tools" that is going to be shut down? Did this go out to everyone with a Toolf...
[07:08:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[07:12:41] (open) marostegui: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4
[07:12:52] (update) marostegui: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4
[07:13:00] (update) marostegui: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4
[07:14:37] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, and 2 others: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9920267 (dcaro) >>! In T348643#9919050, @CDanis wrote: > Unfortunately `cloudcephosd1020` has too old a Debia...
[07:18:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[07:19:32] cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Cloud-vps Buster deprecation - https://phabricator.wikimedia.org/T331738#9920271 (JJMC89) >>! In T331738#9920203, @Ahecht wrote: > Anyone know why I got an email saying that I own a Cloud VPS project called "tools" that is going to be shut down?...
[07:23:52] (close) arnaudb: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4 (owner: marostegui)
[07:23:57] (reopen) arnaudb: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4 (owner: marostegui)
[07:24:00] (approved) arnaudb: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4 (owner: marostegui)
[07:29:29] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-33
[07:30:08] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-33
[07:31:54] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-33
[07:31:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-34
[07:32:32] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-34
[07:33:01] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-35
[07:33:11] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook
wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-33
[07:33:39] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-35
[07:36:21] cloud-services-team, Data-Services: maintain-dbusers.service failing on cloudcontrol1005 - https://phabricator.wikimedia.org/T368316#9920300 (taavi) The relevant firewall rule is the `wiki-replica-account-creation` one [[ https://gerrit.wikimedia.org/r/plugins/gitiles/operations/homer/public/+/refs/heads...
[07:37:06] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-34
[07:37:08] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-36
[07:37:47] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-36
[07:39:09] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-34
[07:40:08] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-35
[07:40:10] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-37
[07:40:48] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-37
[07:41:18] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-35
[07:46:41] (merge) marostegui: task_template: Add curl commands [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/4
[07:53:36] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-36
[07:53:41] !log
taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-38
[07:54:10] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9920337 (dcaro) Just created a silly dashboard with the data that's coming in: https://grafana-rw.wikimedia.org/d/...
[07:54:20] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-38
[07:54:37] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-36
[07:55:06] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-37
[07:55:11] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-39
[07:55:50] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-39
[07:56:16] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-37
[07:56:20] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-38
[07:56:22] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-40
[07:57:01] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-40
[07:57:36] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-38
[07:58:08] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-39
[07:58:11] !log taavi@cloudcumin1001
tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-41
[07:58:37] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-41
[07:59:25] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-39
[07:59:48] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-40
[08:01:58] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-40
[08:02:23] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-41
[08:02:25] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-42
[08:03:34] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-41
[08:07:41] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=99) for node tools-k8s-worker-nfs-42
[08:08:01] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-42
[08:10:05] (CR) Majavah: [C:-1] "The documentation in `contrib/docker/README.md` needs updating. Also, see inline."
[labs/striker] - https://gerrit.wikimedia.org/r/1035718 (https://phabricator.wikimedia.org/T362318) (owner: Slyngshede)
[08:13:05] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=99) for node tools-k8s-worker-nfs-42
[08:20:59] (merge) sstefanova: api: remove unprefixed endpoints [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/97
[08:25:05] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, and 2 others: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9920418 (dcaro) Data is coming in now from both nodes, latencies look similar so far, with sdc on 1034 being...
[08:26:09] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "sso" project Buster deprecation - https://phabricator.wikimedia.org/T367554#9920417 (MoritzMuehlenhoff) >>! In T367554#9899621, @jbond wrote: > hi all i wanted to say that the sso project is used so that users have an SSO testing infrastructure to use in clou...
[08:27:57] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: builds-api: bump to 0.0.156-20240625082108-71537e14 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/346
[08:36:52] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-42
[08:36:53] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-43
[08:37:32] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-43
[08:38:16] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-44
[08:38:54] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-44
[08:38:56] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-42
[08:39:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-43
[08:40:00] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=99) for server tools-k8s-worker-nfs-43
[08:40:01] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-45
[08:40:39] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-45
[08:40:39] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-44
[08:41:11] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "sso" project Buster deprecation - https://phabricator.wikimedia.org/T367554#9920491 (MoritzMuehlenhoff)
[08:42:00] !log taavi@cloudcumin1001 tools START - Cookbook
wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-46
[08:42:37] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-46
[08:42:49] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-44
[08:42:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-45
[08:43:11] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "sso" project Buster deprecation - https://phabricator.wikimedia.org/T367554#9920493 (MoritzMuehlenhoff) Open→Resolved a:MoritzMuehlenhoff The Buster instances have been removed.
[08:43:14] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-47
[08:43:51] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-47
[08:44:06] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-45
[08:44:23] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-46
[08:45:35] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-46
[08:45:39] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-47
[08:45:43] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-48
[08:46:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-48
[08:46:49] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-47
[08:47:02] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-49
[08:47:04] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-48
[08:47:40] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-49
[08:47:54] (open) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347
[08:48:15] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-48
[08:48:21] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-49
[08:49:16] (open) aborrero: kubernetes: drop ProcMount from securityContext [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/43 (https://phabricator.wikimedia.org/T362050)
[08:49:31] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-49
[08:53:14] (open) aborrero: jobs: drop ProcMount from securityContext [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/97 (https://phabricator.wikimedia.org/T362050)
[08:54:11] (update) aborrero: kubernetes: drop ProcMount from securityContext [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/43 (https://phabricator.wikimedia.org/T362050)
[08:55:50] (approved) dcaro: jobs: drop ProcMount from securityContext [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/97
(https://phabricator.wikimedia.org/T362050) (owner: aborrero)
[08:56:01] (approved) dcaro: kubernetes: drop ProcMount from securityContext [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/43 (https://phabricator.wikimedia.org/T362050) (owner: aborrero)
[08:57:21] (update) aborrero: kubernetes: drop ProcMount from securityContext [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/43 (https://phabricator.wikimedia.org/T362050)
[08:59:21] (update) aborrero: jobs: drop ProcMount from securityContext [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/97 (https://phabricator.wikimedia.org/T362050)
[08:59:55] (merge) aborrero: kubernetes: drop ProcMount from securityContext [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/43 (https://phabricator.wikimedia.org/T362050)
[09:01:55] (merge) aborrero: jobs: drop ProcMount from securityContext [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/97 (https://phabricator.wikimedia.org/T362050)
[09:03:58] (update) sstefanova: dev: fix bump script [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/43
[09:05:29] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: jobs-api: bump to 0.0.310-20240625090205-108e6a0f [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/348 (https://phabricator.wikimedia.org/T362050)
[09:11:08] (update) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347
[09:12:04] (update) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347
[09:15:05] cloud-services-team, Toolforge (Toolforge iteration 11), Patch-For-Review: toolforge: review pod templates for PSP replacement - https://phabricator.wikimedia.org/T362050#9920627 (aborrero) >>! In T362050#9919714, @bd808 wrote: > ` > $ webservice perl5.36 shell --mount=all > Error from server (Forbid...
[09:19:41] FIRING: CloudVPSDesignateLeaks: Detected 10 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:21:17] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-ingress-9
[09:22:17] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-ingress-9
[09:23:10] (approved) aborrero: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347 (owner: dcaro)
[09:23:14] FIRING: [2x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAproxy server down: tools-k8s-ingress-9.tools.eqiad1.wikimedia.cloud - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[09:23:26] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api
[09:23:37] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api
[09:23:39] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api
[09:23:49] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api
[09:26:20] (merge) aborrero: jobs-api: bump to 0.0.310-20240625090205-108e6a0f [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/348 (https://phabricator.wikimedia.org/T362050) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[09:27:59] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-control-9
[09:28:14] RESOLVED: [4x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAproxy server down: tools-k8s-ingress-9.tools.eqiad1.wikimedia.cloud - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[09:29:44] FIRING: [2x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAproxy server down: tools-k8s-control-9.tools.eqiad1.wikimedia.cloud - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[09:30:04] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-control-9
[09:30:58] RESOLVED: PuppetAgentNoResources: No Puppet resources found on instance cloudinfra-idp-1 on project cloudinfra -
https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[09:30:59] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-haproxy-6
[09:31:48] cloud-services-team, Data-Services, Patch-For-Review: maintain-dbusers.service failing on cloudcontrol1005 - https://phabricator.wikimedia.org/T368316#9920723 (cmooney) >>! In T368316#9920300, @taavi wrote: > The relevant firewall rule is the `wiki-replica-account-creation` one [[ https://gerrit.wiki...
[09:32:09] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-haproxy-6
[09:33:29] RESOLVED: [6x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAproxy server down: tools-k8s-control-9.tools.eqiad1.wikimedia.cloud - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[09:35:10] (open) aborrero: d/changelog: bump to 0.103.8 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/44 (https://phabricator.wikimedia.org/T362050)
[09:35:51] (update) aborrero: d/changelog: bump to 0.103.8 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/44 (https://phabricator.wikimedia.org/T362050)
[09:36:02] (update) aborrero: d/changelog: bump to 0.103.8 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/44 (https://phabricator.wikimedia.org/T362050)
[09:36:08] Cloud-VPS (Debian Buster Deprecation), Research: Cloud VPS "research-collaborations-api" project Buster deprecation -
https://phabricator.wikimedia.org/T367551#9920754 (diego) Just for the records, we have migrated the fact-checking API to another instance and deleted the old one.
[09:36:52] Toolforge: Elasticsearch credential request for wikitermbase - https://phabricator.wikimedia.org/T368376 (ForzaGreen) NEW
[09:37:53] (open) aborrero: utils/bump_version.sh: add Git-Dch header to commit message [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/45
[09:38:51] cloud-services-team, Toolforge (Toolforge iteration 11), Patch-For-Review: toolforge: review pod templates for PSP replacement - https://phabricator.wikimedia.org/T362050#9920768 (aborrero) We have decided to drop the `procMount` entry entirely, as it refers to a feature gate we don't even use in our...
[09:39:09] (merge) aborrero: d/changelog: bump to 0.103.8 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/44 (https://phabricator.wikimedia.org/T362050)
[09:46:11] MediaWiki-extensions-OpenStackManager, Diffusion-Repository-Administrators, Projects-Cleanup, translatewiki.net, and 2 others: Archive the OpenStackManager extension - https://phabricator.wikimedia.org/T367220#9920817 (hashar)
[09:49:00] (open) dcaro: helpers:add toolforge_get_versions [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/150
[09:50:02] Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07): Modify db-mysql to connect to an-redacteddb1001 from cumin hosts - https://phabricator.wikimedia.org/T368354#9920842 (Ladsgroup) > I believe dbutil.py is what we need to start changing - @Ladsgroup can you confirm? Actually...
[09:53:14] (approved) aborrero: helpers:add toolforge_get_versions [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/150 (owner: dcaro)
[09:55:41] cloud-services-team, Toolforge (Toolforge iteration 11), Patch-For-Review: toolforge: review pod templates for PSP replacement - https://phabricator.wikimedia.org/T362050#9920876 (aborrero) In progress→Resolved
[09:55:57] (update) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347
[09:56:10] cloud-services-team, Toolforge: [k8s,infra] track PSP migration plan - https://phabricator.wikimedia.org/T364297#9920883 (aborrero)
[09:56:31] cloud-services-team, Toolforge: [k8s,infra] track PSP migration plan - https://phabricator.wikimedia.org/T364297#9920885 (aborrero)
[09:58:48] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9920891 (dcaro) >>! In T348643#9920418, @dcaro wrote: > Data is coming in now from both nodes, latencies look simi...
[10:00:24] (update) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347 [10:00:25] (approved) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347 [10:00:30] (merge) dcaro: functional-tests,webservice: add shell test [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/347 [10:01:11] (approved) dcaro: utils/bump_version.sh: add Git-Dch header to commit message [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/45 (owner: aborrero) [10:11:10] (update) aborrero: utils/bump_version.sh: add Git-Dch header to commit message [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/45 [10:12:59] (merge) aborrero: utils/bump_version.sh: add Git-Dch header to commit message [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/45 [10:14:10] (approved) dcaro: helpers:add toolforge_get_versions [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/150 [10:14:13] (update) dcaro: helpers:add toolforge_get_versions [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/150 [10:14:14] (merge) dcaro: helpers:add toolforge_get_versions [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/150 [10:25:30] (open) dcaro: functional-tests: show the installed versions
[repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/349 [10:29:20] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9921045 (fnegri) This was linked in the parent task but I'm not sure if it's really a blocker here: T103011 [10:30:24] (open) aborrero: helpers: rework many resources creation script [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/151 [10:39:57] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9921082 (Marostegui) >>! In T368136#9919314, @bd808 wrote: > What sort of data y'all are concerned about exposing to new roots on the replica db ho... [10:59:23] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9921203 (fnegri) > I assume wmcs-roots is just WMCS staff and those would be the ones having root access? wmcs-roots is defined in [admin/data/da...
[11:01:34] (PS1) Majavah: vps: Add a cookbook to move a floating IP address to an another server [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1049519 [11:01:39] (update) aborrero: helpers: rework many resources creation script [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/151 [11:02:48] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.vps.migirate_floating_ip for address 185.15.56.33 to server 'toolsbeta-proxy-6' [11:03:00] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.vps.migirate_floating_ip (exit_code=0) for address 185.15.56.33 to server 'toolsbeta-proxy-6' [11:04:26] (CR) CI reject: [V:-1] vps: Add a cookbook to move a floating IP address to an another server [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1049519 (owner: Majavah) [11:05:18] (PS2) Majavah: vps: Add a cookbook to move a floating IP address to an another server [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1049519 [11:06:18] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.vps.migrate_floating_ip for address 185.15.56.33 to server 'toolsbeta-proxy-5' [11:39:25] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-50 [11:40:03] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-50 [11:45:51] FIRING: ProbeDown: Service tools-k8s-haproxy-6:30000 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-6:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [11:50:51] RESOLVED: ProbeDown: Service tools-k8s-haproxy-6:30000 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) -
https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-6:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [11:51:25] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-50 [11:51:27] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-51 [11:51:28] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=99) for server tools-k8s-worker-50 [11:52:06] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-51 [11:56:18] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-52 [11:56:24] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-50 [11:56:26] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=99) for server tools-k8s-worker-50 [11:56:37] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-nfs-50 [11:56:56] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-nfs-52 [11:57:54] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-nfs-50 [12:02:42] (update) aborrero: kyverno: raise CPU request and limits [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/345 [12:07:50] (update) aborrero: kyverno: raise CPU request and limits [repos/cloud/toolforge/toolforge-deploy] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/345 [12:26:55] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-103 [12:27:50] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-102 [12:28:23] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-103 [12:28:24] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-104 [12:29:06] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=0) for node tools-k8s-worker-104 [12:29:51] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-103 [12:30:05] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.migrate_server_to_ovs for server tools-k8s-worker-104 [12:31:47] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server tools-k8s-worker-104 [12:40:44] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.openstack.migrate_server_to_ovs for server proxy-03 [12:42:05] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server proxy-03 [12:47:58] FIRING: MetricsinfraAlertmanagerDown: Metricsinfra alertmanager is unreachable #page - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/MetricsinfraAlertmanagerDown - TODO - https://alerts.wikimedia.org/?q=alertname%3DMetricsinfraAlertmanagerDown [13:01:20] !log taavi@cloudcumin1001 bastioninfra-codfw1dev START - Cookbook wmcs.openstack.migrate_server_to_ovs for server bastion-codfw1dev-02 [13:01:21] taavi@cloudcumin1001: Unknown project "bastioninfra-codfw1dev" [13:02:45] !log taavi@cloudcumin1001
bastioninfra-codfw1dev END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server bastion-codfw1dev-02 [13:02:45] taavi@cloudcumin1001: Unknown project "bastioninfra-codfw1dev" [13:03:29] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:05:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project project-proxy - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:06:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:09:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:09:28] RESOLVED: [2x] MetricsinfraAlertmanagerDown: Metricsinfra alertmanager is unreachable #page - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/MetricsinfraAlertmanagerDown - TODO - https://alerts.wikimedia.org/?q=alertname%3DMetricsinfraAlertmanagerDown [13:11:22] !log taavi@cloudcumin1001 procbot START - Cookbook wmcs.openstack.migrate_server_to_ovs for server bastion [13:12:25] !log taavi@cloudcumin1001 procbot END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server bastion [13:12:47] !log taavi@cloudcumin1001 language START - Cookbook wmcs.openstack.migrate_server_to_ovs for server cxserver [13:13:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:14:06] !log taavi@cloudcumin1001 language END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server cxserver [13:14:10] !log taavi@cloudcumin1001 deployment-prep START - 
Cookbook wmcs.openstack.migrate_project_to_ovs [13:15:15] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.vps.migrate_floating_ip for address 185.15.56.55 to server 'maps-proxy-04' [13:15:27] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.vps.migrate_floating_ip (exit_code=0) for address 185.15.56.55 to server 'maps-proxy-04' [13:15:42] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.openstack.migrate_server_to_ovs for server maps-proxy-03 [13:16:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:16:53] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server maps-proxy-03 [13:17:07] !log taavi@cloudcumin1001 etytree START - Cookbook wmcs.openstack.migrate_server_to_ovs for server etytree-b [13:17:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:17:51] !log taavi@cloudcumin1001 entity-detection START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:18:10] !log taavi@cloudcumin1001 etytree END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server etytree-b [13:19:13] !log taavi@cloudcumin1001 entity-detection END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:19:34] !log taavi@cloudcumin1001 wikiapiary START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:19:56] FIRING: CloudVPSDesignateLeaks: Detected 12 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [13:20:11] !log taavi@cloudcumin1001 wikiapiary END (FAIL) - Cookbook 
wmcs.openstack.migrate_project_to_ovs (exit_code=1) [13:21:09] !log taavi@cloudcumin1001 wikiapiary START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:22:02] !log taavi@cloudcumin1001 wikiapiary END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:24:42] !log taavi@cloudcumin1001 tf-infra-test START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:24:45] !log taavi@cloudcumin1001 tf-infra-test END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:25:19] !log taavi@cloudcumin1001 tofuinfratest START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:27:21] !log taavi@cloudcumin1001 deployment-prep END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [13:27:42] !log taavi@cloudcumin1001 tofuinfratest END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:27:53] !log taavi@cloudcumin1001 language START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:29:06] !log taavi@cloudcumin1001 language END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:30:33] !log taavi@cloudcumin1001 linkwatcher START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:32:00] !log taavi@cloudcumin1001 linkwatcher END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [13:32:06] !log taavi@cloudcumin1001 dumps START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:32:29] !log taavi@cloudcumin1001 google-api-proxy START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:33:41] !log taavi@cloudcumin1001 google-api-proxy END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:34:09] !log taavi@cloudcumin1001 redirects START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:34:49] !log taavi@cloudcumin1001 centralnotice-staging START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:35:14] !log taavi@cloudcumin1001 redirects END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:35:35] 
!log taavi@cloudcumin1001 globaleducation START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:35:40] !log taavi@cloudcumin1001 dumps END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [13:35:51] !log taavi@cloudcumin1001 eventmetrics START - Cookbook wmcs.openstack.migrate_project_to_ovs [13:36:02] !log taavi@cloudcumin1001 centralnotice-staging END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:36:29] !log taavi@cloudcumin1001 globaleducation END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [13:36:48] !log taavi@cloudcumin1001 eventmetrics END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [13:44:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [13:53:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [14:25:18] cloud-services-team, Toolforge, Patch-For-Review: toolforge: kyverno: change policies to Enforce - https://phabricator.wikimedia.org/T368141#9921915 (aborrero) This I've checked: 1) re-read the upstream docs about what happens if you set policies to enforce while there are offending resources define... [14:30:07] cloud-services-team (FY2023/2024-Q3-Q4), superset.wmcloud.org: Allow Superset to query ToolsDB public databases - https://phabricator.wikimedia.org/T367393#9921956 (fnegri) @KCVelaga_WMF I think your plan should work, and I don't see any problem unless the size of the aggregated data gets too big (we cur... [14:54:12] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9922021 (bd808) >>!
In T368136#9921082, @Marostegui wrote: > Also, the issue with root is that that user can make changes to replication, grants, s... [14:58:51] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9922052 (fnegri) > That is true, but also not clearly in the scope of this ticket which seems to be specifically about addressing claims of data pr... [15:04:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:07:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:11:28] FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:27:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:31:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:31:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:32:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:34:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:38:45] (open) aborrero:
kyverno_pod_policy: validate fsGroup setting only if present [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/50 (https://phabricator.wikimedia.org/T362050) [15:40:28] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:40:29] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:41:29] RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project project-proxy - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [15:42:34] (update) aborrero: kyverno_pod_policy: validate fsGroup setting only if present [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/50 (https://phabricator.wikimedia.org/T362050) [15:46:52] (update) aborrero: kyverno_pod_policy: validate fsGroup setting only if present [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/50 (https://phabricator.wikimedia.org/T362050) [15:48:00] (approved) dcaro: kyverno_pod_policy: validate fsGroup setting only if present [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/50 (https://phabricator.wikimedia.org/T362050) (owner: aborrero) [15:51:01] (merge) aborrero: kyverno_pod_policy: validate fsGroup setting only if present [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/50 (https://phabricator.wikimedia.org/T362050) [15:53:46] (open)
project_1317_bot_df3177307bed93c3f34e421e26c86e38: maintain-kubeusers: bump to 0.0.154-20240625155114-8428f7d3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/350 (https://phabricator.wikimedia.org/T362050) [15:54:07] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers [15:54:21] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers [15:56:45] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers [15:56:55] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers [15:57:21] (merge) aborrero: maintain-kubeusers: bump to 0.0.154-20240625155114-8428f7d3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/350 (https://phabricator.wikimedia.org/T362050) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38) [16:13:21] FIRING: MaintainKubeusersHang: maintain-kubeusers last finished run is 28.66M minutes old - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersHang [16:15:51] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt2006-dev.codfw.wmnet' [16:15:51] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=97) on host 'cloudvirt2006-dev.codfw.wmnet' [16:19:15] cloud-services-team, Cloud-VPS: Migrate codfw1dev hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T368426 (Andrew) NEW [16:20:59] !log andrew@cloudcumin1001 admin START - Cookbook
wmcs.openstack.cloudvirt.drain on host 'cloudvirt2004-dev.codfw.wmnet' [16:26:15] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=0) on host 'cloudvirt2004-dev.codfw.wmnet' [16:28:21] RESOLVED: MaintainKubeusersHang: maintain-kubeusers last finished run is 28.66M minutes old - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersHang [16:43:47] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9922804 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt2004-dev.codfw.wmnet with OS bookworm [16:57:54] Data-Services, VPS-Projects: Request access to NFS mount /public/dumps for research-collaborations-api Cloud VPS project - https://phabricator.wikimedia.org/T368432 (Isaac) NEW [17:02:22] Data-Services, VPS-Projects: Request access to NFS mount /public/dumps for research-collaborations-api Cloud VPS project - https://phabricator.wikimedia.org/T368432#9922870 (Isaac) Open→Resolved a:Isaac Actually sorry I realize that the project already has access, I just needed to enable... [17:02:23] Cloud-VPS (Debian Buster Deprecation), Research: Cloud VPS "research-collaborations-api" project Buster deprecation - https://phabricator.wikimedia.org/T367551#9922875 (Isaac) > Oh, the current instance also has access to the dumps so we never downloaded files over the internet. Oh great, this makes it e...
[17:08:09] !log dcaro@urcuchillay redirects START - Cookbook wmcs.openstack.cloudvirt.vm_console [17:08:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Redirects/SAL [17:08:17] !log dcaro@urcuchillay redirects END (ERROR) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=255) [17:08:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Redirects/SAL [17:15:19] !log dcaro@urcuchillay redirects START - Cookbook wmcs.openstack.cloudvirt.vm_console [17:15:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Redirects/SAL [17:15:37] !log dcaro@urcuchillay redirects END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0) [17:15:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Redirects/SAL [17:19:56] FIRING: CloudVPSDesignateLeaks: Detected 23 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [17:20:01] !log dcaro@urcuchillay redirects START - Cookbook wmcs.openstack.cloudvirt.vm_console [17:20:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Redirects/SAL [17:24:29] Cloud-VPS (Debian Buster Deprecation), Release-Engineering-Team: Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9922959 (thcipriani) [17:27:14] Cloud-VPS (Debian Buster Deprecation), Infrastructure-Foundations, Release-Engineering-Team: Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9922964 (thcipriani) Hrm, those pkgbuilder hosts are used for the Jenkins debian glue jobs—testing debian package...
[17:29:00] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9923014 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudvirt2004-dev.codfw.wmnet with OS bookworm completed:... [17:37:15] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9923081 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt2004-dev.codfw.wmnet wi... [17:45:50] Cloud-VPS, Tool-spacemedia, Toolforge: DNS name resolution failure with cdn.esahubble.org from Cloud VPS & Toolforge - https://phabricator.wikimedia.org/T368439 (Don-vip) NEW [18:22:43] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9923286 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudvirt2004-dev.codfw.wmnet with O...
[18:29:41] RESOLVED: CloudVPSDesignateLeaks: Detected 26 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [18:41:19] !log andrew@cloudcumin1001 cloudvirt-canary START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on codfw1dev, with recreate True, for hosts list: ['cloudvirt2004-dev'] [18:41:43] !log andrew@cloudcumin1001 cloudvirt-canary END (PASS) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=0) on codfw1dev, with recreate True, for hosts list: ['cloudvirt2004-dev'] [19:32:00] Cloud-VPS (Debian Buster Deprecation), WMIT-Infrastructure: Cloud VPS "osmit" project Buster deprecation - https://phabricator.wikimedia.org/T367543#9923551 (valerio.bozzolan) [19:32:04] Cloud-VPS (Debian Buster Deprecation), WMIT-Infrastructure: Cloud VPS "osmit" project Buster deprecation - https://phabricator.wikimedia.org/T367543#9923548 (valerio.bozzolan) @LorenzoStucchi [20:14:41] Tool-containers: Document at containers.toolforge.org - https://phabricator.wikimedia.org/T368391#9923666 (LucasWerkmeister) This ought to work, I think: `lang=yaml apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: containers-root namespace: tool-containers labels: name: containers-ro...
[21:14:43] !log andrew@cloudcumin1001 cloudinfra-codfw1dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:14:44] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:22:59] !log andrew@cloudcumin1001 cloudinfra-codfw1dev END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [21:23:01] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:24:44] !log andrew@cloudcumin1001 cloudinfra-codfw1dev START - Cookbook wmcs.openstack.migrate_server_to_ovs for server enc-1 [21:24:44] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:24:46] !log andrew@cloudcumin1001 cloudinfra-codfw1dev END (FAIL) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=99) for server enc-1 [21:24:46] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:26:40] !log andrew@cloudcumin1001 cloudinfra-codfw1dev START - Cookbook wmcs.openstack.migrate_server_to_ovs for server enc-1 [21:26:40] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:26:45] !log andrew@cloudcumin1001 cloudinfra-codfw1dev END (FAIL) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=99) for server enc-1 [21:26:45] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:28:23] !log andrew@cloudcumin1001 cloudinfra-codfw1dev START - Cookbook wmcs.openstack.migrate_server_to_ovs for server enc-1 [21:28:24] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:29:30] !log andrew@cloudcumin1001 cloudinfra-codfw1dev END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server enc-1 [21:29:35] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:29:44] !log andrew@cloudcumin1001 cloudinfra-codfw1dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:29:44] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:31:41] !log andrew@cloudcumin1001 cloudinfra-codfw1dev END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) 
[21:31:41] andrew@cloudcumin1001: Unknown project "cloudinfra-codfw1dev" [21:31:49] !log andrew@cloudcumin1001 andrewtestproject START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:31:49] andrew@cloudcumin1001: Unknown project "andrewtestproject" [21:33:03] !log andrew@cloudcumin1001 andrewtestproject END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [21:33:03] andrew@cloudcumin1001: Unknown project "andrewtestproject" [21:33:29] !log andrew@cloudcumin1001 tools-codfw1dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:33:30] andrew@cloudcumin1001: Unknown project "tools-codfw1dev" [21:36:51] !log andrew@cloudcumin1001 tools-codfw1dev END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [21:36:51] andrew@cloudcumin1001: Unknown project "tools-codfw1dev" [21:37:40] !log andrew@cloudcumin1001 pawsdev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:37:40] andrew@cloudcumin1001: Unknown project "pawsdev" [21:38:51] Toolforge: `webservice shell` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) - https://phabricator.wikimedia.org/T368463 (bd808) NEW [21:39:03] Toolforge: `webservice shell` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) - https://phabricator.wikimedia.org/T368463#9923992 (bd808) [21:39:05] cloud-services-team, Toolforge (Toolforge iteration 11): toolforge: review pod templates for PSP replacement - https://phabricator.wikimedia.org/T362050#9923993 (bd808) [21:39:50] !log andrew@cloudcumin1001 pawsdev END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [21:39:51] andrew@cloudcumin1001: Unknown project "pawsdev" [21:40:20] !log andrew@cloudcumin1001 taavi-test-project START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:40:20] andrew@cloudcumin1001: Unknown project "taavi-test-project" [21:40:24] Toolforge: `webservice shell` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) -
https://phabricator.wikimedia.org/T368463#9923994 (bd808) >>! In https://bash.toolforge.org/quip/Qy03oX0B1jz_IcWu7ynf @bd808 said: > (snark injected into a discussion of strong typing in python) > are y... [21:40:26] Cloud-VPS (Quota-requests), Tool-spacemedia: Request quota increase for spacemedia project - https://phabricator.wikimedia.org/T368464 (Don-vip) NEW [21:41:33] Cloud-VPS (Quota-requests), Tool-spacemedia: Request quota increase for spacemedia project - https://phabricator.wikimedia.org/T368464#9924006 (bd808) [21:42:52] cloud-services-team, Cloud-VPS (Quota-requests), Tool-spacemedia: Request quota increase for spacemedia project - https://phabricator.wikimedia.org/T368464#9924008 (bd808) +1 [21:43:42] !log andrew@cloudcumin1001 taavi-test-project END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [21:43:42] andrew@cloudcumin1001: Unknown project "taavi-test-project" [21:44:06] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt2005-dev.codfw.wmnet' [21:45:48] Toolforge: `webservice shell` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) - https://phabricator.wikimedia.org/T368463#9924017 (bd808) We need to keep supporting Buster and Python 3.7 until we have a fix for {T360488} or {T360818}. [21:46:25] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=0) on host 'cloudvirt2005-dev.codfw.wmnet' [21:47:48] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9924022 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt2005-dev.codfw.wmnet wi... 
[21:48:39] Toolforge: `webservice` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) - https://phabricator.wikimedia.org/T368463#9924020 (bd808) [21:48:53] !log andrew@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:48:53] andrew@cloudcumin1001: Unknown project "proxy-codfw1dev" [21:54:05] !log andrew@cloudcumin1001 proxy-codfw1dev END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [21:54:06] andrew@cloudcumin1001: Unknown project "proxy-codfw1dev" [21:54:18] Toolforge: `webservice` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) - https://phabricator.wikimedia.org/T368463#9924029 (bd808) p:Triage→High A live hack is in place that makes `webservice` work under Python 3.7 again, but it will break if a newer toolforge-webservice is pushed... [21:54:27] !log andrew@cloudcumin1001 bastioninfra-codfw1dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:54:28] andrew@cloudcumin1001: Unknown project "bastioninfra-codfw1dev" [21:55:17] !log andrew@cloudcumin1001 bastioninfra-codfw1dev END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [21:55:17] andrew@cloudcumin1001: Unknown project "bastioninfra-codfw1dev" [21:56:10] !log andrew@cloudcumin1001 tf-infra-dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:56:10] andrew@cloudcumin1001: Unknown project "tf-infra-dev" [21:58:39] !log andrew@cloudcumin1001 tf-infra-dev END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [21:58:40] andrew@cloudcumin1001: Unknown project "tf-infra-dev" [21:58:44] !log andrew@cloudcumin1001 k8s-dev START - Cookbook wmcs.openstack.migrate_project_to_ovs [21:58:44] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:04:53] !log andrew@cloudcumin1001 k8s-dev END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [22:04:54] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:07:15] !log 
andrew@cloudcumin1001 k8s-dev START - Cookbook wmcs.openstack.migrate_dbinstance_to_ovs for server tbd [22:07:15] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:18:47] cloud-services-team, Data-Services: maintain-dbusers.service failing on cloudcontrol1005 - https://phabricator.wikimedia.org/T368316#9924061 (Andrew) Open→Resolved a:Andrew I've enabled puppet and restarted maintain-dbusers, everything seems to be working. Thanks all! [22:20:05] !log andrew@cloudcumin1001 k8s-dev END (ERROR) - Cookbook wmcs.openstack.migrate_dbinstance_to_ovs (exit_code=97) for server tbd [22:20:06] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:20:59] !log andrew@cloudcumin1001 testlabs START - Cookbook wmcs.openstack.migrate_project_to_ovs [22:22:21] !log andrew@cloudcumin1001 testlabs END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [22:24:24] (open) anticomposite: Replace Python 3.9 type aliases with 3.7-compatible aliases [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/46 (https://phabricator.wikimedia.org/T368463) [22:25:32] !log andrew@cloudcumin1001 testlabs START - Cookbook wmcs.openstack.migrate_project_to_ovs [22:25:36] !log andrew@cloudcumin1001 testlabs END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0) [22:25:40] !log andrew@cloudcumin1001 k8s-dev START - Cookbook wmcs.openstack.migrate_dbinstance_to_ovs for server tbd [22:25:41] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:30:25] Toolforge, Patch-For-Review: `webservice` (build 0.103.8) crashes on login-buster.toolforge.org (python 3.7) - https://phabricator.wikimedia.org/T368463#9924088 (AntiCompositeNumber) Looks like the 3.9 type aliases were introduced in [[https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-... 
[22:30:48] (update) bd808: Replace Python 3.9 type aliases with 3.7-compatible aliases [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/46 (https://phabricator.wikimedia.org/T368463) (owner: anticomposite) [22:31:31] !log andrew@cloudcumin1001 cloudvirt-canary START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on codfw1dev, with recreate True, for hosts list: ['cloudvirt2005-dev'] [22:31:55] !log andrew@cloudcumin1001 cloudvirt-canary END (PASS) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=0) on codfw1dev, with recreate True, for hosts list: ['cloudvirt2005-dev'] [22:33:49] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9924114 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudvirt2005-dev.codfw.wmnet with O... 
[22:35:38] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt2001-dev.codw.wmnet' [22:35:57] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=99) on host 'cloudvirt2001-dev.codw.wmnet' [22:36:16] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt2001-dev.codfw.wmnet' [22:37:35] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=0) on host 'cloudvirt2001-dev.codfw.wmnet' [22:39:52] !log andrew@cloudcumin1001 k8s-dev END (FAIL) - Cookbook wmcs.openstack.migrate_dbinstance_to_ovs (exit_code=99) for server tbd [22:39:52] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:40:23] !log andrew@cloudcumin1001 trove START - Cookbook wmcs.openstack.migrate_server_to_ovs for server superset-dev [22:42:35] !log andrew@cloudcumin1001 trove END (PASS) - Cookbook wmcs.openstack.migrate_server_to_ovs (exit_code=0) for server superset-dev [22:43:36] !log andrew@cloudcumin1001 k8s-dev START - Cookbook wmcs.openstack.migrate_dbinstance_to_ovs for server tbd [22:43:37] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:43:43] !log andrew@cloudcumin1001 k8s-dev END (FAIL) - Cookbook wmcs.openstack.migrate_dbinstance_to_ovs (exit_code=99) for server tbd [22:43:43] andrew@cloudcumin1001: Unknown project "k8s-dev" [22:44:32] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9924137 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt2006-dev.codfw.wmnet wi... 
[22:54:38] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt2002-dev.codfw.wmnet' [22:55:12] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=0) on host 'cloudvirt2002-dev.codfw.wmnet' [22:55:18] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt2003-dev.codfw.wmnet' [22:55:39] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=0) on host 'cloudvirt2003-dev.codfw.wmnet' [23:17:58] Tool-bub2: Documentation for BUB2 Tool - https://phabricator.wikimedia.org/T364082#9924237 (Pppery) [23:18:52] Tool-refill: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409#9924240 (Pppery) [23:19:02] Tool-refill: Take title from bare reference when no title found on page - https://phabricator.wikimedia.org/T352385#9924242 (Pppery) [23:20:54] Tool-bub2, Outreach-Programs-Projects, Patch-For-Review: Integrate Wikimedia Ecosystem within BUB2 tool - https://phabricator.wikimedia.org/T346386#9924260 (Pppery) [23:27:52] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#9924298 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudvirt2006-dev.codfw.wmnet with O... 
[23:28:30] Toolforge: Toolforge fourohfour similar name list should link directly to tools instead of toolsadmin - https://phabricator.wikimedia.org/T368475 (AntiCompositeNumber) NEW [23:35:58] !log andrew@cloudcumin1001 cloudvirt-canary START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on codfw1dev, with recreate True, for hosts list: ['cloudvirt2006-dev'] [23:36:22] !log andrew@cloudcumin1001 cloudvirt-canary END (PASS) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=0) on codfw1dev, with recreate True, for hosts list: ['cloudvirt2006-dev']