[05:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:22:17] cloud-services-team, Cloud-VPS, IPv6: IPv6 support in cloud-private - https://phabricator.wikimedia.org/T379283#10325460 (taavi) The allocations SGTM too, thanks! >>! In T379283#10316397, @cmooney wrote: > We may also eventually need to allocate some public IPv6 addressing, for similar use to [[ htt...
[08:07:22] cloud-services-team, wikitech.wikimedia.org, Data-Persistence: Decommission clouddb2002-dev.codfw.wmnet - https://phabricator.wikimedia.org/T369308#10325507 (taavi) Labtestwiki is gone (sans a few config cleanups) so I think this can be done now.
[08:58:48] (approved) aborrero: package: adopted the common setup [repos/cloud/toolforge/jobs-emailer] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/8 (owner: dcaro)
[09:01:29] Toolforge (Toolforge iteration 16): [jobs-emailer] http requests are blocked by the loops - https://phabricator.wikimedia.org/T379924#10325621 (aborrero)
[09:01:37] cloud-services-team, Toolforge (Toolforge iteration 16), Patch-For-Review: [jobs-api,jobs-emailer] Prometheus monitoring toolforge-jobs server side components - https://phabricator.wikimedia.org/T320284#10325622 (aborrero)
[09:07:23] cloud-services-team, Cloud-VPS, IPv6: Some WMCS clusters have inconsistent AAAA DNS records for the primary IPv6 of the hosts - https://phabricator.wikimedia.org/T312557#10325639 (taavi)
[09:07:27] cloud-services-team, wikitech.wikimedia.org, Data-Persistence: Decommission clouddb2002-dev.codfw.wmnet - https://phabricator.wikimedia.org/T369308#10325640 (taavi)
[09:10:22] Cloud-VPS (Project-requests), Data-Platform-SRE, Wikidata, Wikidata-Query-Service: Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10325663 (Seppl2013) What is the next step here? I do not know what it means that this ticket has been worked on.
[09:17:42] (open) aborrero: emailer: run webserver in a different thread [repos/cloud/toolforge/jobs-emailer] (adopt_common_practices) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/9 (https://phabricator.wikimedia.org/T379924)
[09:18:28] Toolforge (Toolforge iteration 16), Patch-For-Review: [jobs-emailer] http requests are blocked by the loops - https://phabricator.wikimedia.org/T379924#10325730 (aborrero) this is what I was referring to: https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/9
[09:25:11] Toolforge (Toolforge iteration 16), Patch-For-Review: [jobs-emailer] http requests are blocked by the loops - https://phabricator.wikimedia.org/T379924#10325759 (aborrero) >>! In T379924#10323232, @dcaro wrote: >>>! In T379924#10323182, @aborrero wrote: >> The easier solution is to run the webserver task...
[09:25:42] cloud-services-team, Cloud-VPS, Patch-For-Review: openstack: wmfkeystonehooks: project ids rather than names are being used in LDAP group creation - https://phabricator.wikimedia.org/T379030#10325763 (taavi)
[09:31:52] Cloud-VPS (Project-requests), Data-Platform-SRE, Wikidata, Wikidata-Query-Service: Request creation of wikiqlever VPS project - https://phabricator.wikimedia.org/T377655#10325768 (Physikerwelt) >>! In T377655#10325663, @Seppl2013 wrote: > What is the next step here? I do not know what it mean...
[09:35:55] cloud-services-team, Cloud-VPS, Patch-For-Review: openstack: wmfkeystonehooks: project ids rather than names are being used in LDAP group creation - https://phabricator.wikimedia.org/T379030#10325775 (aborrero) >>! In T379030#10322740, @Andrew wrote: > It's definitely the case that we enforce unique...
[10:47:11] (open) taavi: channels: add #wikipedia-ko [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/14
[10:47:16] (update) taavi: channels: add #wikipedia-ko [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/14
[10:51:05] (merge) taavi: channels: add #wikipedia-ko [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/14
[11:16:20] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: openstack: nova-fullstack: add support for IPv6 - https://phabricator.wikimedia.org/T379356#10326080 (aborrero) In progress→Resolved a:aborrero This is ready. IPv6 support has been activated in codfw1dev. Support for eqiad1 i...
[11:25:31] FIRING: ToolsToolsDBReplicationMissing: ToolsDB replication is not running on tools-db-4 (errno 0) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationMissing
[11:31:31] FIRING: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-4 is lagging behind the primary, the current lag is 74782 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[11:43:01] RESOLVED: ToolsToolsDBReplicationMissing: ToolsDB replication is not running on tools-db-4 (errno 0) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationMissing
[11:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:07:07] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-14 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[12:07:45] cloud-services-team, Horizon, IPv6, Patch-For-Review: horizon: enable IPv6 security group panels - https://phabricator.wikimedia.org/T377339#10326274 (aborrero) Open→In progress
[12:08:21] cloud-services-team, Horizon, IPv6, Patch-For-Review: horizon: enable IPv6 security group panels - https://phabricator.wikimedia.org/T377339#10326289 (aborrero) note this comment in the horizon source code: ` # When cidr is used ethertype is determined from IP version of cidr. # When sou...
[12:27:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-14 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[12:45:31] cloud-services-team, Horizon, IPv6, Patch-For-Review: horizon: enable IPv6 security group panels - https://phabricator.wikimedia.org/T377339#10326495 (aborrero) In progress→Resolved
[12:46:09] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10326497 (aborrero)
[12:48:43] cloud-services-team, Cloud-VPS, Patch-For-Review: neutron: clarify why DNS extension is not enabled - https://phabricator.wikimedia.org/T377740#10326499 (aborrero) Stalled→Resolved a:aborrero We went with {T378192}
[12:48:48] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: work out IPv6 and designate integration - https://phabricator.wikimedia.org/T374715#10326506 (aborrero) Stalled→Resolved a:aborrero this was done by means of {T378192}
[12:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:51:53] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: dns: integrate PTR support for 2a02:ec80:a100::/48 - https://phabricator.wikimedia.org/T376462#10326522 (aborrero) In progress→Resolved
[12:52:19] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10326524 (aborrero)
[12:53:57] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10326527 (aborrero) Open→Resolved a:aborrero I think we can consider IPv6 to be fully working on codfw1dev.
[12:56:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:56:54] cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol2006-dev:9100 - https://phabricator.wikimedia.org/T380048 (phaultfinder) NEW
[13:05:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:41:01] RESOLVED: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-4 is lagging behind the primary, the current lag is 3639 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[13:41:28] cloud-services-team, Cloud-VPS, IPv6: Cloud VPS: prepare documentation on VXLAN/IPV6 migration - https://phabricator.wikimedia.org/T380054 (aborrero) NEW
[13:46:10] cloud-services-team, Cloud-VPS, IPv6: Cloud VPS: prepare documentation on VXLAN/IPV6 migration - https://phabricator.wikimedia.org/T380054#10326769 (aborrero) p:Triage→Medium
[13:49:52] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.vps.create_instance_with_prefix with prefix 'tools-db' (T352206)
[13:49:56] T352206: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206
[13:50:17] !log fnegri@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.vps.create_instance_with_prefix (exit_code=99) with prefix 'tools-db' (T352206)
[13:57:31] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.openstack.quota_increase (T352206)
[13:57:35] T352206: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206
[13:57:39] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) (T352206)
[13:57:44] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.vps.create_instance_with_prefix with prefix 'tools-db' (T352206)
[14:03:33] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.create_instance_with_prefix (exit_code=0) with prefix 'tools-db' (T352206)
[14:03:38] T352206: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206
[14:03:56] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.vps.refresh_puppet_certs on tools-db-5.tools.eqiad1.wikimedia.cloud (T352206)
[14:05:13] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on tools-db-5.tools.eqiad1.wikimedia.cloud (T352206)
[14:11:48] RESOLVED: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[14:13:13] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 16), Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#10326886 (fnegri) Current situation: * tools-db-1: primary, running MariaDB 10.4 * tools-db-3: replica, running MariaDb 10.4 * tools-db-4: n...
[14:30:29] cloud-services-team, Toolforge, IPv6, Kubernetes: Support IPv6 in Toolforge Kubernetes - https://phabricator.wikimedia.org/T380060 (taavi) NEW
[14:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:30:01] cloud-services-team (FY2024/2025-Q1-Q2): [cloudcumin] After the upgrade to spicerack 8.15.2, the wmcs.openstack.cloudvirt.vm_console cookbook stopped working - https://phabricator.wikimedia.org/T379570#10327084 (fnegri) The same happens from cumin1002 to any prod host, I can't figure out what's preventing th...
[15:43:35] cloud-services-team (FY2024/2025-Q1-Q2): [cloudcumin] After the upgrade to spicerack 8.15.2, the wmcs.openstack.cloudvirt.vm_console cookbook stopped working - https://phabricator.wikimedia.org/T379570#10327158 (CDanis) If you look at `/etc/ssh/userkeys/root.d/cumin` on a prod host, it contains the `restric...
[15:43:40] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "commons-corruption-checker" project Buster deprecation - https://phabricator.wikimedia.org/T367525#10327148 (Andrew) Open→Resolved a:Andrew I have deleted remaining Buster VMs.
[15:44:27] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "wikicommunityhealth" project Buster deprecation - https://phabricator.wikimedia.org/T367560#10327154 (Andrew) Open→Resolved a:Andrew I have deleted remaining Buster VMs.
[15:44:36] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "etytree" project Buster deprecation - https://phabricator.wikimedia.org/T367529#10327151 (Andrew) Open→Resolved a:Andrew I have deleted remaining Buster VMs.
[15:45:44] Cloud-VPS (Debian Buster Deprecation), Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Upgrade centralnotice-staging to get off debian buster - https://phabricator.wikimedia.org/T360949#10327171 (Andrew) Open→Resolved a:Andrew I have deleted remaining Buster VMs.
[15:45:45] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "schematreerecommender" project Buster deprecation - https://phabricator.wikimedia.org/T367552#10327159 (Andrew) Open→Resolved a:Andrew I have deleted remaining Buster VMs.
[15:45:53] cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Cloud-vps Buster deprecation - https://phabricator.wikimedia.org/T331738#10327177 (Andrew) Sent the following email on Wednesday: > > On Friday I will delete the following VMs. All are running the long-deprecated Debian Buster OS and have b...
[15:46:20] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Migrate deployment-prep away from Debian Buster to Bullseye/Bookworm - https://phabricator.wikimedia.org/T327742#10327165 (Andrew) Open→Resolved a:Andrew I believe this to be done; there are no more Buster VMs in deployme...
[15:46:42] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure, Content-Transform-Team-WIP: Rebuild or delete deployment-docker-proton01 - https://phabricator.wikimedia.org/T369916#10327185 (Andrew) Open→Resolved
[15:48:33] cloud-services-team, Cloud-VPS (Debian Buster Deprecation): Cloud-vps Buster deprecation - https://phabricator.wikimedia.org/T331738#10327182 (Andrew) Open→Resolved a:Andrew I have now deleted the VMs mentioned in that last email.
[15:48:57] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Replace deployment-maps-master01 with a Bullseye or Bookworm instance - https://phabricator.wikimedia.org/T361381#10327198 (Andrew) Open→Resolved
[15:49:03] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Replace or remove deployment-echostore02.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T361383#10327200 (Andrew) Open→Resolved
[15:49:13] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Remove or replace poolcounter06.deployment-prep.eqiad1.wikimedia.cloud (Buster deprecation) - https://phabricator.wikimedia.org/T370458#10327203 (Andrew) Open→Resolved
[15:50:14] cloud-services-team (FY2024/2025-Q1-Q2): [cloudcumin] After the upgrade to spicerack 8.15.2, the wmcs.openstack.cloudvirt.vm_console cookbook stopped working - https://phabricator.wikimedia.org/T379570#10327213 (fnegri) > it contains the restrict option, which disallows PTY. Thanks! That explains it :) > I...
[15:50:28] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "mediawiki-vagrant" project Buster deprecation - https://phabricator.wikimedia.org/T367541#10327191 (Andrew) Open→Resolved a:Andrew deleted
[15:51:29] cloud-services-team (FY2024/2025-Q1-Q2): [wmcs-cookbooks] wmcs.openstack.cloudvirt.vm_console cookbook is not working from cloudcumin hosts - https://phabricator.wikimedia.org/T379570#10327221 (fnegri)
[15:53:18] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "auditlogging" project Buster deprecation - https://phabricator.wikimedia.org/T367522#10327216 (Andrew) There are now exactly two buster VMs remaining: this one, and tools-sgebastion-10.tools.eqiad1.wikimedia.cloud. @Southparkfan can you confirm that I can sa...
[16:20:41] FIRING: CloudVPSDesignateLeaks: Detected 40 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:27:00] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[16:27:04] cloud-services-team: NovafullstackSustainedFailures Novafullstack tests have been failing for more than 5hours in eqiad - https://phabricator.wikimedia.org/T380067 (phaultfinder) NEW
[16:27:57] cloud-services-team, Cloud-VPS: wmcs-openstack-eqiad-summary grafana dashboard has degraded - https://phabricator.wikimedia.org/T380069 (Andrew) NEW
[16:29:47] !log aborrero@cloudcumin2001 admin START - Cookbook wmcs.openstack.restart_openstack
[16:30:23] !log aborrero@cloudcumin2001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[16:33:51] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.refresh_puppet_certs on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[16:33:51] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[16:33:52] !log taavi@cloudcumin1001 proxy-codfw1dev END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[16:33:53] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[16:35:44] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.refresh_puppet_certs on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[16:35:45] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[16:35:55] !log taavi@cloudcumin1001 proxy-codfw1dev END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[16:35:56] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[16:49:04] Tool-openstack-browser: openstack-browser and Horizon show different projects membership - https://phabricator.wikimedia.org/T377710#10327554 (taavi) That account [[ https://ldap.toolforge.org/user/twentyafterfour | is disabled ]]. IIRC Keystone filters out disabled users somewhere which is most likely not a...
[17:02:20] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.refresh_puppet_certs on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[17:02:22] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[17:02:22] !log taavi@cloudcumin1001 proxy-codfw1dev END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[17:02:22] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[17:03:15] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.refresh_puppet_certs on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[17:03:15] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[17:03:25] !log taavi@cloudcumin1001 proxy-codfw1dev END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[17:03:25] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[17:03:49] (open) aborrero: codfw1dev: network: introduce VXLAN/IPv4-only subnet [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/127 (https://phabricator.wikimedia.org/T377467)
[17:05:29] (merge) aborrero: codfw1dev: network: introduce VXLAN/IPv4-only subnet [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/127 (https://phabricator.wikimedia.org/T377467)
[17:05:48] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[17:06:04] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan+apply for main branch
[17:06:20] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[17:06:46] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan+apply for main branch
[17:07:13] Tool-wikiqanda, Future-Audiences: Add testing capabilities to Discord bot - https://phabricator.wikimedia.org/T379029#10327678 (etz) a:etz
[17:08:36] (open) aborrero: codfw1dev: network: cloud-flat-codfw1dev-ipv4only: use segmentation id 9 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/128
[17:09:16] (merge) aborrero: codfw1dev: network: cloud-flat-codfw1dev-ipv4only: use segmentation id 9 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/128
[17:09:45] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[17:10:26] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan+apply for main branch
[17:13:45] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[17:14:13] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan+apply for main branch
[17:16:34] (open) aborrero: codfw1dev: subnet: fix network reference [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/129
[17:17:12] (merge) aborrero: codfw1dev: subnet: fix network reference [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/129
[17:17:17] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[17:18:21] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[17:18:27] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[17:18:49] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[17:29:11] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175#10327736 (taavi)
[17:31:06] cloud-services-team, Cloud-VPS, IPv6: horizon: enable the UI to select networks on VM creation panel - https://phabricator.wikimedia.org/T380081 (aborrero) NEW
[17:31:14] cloud-services-team, Cloud-VPS, IPv6: horizon: enable the UI to select networks on VM creation panel - https://phabricator.wikimedia.org/T380081#10327759 (aborrero) p:Triage→Medium
[17:32:53] cloud-services-team, Cloud-VPS, IPv6: horizon: enable the UI to select networks on VM creation panel - https://phabricator.wikimedia.org/T380081#10327762 (aborrero)
[17:33:02] Tool-openstack-browser: openstack-browser: Show information about networks and subnets - https://phabricator.wikimedia.org/T380082 (taavi) NEW
[17:46:16] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.remove_instance for instance proxy-04
[17:46:17] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[17:46:45] !log taavi@cloudcumin1001 proxy-codfw1dev END (PASS) - Cookbook wmcs.vps.remove_instance (exit_code=0) for instance proxy-04
[17:46:45] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[18:18:19] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.refresh_puppet_certs on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[18:18:20] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[18:24:59] !log taavi@cloudcumin1001 proxy-codfw1dev END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on proxy-04.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[18:25:00] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[18:26:10] (PS1) Majavah: templates: Fix some narrow forms [openstack/horizon/wmf-puppet-dashboard] - https://gerrit.wikimedia.org/r/1091792
[18:26:10] (PS1) Majavah: static: Remove unused style [openstack/horizon/wmf-puppet-dashboard] - https://gerrit.wikimedia.org/r/1091793
[18:26:10] (PS1) Majavah: Make hiera and class text inputs monospace [openstack/horizon/wmf-puppet-dashboard] - https://gerrit.wikimedia.org/r/1091794
[18:27:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[18:29:00] (open) bd808: cloud: Restore Taavi's founder bit [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/15
[18:44:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:49:06] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:50:55] (approved) jjmc89: cloud: Restore Taavi's founder bit [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/15 (owner: bd808)
[18:52:24] (approved) taavi: cloud: Restore Taavi's founder bit [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/15 (owner: bd808)
[18:52:53] (merge) jjmc89: cloud: Restore Taavi's founder bit [toolforge-repos/ircservserv-config] - https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/15 (owner: bd808)
[18:54:38] !log taavi@cloudcumin1001 proxy-codfw1dev START - Cookbook wmcs.vps.refresh_puppet_certs on proxy-05.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[18:54:39] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[18:57:32] !log taavi@cloudcumin1001 proxy-codfw1dev END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on proxy-05.proxy-codfw1dev.codfw1dev.wikimedia.cloud
[18:57:32] taavi@cloudcumin1001: Unknown project "proxy-codfw1dev"
[19:07:43] (open) taavi: proxy-codfw1dev: Permit IPv6 web proxy traffic [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/130 (https://phabricator.wikimedia.org/T379175)
[19:09:14] (merge) taavi: proxy-codfw1dev: Permit IPv6 web proxy traffic [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/130 (https://phabricator.wikimedia.org/T379175)
[19:09:24] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[19:09:53] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[19:23:15] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "auditlogging" project Buster deprecation - https://phabricator.wikimedia.org/T367522#10328077 (Southparkfan) >>! In T367522#10327216, @Andrew wrote: > There are now exactly two buster VMs remaining: this one, and tools-sgebastion-10.tools.eqiad1.wikimedia.clo...
[19:30:11] cloud-services-team, Cloud-VPS: wmcs-openstack-eqiad-summary grafana dashboard has degraded - https://phabricator.wikimedia.org/T380069#10328089 (taavi)
[19:38:20] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175#10328117 (taavi) a:taavi
[19:47:15] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS, Patch-For-Review: [wmcs-cookbooks] wmcs.openstack.cloudvirt.vm_console cookbook is not working from cloudcumin hosts - https://phabricator.wikimedia.org/T379570#10328156 (taavi)
[19:47:49] cloud-services-team (FY2024/2025-Q1-Q2), SRE-Access-Requests, Patch-For-Review: Add permissions for Komla to run WMCS cookbooks - https://phabricator.wikimedia.org/T379159#10328159 (taavi)
[19:49:57] cloud-services-team (FY2024/2025-Q1-Q2): Drain C8 rack - https://phabricator.wikimedia.org/T374043#10328163 (taavi) Anything left to do here?
[19:56:45] cloud-services-team, Infrastructure-Foundations, SRE-tools, IPv6: Some WMCS clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271139#10328185 (taavi) >>! In T271139#10151973, @Volans wrote: > I guess that the clouddb are expected and they **all** don't have the AAAA rec...
[20:04:38] 06cloud-services-team, 10Cloud-VPS: Create mechanism to allow the use of vanity domains by projects behind the Cloud VPS shared HTTP proxy - https://phabricator.wikimedia.org/T342398#10328207 (10taavi) 05In progress→03Resolved [20:14:11] (03CR) 10Majavah: [C:03+2] templates: Fix some narrow forms [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091792 (owner: 10Majavah) [20:14:19] (03CR) 10Majavah: [C:03+2] static: Remove unused style [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091793 (owner: 10Majavah) [20:20:41] FIRING: CloudVPSDesignateLeaks: Detected 40 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [20:22:29] 06cloud-services-team, 10Toolforge: Install mariadb-dump on Toolforge bastions - https://phabricator.wikimedia.org/T378882#10328276 (10taavi) [20:27:00] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures [20:36:37] (03Merged) 10jenkins-bot: templates: Fix some narrow forms [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091792 (owner: 10Majavah) [20:36:37] (03Merged) 10jenkins-bot: static: Remove unused style [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091793 (owner: 10Majavah) [20:37:25] (03PS2) 10Majavah: Make hiera and class text inputs monospace [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091794 [20:37:41] (03CR) 10Majavah: [C:03+2] Make hiera and class text inputs monospace 
[openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091794 (owner: 10Majavah) [20:38:13] (03Merged) 10jenkins-bot: Make hiera and class text inputs monospace [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091794 (owner: 10Majavah) [20:44:57] 06cloud-services-team, 10Cloud-VPS: Do not create DNS zones for projects outside default domain - https://phabricator.wikimedia.org/T380095 (10taavi) 03NEW [20:45:46] 06cloud-services-team, 10Cloud-VPS, 07Documentation, 07IPv6: Cloud VPS: prepare documentation on VXLAN/IPV6 migration - https://phabricator.wikimedia.org/T380054#10328336 (10taavi) [21:05:22] (03PS1) 10Majavah: Fix style loading [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091817 [21:06:15] (03CR) 10Majavah: [C:03+2] Fix style loading [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091817 (owner: 10Majavah) [21:06:39] (03Merged) 10jenkins-bot: Fix style loading [openstack/horizon/wmf-puppet-dashboard] - 10https://gerrit.wikimedia.org/r/1091817 (owner: 10Majavah) [21:20:19] 10Tool-wikiqanda, 06Future-Audiences: Link to source article(s) for all bot responses - https://phabricator.wikimedia.org/T380098 (10Maryana) 03NEW [21:21:23] 10Tool-wikiqanda, 06Future-Audiences: Link to source article(s) for all bot responses - https://phabricator.wikimedia.org/T380098#10328406 (10Maryana) [21:21:24] 10Tool-wikiqanda, 06Future-Audiences, 07Epic: [Epic] Discord Q&A bot Milestone 2 - https://phabricator.wikimedia.org/T378121#10328407 (10Maryana) [21:24:05] 06cloud-services-team, 10Cloud-VPS: Audit WMCS compute capacity - https://phabricator.wikimedia.org/T380099 (10Andrew) 03NEW [21:24:12] 06cloud-services-team, 10Cloud-VPS: wmcs-openstack-eqiad-summary grafana dashboard has degraded - https://phabricator.wikimedia.org/T380069#10328419 (10Andrew) [21:24:14] 06cloud-services-team, 10Cloud-VPS: Audit WMCS compute capacity - 
https://phabricator.wikimedia.org/T380099#10328420 (10Andrew) [22:27:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [23:12:18] (03PS1) 10SomeRandomDeveloper: Replace divs with semantic elements [labs/tools/guc] - 10https://gerrit.wikimedia.org/r/1091830 (https://phabricator.wikimedia.org/T227631) [23:12:36] (03CR) 10CI reject: [V:04-1] Replace divs with semantic elements [labs/tools/guc] - 10https://gerrit.wikimedia.org/r/1091830 (https://phabricator.wikimedia.org/T227631) (owner: 10SomeRandomDeveloper) [23:12:59] 10Tool-Global-user-contributions, 13Patch-For-Review, 07patch-welcome: Replace
by
,