[02:36:18] FIRING: [2x] KernelErrors: Server cloudcephosd1041 logged kernel errors - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/KernelErrors - https://grafana.wikimedia.org/d/b013af4c-d405-4d9f-85d4-985abb3dec0c/wmcs-kernel-errors?orgId=1&var-instance=cloudcephosd1041 - https://alerts.wikimedia.org/?q=alertname%3DKernelErrors
[02:36:26] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282 (phaultfinder) NEW
[06:36:33] FIRING: [2x] KernelErrors: Server cloudcephosd1041 logged kernel errors - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/KernelErrors - https://grafana.wikimedia.org/d/b013af4c-d405-4d9f-85d4-985abb3dec0c/wmcs-kernel-errors?orgId=1&var-instance=cloudcephosd1041 - https://alerts.wikimedia.org/?q=alertname%3DKernelErrors
[07:18:08] (merge) taavi: tools: dns: Drop temporary migration names [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/75
[07:21:14] (open) taavi: tools: Provision Trixie-based Toolforge bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/76 (https://phabricator.wikimedia.org/T392510)
[07:21:16] (update) taavi: tools: Provision Trixie-based Toolforge bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/76 (https://phabricator.wikimedia.org/T392510)
[07:28:05] (update) taavi: tools: Provision Trixie-based Toolforge bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/76 (https://phabricator.wikimedia.org/T392510)
[07:42:00] (approved) filippo: tools: Provision Trixie-based Toolforge bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/76 (https://phabricator.wikimedia.org/T392510) (owner: taavi)
[07:44:40] (merge) taavi: tools: Provision Trixie-based Toolforge bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/76 (https://phabricator.wikimedia.org/T392510)
[07:52:47] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282#11171043 (Volans) I've found this in kern.log/dmesg but nothing in racadm logs (both getsel and lclog): ` Sep 11 02:33:32 cloudcephosd1041 kernel: [125071.786152] {1}[Hardware Error]: Ha...
[07:52:50] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282#11171044 (Volans) p:Triage→Medium
[07:53:34] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.openstack.quota_increase
[07:53:41] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0)
[07:56:03] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.vps.refresh_puppet_certs on tools-bastion-15.tools.eqiad1.wikimedia.cloud
[07:59:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on tools-bastion-15.tools.eqiad1.wikimedia.cloud
[07:59:26] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.vps.refresh_puppet_certs on tools-bastion-14.tools.eqiad1.wikimedia.cloud
[08:02:38] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on tools-bastion-14.tools.eqiad1.wikimedia.cloud
[08:12:05] (open) taavi: tools: dns: Migrate dev.toolforge.org to new Trixie bastion [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/77 (https://phabricator.wikimedia.org/T392510)
[08:12:10] (update) taavi: tools: dns: Migrate dev.toolforge.org to new Trixie bastion [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/77 (https://phabricator.wikimedia.org/T392510)
[08:13:39] (update) taavi: tools: dns: Migrate dev.toolforge.org to new Trixie bastion [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/77 (https://phabricator.wikimedia.org/T392510)
[08:13:40] (update) taavi: tools: dns: Migration login.toolforge.org to new Trixie bastion [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/78 (https://phabricator.wikimedia.org/T392510)
[08:13:40] (update) taavi: tools: Drop floating IPs for Bookworm bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/79 (https://phabricator.wikimedia.org/T392510)
[08:13:40] (open) taavi: tools: dns: Migration login.toolforge.org to new Trixie bastion [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/78 (https://phabricator.wikimedia.org/T392510)
[08:13:42] (open) taavi: tools: Drop floating IPs for Bookworm bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/79 (https://phabricator.wikimedia.org/T392510)
[08:13:45] (update) taavi: tools: dns: Migration login.toolforge.org to new Trixie bastion [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/78 (https://phabricator.wikimedia.org/T392510)
[08:13:51] (update) taavi: tools: Drop floating IPs for Bookworm bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/79 (https://phabricator.wikimedia.org/T392510)
[08:16:36] (update) taavi: tools: Drop floating IPs for Bookworm bastions [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/79 (https://phabricator.wikimedia.org/T392510)
[08:19:15] (PS1) Majavah: inventory: Add new tools bastions [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1187367 (https://phabricator.wikimedia.org/T392510)
[08:23:19] cloud-services-team, Striker: unlink Minato826 developer account from @Anoop phabricator account - https://phabricator.wikimedia.org/T404239#11171128 (Anoop) @bd808 connecting Phabricator account to toolforge account worked now, Thankyou
[08:36:20] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282#11171156 (fnegri) These are happening repeatedly on the same host: {T400222}. It was also on DIMM B1 so I would probably replace it to be on the safe side. Or maybe we can wait for a thi...
[08:39:44] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282#11171158 (Volans) The problem is that there is no evidence in hardware logs and I doubt we'll get any replacement from Dell without them.
[08:44:27] cloud-services-team, Toolforge: Improve detection of failing ssh to toolforge bastions - https://phabricator.wikimedia.org/T404054#11171178 (fgiunchedi)
[09:04:24] cloud-services-team: Remove KernelErrors alerts - https://phabricator.wikimedia.org/T404300 (fnegri) NEW
[09:06:53] cloud-services-team: Remove KernelErrors alerts - https://phabricator.wikimedia.org/T404300#11171264 (Volans) +1 for me
[09:09:44] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282#11171276 (fnegri) Open→Resolved a:fnegri Fair enough, resolving.
[09:20:57] cloud-services-team: Remove KernelErrors alerts - https://phabricator.wikimedia.org/T404300#11171321 (fgiunchedi) +1 too
[09:24:12] cloud-services-team, Cloud-VPS: Support proxy backends using IPv6 - https://phabricator.wikimedia.org/T404302 (taavi) NEW
[09:25:34] cloud-services-team, Cloud-VPS: Support proxy backends using IPv6 - https://phabricator.wikimedia.org/T404302#11171337 (taavi)
[09:36:59] (open) taavi: web_proxy: Use IPv6 in example backend [repos/cloud/cloud-vps/tofu-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-cloudvps/-/merge_requests/13 (https://phabricator.wikimedia.org/T404302)
[09:37:03] (update) taavi: web_proxy: Use IPv6 in example backend [repos/cloud/cloud-vps/tofu-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-cloudvps/-/merge_requests/13 (https://phabricator.wikimedia.org/T404302)
[09:39:29] cloud-services-team, Cloud-VPS, Patch-For-Review: Support proxy backends using IPv6 - https://phabricator.wikimedia.org/T404302#11171405 (taavi)
[09:39:44] (update) taavi: web_proxy: Use IPv6 in example backend [repos/cloud/cloud-vps/tofu-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-cloudvps/-/merge_requests/13 (https://phabricator.wikimedia.org/T404302)
[09:43:16] cloud-services-team, Toolforge: [tools-static,infra] NFS issues should not bring tools-static down - https://phabricator.wikimedia.org/T397634#11171423 (taavi) a:taavi→None Unlicking, as the problem seems to have gone away in the meantime
[10:23:44] cloud-services-team, Cloud-VPS: wmf-auto-restart can get wedged on nfs4 mounts even when the filesystem is excluded - https://phabricator.wikimedia.org/T404322 (fgiunchedi) NEW
[10:36:33] FIRING: [2x] KernelErrors: Server cloudcephosd1041 logged kernel errors - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/KernelErrors - https://grafana.wikimedia.org/d/b013af4c-d405-4d9f-85d4-985abb3dec0c/wmcs-kernel-errors?orgId=1&var-instance=cloudcephosd1041 - https://alerts.wikimedia.org/?q=alertname%3DKernelErrors
[10:38:14] (CR) Majavah: [C:+2] inventory: Add new tools bastions [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1187367 (https://phabricator.wikimedia.org/T392510) (owner: Majavah)
[10:41:29] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404325 (phaultfinder) NEW
[10:42:42] (Merged) jenkins-bot: inventory: Add new tools bastions [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1187367 (https://phabricator.wikimedia.org/T392510) (owner: Majavah)
[11:00:19] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404325#11171826 (fnegri) →Duplicate dup:T404282
[11:00:23] cloud-services-team: KernelErrors Server cloudcephosd1041 logged kernel errors - https://phabricator.wikimedia.org/T404282#11171828 (fnegri)
[11:05:19] (PS1) Majavah: views: Don't crash when encountering a proxy using IPv6 backends [openstack/horizon/wmf-proxy-dashboard] - https://gerrit.wikimedia.org/r/1187393 (https://phabricator.wikimedia.org/T404302)
[11:05:21] (PS1) Majavah: views: Support constructing URLs with v6 addresses [openstack/horizon/wmf-proxy-dashboard] - https://gerrit.wikimedia.org/r/1187394 (https://phabricator.wikimedia.org/T404302)
[11:05:45] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: Support proxy backends using IPv6 - https://phabricator.wikimedia.org/T404302#11171853 (taavi)
[11:42:56] FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:47:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-53 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[11:47:56] RESOLVED: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:12:44] cloud-services-team, Data-Services, Data-Persistence, Privacy Engineering, and 7 others: Title of suppressed recentchanges entries can be viewed through the wiki replicas - https://phabricator.wikimedia.org/T402283#11172124 (Gehel)
[12:17:45] cloud-services-team, Data-Services, Data-Engineering, Data-Engineering-Radar, and 3 others: Create wiki replicas views for globaljsonlinks tables - https://phabricator.wikimedia.org/T387419#11172207 (Gehel)
[12:26:58] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/44
[12:26:59] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/15
[13:36:25] !log filippo@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-53
[13:42:16] !log filippo@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-53
[13:47:24] cloud-services-team: Wikimedia server is sending millions of invalid requests to our servers - https://phabricator.wikimedia.org/T404347 (PatEhlert) NEW
[13:49:12] cloud-services-team: Wikimedia server is sending millions of invalid requests to our servers - https://phabricator.wikimedia.org/T404347#11172709 (PatEhlert)
[13:49:24] cloud-services-team: Wikimedia server is sending millions of invalid requests to our servers - https://phabricator.wikimedia.org/T404347#11172713 (PatEhlert)
[13:50:09] cloud-services-team: Wikimedia server is sending millions of invalid requests to our servers - https://phabricator.wikimedia.org/T404347#11172714 (PatEhlert)
[13:57:17] cloud-services-team: WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11172763 (Aklapper)
[14:03:09] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11172807 (fnegri) p:Triage→High
[14:07:37] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11172829 (fnegri) Thanks for reporting this. It's not easy to find where the request originates inside WMCS, we're investigating. Could you share more details...
[14:19:57] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11172967 (fnegri) Additional questions: * do the requests come in bulks? * can you share timestamp and source port for one request?
[14:20:28] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11172974 (fnegri) a:fnegri
[14:20:42] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11172979 (fnegri) Open→In progress
[14:21:42] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-53
[14:21:53] !log andrew@cloudcumin1001 tools END (ERROR) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=97) for tools-k8s-worker-nfs-53
[14:22:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-53 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[14:22:33] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-53 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[14:26:03] (PS1) Brouberol: Add a dummy secret file containing the wikiadmin password [labs/private] - https://gerrit.wikimedia.org/r/1187463
[14:27:33] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-53 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[14:30:09] (PS2) Brouberol: Add a dummy secret file containing the wikiadmin password [labs/private] - https://gerrit.wikimedia.org/r/1187463
[14:31:02] Cloud Services Proposals, cloud-services-team, Toolforge: DRAFT Decision request - Improving lima-kilo developer experience - https://phabricator.wikimedia.org/T403051#11173083 (fnegri) I like option 4, I think it has an additional Pro: * if we set up a build process on CI, we can run it once a day a...
[14:36:26] !log filippo@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-46
[14:42:37] !log filippo@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-46
[14:47:42] Cloud-VPS (Quota-requests), Release-Engineering-Team (Radar): Additional floating IPs for gitlab-cloud-runner testing in testlabs project - https://phabricator.wikimedia.org/T404150#11173186 (fnegri) I asked @Andrew about this, and my understanding is that floating IPs are not required to create Octavia...
[14:49:42] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11173190 (SomeRandomDeveloper) It might be worth looking at https://github.com/search?q=repo%3Amultichill%2Ftoollabs+europeana.eu%2Fapi%2Fv2%2Fsearch&type=code,...
[14:53:04] cloud-services-team, Cloud-VPS: Add accounts(-dev).wmcloud.org to XFF allowlist - https://phabricator.wikimedia.org/T404172#11173211 (fnegri) Open→Resolved a:fnegri Done: https://gerrit.wikimedia.org/g/cloud/instance-puppet/+/master/project-proxy/_.yaml
[14:58:51] cloud-services-team: NodeDown Node cloudcephosd1052 has been down for long. - https://phabricator.wikimedia.org/T403821#11173233 (fnegri) Open→Resolved a:fnegri This host was being set up.
[14:58:57] cloud-services-team: KernelErrors Server cloudcephosd1052 logged kernel errors - https://phabricator.wikimedia.org/T403842#11173237 (fnegri) Open→Resolved a:fnegri This host was being set up.
[15:01:00] Cloud-VPS (Quota-requests), Release-Engineering-Team (Radar): Additional floating IPs for gitlab-cloud-runner testing in testlabs project - https://phabricator.wikimedia.org/T404150#11173248 (bd808) https://docs.openstack.org/magnum/ocata/dev/kubernetes-load-balancer.html > To publish a service endpoint...
[15:12:27] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11173299 (Samoasambia) >>! In T404347#11173190, @SomeRandomDeveloper wrote: > It might be worth looking at https://github.com/search?q=repo%3Amultichill%2Ftooll...
[15:22:15] cloud-services-team (FY2025/26-Q1): WMCS is sending millions of invalid requests to Europeana.eu servers - https://phabricator.wikimedia.org/T404347#11173361 (SomeRandomDeveloper) Specifically https://github.com/multichill/toollabs/blob/37943cad62cefd7dc489f6b56c70c96d1b77047a/bot/wikidata/teylers_import.py#...
[15:56:33] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.reactivate
[15:56:34] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.reactivate (exit_code=99)
[15:58:01] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.reactivate
[16:01:02] cloud-services-team, Cloud-VPS: Log DNS queries from Cloud VPS clients - https://phabricator.wikimedia.org/T404373 (taavi) NEW
[16:01:35] cloud-services-team, Cloud-VPS: Log DNS queries from Cloud VPS clients - https://phabricator.wikimedia.org/T404373#11173566 (fnegri) p:Triage→Medium
[16:07:27] andrew@cloudcumin1001 reactivate (PID 2127832) is awaiting input
[16:21:33] cloud-services-team, Cloud-VPS: Log DNS queries from Cloud VPS clients - https://phabricator.wikimedia.org/T404373#11173654 (Volans) Ideally sampled logs would be good enough, depending how complex is the setup to sample them. If there are no easy options for a real sampling we could also consider altern...
[16:35:32] Cloud-VPS (Quota-requests), Release-Engineering-Team (Radar): Additional floating IPs for gitlab-cloud-runner testing in testlabs project - https://phabricator.wikimedia.org/T404150#11173748 (dduvall) >>! In T404150#11173186, @fnegri wrote: > I asked @Andrew about this, and my understanding is that float...
[16:41:49] (approved) bd808: web_proxy: Use IPv6 in example backend [repos/cloud/cloud-vps/tofu-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-cloudvps/-/merge_requests/13 (https://phabricator.wikimedia.org/T404302) (owner: taavi)
[16:42:19] (merge) taavi: web_proxy: Use IPv6 in example backend [repos/cloud/cloud-vps/tofu-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-cloudvps/-/merge_requests/13 (https://phabricator.wikimedia.org/T404302)
[17:14:24] cloud-services-team, Toolforge: Request to recreate replica.my.cnf for user wmdesiko - https://phabricator.wikimedia.org/T404175#11173874 (fnegri) a:fnegri
[17:48:30] cloud-services-team, Cloud-VPS: Newly-added member of wikitextexp is not in project-bastion LDAP group, but is in bastion project - https://phabricator.wikimedia.org/T404382 (Urbanecm_WMF) NEW
[17:49:51] cloud-services-team, Cloud-VPS: Newly-added member of wikitextexp is not in project-bastion LDAP group, but is in bastion project - https://phabricator.wikimedia.org/T404382#11173934 (Urbanecm_WMF) For the record, this is the second time I'm seeing this happening ({T403052} is for an earlier addition to...
[17:59:12] cloud-services-team, Cloud-VPS: Newly-added member of wikitextexp is not in project-bastion LDAP group, but is in bastion project - https://phabricator.wikimedia.org/T404382#11173946 (Urbanecm_WMF) This seems to be easily reproducible. I just added my testing account (which never was in any project) to a...
[18:04:22] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.reactivate (exit_code=0)
[18:14:15] Cloud-VPS (Quota-requests), Release-Engineering-Team (Radar): Additional floating IPs for gitlab-cloud-runner testing in testlabs project - https://phabricator.wikimedia.org/T404150#11174001 (Andrew) +1 approved. My new capi-helm driver in codfw1dev has the floating IP disabled (see diff on https://wiki...
[18:31:38] (update) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/15 (owner: l10n-bot)
[18:32:57] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/15 (owner: l10n-bot)
[18:33:35] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/15 (owner: l10n-bot)
[18:59:48] Cloud-VPS (Project-requests): Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386 (dduvall) NEW
[19:00:57] Cloud-VPS (Project-requests): Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386#11174124 (taavi) +1
[19:03:08] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.reactivate
[19:07:17] Cloud-VPS (Project-requests): Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386#11174143 (dduvall)
[19:12:43] andrew@cloudcumin1001 reactivate (PID 2144244) is awaiting input
[19:12:45] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1
[19:12:47] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:13:10] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.reactivate (exit_code=0)
[19:13:18] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1
[19:13:18] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:13:52] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1
[19:13:52] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:14:29] (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/266
[19:15:58] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1
[19:15:58] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:16:27] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1
[19:16:27] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:16:59] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1
[19:16:59] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:18:12] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for main branch
[19:19:37] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for main branch
[19:21:55] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for service: project,designate
[19:22:58] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment codfw1dev for service: project,designate
[19:23:11] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for service: project,designate
[19:24:14] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment codfw1dev for service: project,designate
[19:24:43] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for service: project,designate
[19:25:29] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1 (T404386)
[19:25:29] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:25:30] T404386: Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386
[19:26:02] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1 (T404386)
[19:26:02] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:26:26] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment codfw1dev for service: project,designate
[19:26:33] (close) andrew: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/266 (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[19:26:37] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1 (T404386)
[19:26:38] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:27:17] (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/267 (https://phabricator.wikimedia.org/T404386)
[19:30:47] andrew@cloudcumin1001 create_project (PID 2147068) is awaiting input
[19:32:06] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1 (T404386)
[19:32:07] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:32:08] T404386: Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386
[19:33:03] (close) andrew: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/267 (https://phabricator.wikimedia.org/T404386) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[19:33:04] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1 (T404386)
[19:33:05] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:33:44] (update) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/268 (https://phabricator.wikimedia.org/T404386)
[19:33:51] (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/268 (https://phabricator.wikimedia.org/T404386)
[19:34:50] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1 (T404386)
[19:34:50] andrew@cloudcumin1001: Unknown project "gitlab-runners-staging"
[19:37:16] (CR) Krinkle: [C:+2] Replace rc_type with rc_source [labs/tools/guc] - https://gerrit.wikimedia.org/r/1184777 (https://phabricator.wikimedia.org/T403710) (owner: MarcoAurelio)
[19:37:47] (Merged) jenkins-bot: Replace rc_type with rc_source [labs/tools/guc] - https://gerrit.wikimedia.org/r/1184777 (https://phabricator.wikimedia.org/T403710) (owner: MarcoAurelio)
[19:38:57] (close) andrew: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/268 (https://phabricator.wikimedia.org/T404386) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[19:39:02] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.vps.create_project for project gitlab-runners-staging in eqiad1 (T404386)
[19:39:05] T404386: Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386
[19:39:41] (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/269 (https://phabricator.wikimedia.org/T404386)
[19:40:28] Tool-Global-user-contributions, Patch-For-Review: Replace rc_type with rc_source - https://phabricator.wikimedia.org/T403710#11174234 (Krinkle) Open→Resolved a:MarcoAurelio Thank you for the quick patch! Reviewed, merged, and deployed. > https://guc.toolforge.org/?src=rc&by=date&user=Mar...
[19:41:00] (03merge) 10andrew: projects: added project gitlab-runners-staging [repos/cloud/cloud-vps/tofu-infra] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/269 (https://phabricator.wikimedia.org/T404386) (owner: 10group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49) [19:41:36] !log andrew@cloudcumin1001 gitlab-runners-staging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project gitlab-runners-staging in eqiad1 (T404386) [19:42:40] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch [19:43:16] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch [19:46:08] !log andrew@cloudcumin1001 gitlab-runners-staging START - Cookbook wmcs.openstack.quota_increase [19:46:15] !log andrew@cloudcumin1001 gitlab-runners-staging END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) [19:52:35] 10Cloud-VPS (Project-requests), 13Patch-For-Review: Request creation of gitlab-runners-staging VPS project - https://phabricator.wikimedia.org/T404386#11174259 (10Andrew) 05Open→03Resolved a:03Andrew I made a whole lot of mistakes applying the cookbook but I think this project is ready now. 
[19:56:15] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.reactivate [19:57:37] 10Cloud-VPS (Quota-requests), 10Release-Engineering-Team (Radar): Additional floating IPs for gitlab-cloud-runner testing in testlabs project - https://phabricator.wikimedia.org/T404150#11174288 (10Andrew) 05Open→03Declined closed in favor of T404386 [20:03:05] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.reactivate (exit_code=0) [20:04:40] 10cloud-services-team (FY2025/26-Q1), 10Cloud-VPS, 06SRE-OnFire, 10Sustainability (Incident Followup): [ceph,codfw1dev] upgrade the hosts from pacific->quincy - https://phabricator.wikimedia.org/T400334#11174299 (10Andrew) ...and now it's 100% bookworm/reef [20:42:46] 10cloud-services-team (FY2025/26-Q1), 10Toolforge (Toolforge iteration 24), 07Epic: [KR] WE6.3 Introduce a sustainability scoring system for the Toolforge platform - https://phabricator.wikimedia.org/T368600#11174420 (10komla) This has been transferred to [[ https://wikitech.wikimedia.org/wiki/Portal:Toolfor... [21:19:58] (03CR) 10Jforrester: [C:03+2] build: Updating mediawiki/mediawiki-codesniffer to 48.0.0 [labs/tools/coverme] - 10https://gerrit.wikimedia.org/r/1185448 (owner: 10Libraryupgrader) [21:35:31] (03open) 10kevinpayravi: Adding support for requesting files from latest release [toolforge-repos/gitlab-content] - 10https://gitlab.wikimedia.org/toolforge-repos/gitlab-content/-/merge_requests/13 [21:50:00] (03update) 10kevinpayravi: Adding support for requesting files from latest release [toolforge-repos/gitlab-content] - 10https://gitlab.wikimedia.org/toolforge-repos/gitlab-content/-/merge_requests/13 [21:54:02] (03update) 10kevinpayravi: Adding support for requesting files from latest release [toolforge-repos/gitlab-content] - 10https://gitlab.wikimedia.org/toolforge-repos/gitlab-content/-/merge_requests/13