[00:08:03] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [00:18:03] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [02:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [04:32:26] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [05:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [07:37:27] (OpenstackAPIResponse) firing: (2) Openstack API average response time is too high. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [07:48:42] Toolforge (Toolforge iteration 01), Patch-For-Review: Upgrade harbor - https://phabricator.wikimedia.org/T346241 (CodeReviewBot) sstefanova opened https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/85 harbor: upgrade to 2.9.0 [07:53:11] Cloud-VPS, cloud-services-team: cloudvirt1051 crashed - https://phabricator.wikimedia.org/T349109 (taavi) [08:08:28] Toolforge (Toolforge iteration 01), Patch-For-Review: Upgrade harbor - https://phabricator.wikimedia.org/T346241 (CodeReviewBot) sstefanova opened https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/19 harbor: upgrade to 2.9.0 [08:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [08:29:39] Toolforge, cloud-services-team, Fiwiki-Wikidata-Commons: Investigate why Toolforge www is slow - https://phabricator.wikimedia.org/T348599 (Zache) Open→Resolved [08:30:38] Toolforge, cloud-services-team, Fiwiki-Wikidata-Commons: Investigate why Toolforge www is slow - https://phabricator.wikimedia.org/T348599 (Zache) I closed the ticket as I got an answer why www was slow. Thank you for everybody. [08:30:48] Toolforge (Toolforge iteration 01), cloud-services-team (FY2023/2024-Q1), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project, Patch-For-Review: [builds-cli] Use the API to retrieve the latest build - https://phabricator.wikimedia.org/T348866 (CodeReviewBot) dcaro merged https://gitla...
[08:34:03] cloud-services-team (Kanban), wikitech.wikimedia.org, CirrusSearch, Discovery-Search, Wikimedia-production-error: DBQueryError on Wikitech Static Search - https://phabricator.wikimedia.org/T243730 (YOUR1) Resolved→Open I know this is a rather old issue; but we've encountered the same... [08:36:01] cloud-services-team (Kanban), wikitech.wikimedia.org, CirrusSearch, Discovery-Search, Wikimedia-production-error: DBQueryError on Wikitech Static Search - https://phabricator.wikimedia.org/T243730 (taavi) Open→Resolved Please do not re-open old, unrelated tasks. [08:46:20] Data-Services, cloud-services-team, Data-Engineering, Data-Platform-SRE: Some wikibase tables not available in commonswiki_p - https://phabricator.wikimedia.org/T298452 (Gehel) p:Triage→High [08:54:14] Cloud-VPS, cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by fnegri@cumin1001 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm [09:23:12] Cloud-VPS, cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by fnegri@cumin1001 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm executed with errors: -... [09:43:41] Toolforge (Toolforge iteration 01), cloud-services-team (FY2023/2024-Q1), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project: [builds-cli] Use the API to retrieve the latest build - https://phabricator.wikimedia.org/T348866 (CodeReviewBot) dcaro updated https://gitlab.wikimedia.org/repos...
[09:49:21] Toolforge (Toolforge iteration 01), Patch-For-Review: [tbs][builder] Add shellcheck to pre-commit - https://phabricator.wikimedia.org/T348961 (CodeReviewBot) sstefanova merged https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/18 dev: add shellcheck [10:07:31] Cloud-VPS, cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by fnegri@cumin1001 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm [10:10:41] !log fnegri@cloudcumin1001 linkwatcher START - Cookbook wmcs.openstack.quota_increase (T348441) [10:10:45] T348441: Quota increase for linkwatcher - https://phabricator.wikimedia.org/T348441 [10:11:01] !log fnegri@cloudcumin1001 linkwatcher END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) (T348441) [10:11:43] Cloud-VPS (Quota-requests), linkwatcher: Quota increase for linkwatcher - https://phabricator.wikimedia.org/T348441 (fnegri) Open→Resolved a:fnegri Quota increased as requested. :) [10:23:34] Toolforge (Toolforge iteration 01), cloud-services-team (FY2023/2024-Q1), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project: [builds-cli] Use the API to retrieve the latest build - https://phabricator.wikimedia.org/T348866 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/... [10:25:33] (SystemdUnitDown) firing: The service unit wmf_auto_restart_virtlogd.service is in failed status on host cloudvirt1051.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1051 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [10:32:50] Cloud-VPS, cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by fnegri@cumin1001 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm executed with errors: -... [10:33:12] Toolforge (Quota-requests): Request increased quota for Montage Toolforge tool - https://phabricator.wikimedia.org/T348894 (fnegri) @mahmoud the hardware resources (technically VM resources) should not be a constraint, as replicas can spread across different nodes. I think the CPU and Memory quotas are be th... [11:01:33] PAWS: Upgrade openrefine to 3.7.6 - https://phabricator.wikimedia.org/T348464 (rook) Open→Resolved a:rook [11:01:38] PAWS: Upgrade openrefine to 3.7.6 - https://phabricator.wikimedia.org/T348464 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/338 [11:01:46] vivian-rook closed https://github.com/toolforge/paws/pull/338 [11:01:49] Cloud-VPS, cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by fnegri@cumin1001 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm [11:03:50] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.prepare_upgrade for cluster tools upgrade from 1.22.17 to 1.23.17 (T298005) [11:03:53] T298005: Upgrade Toolforge Kubernetes to version 1.23 - https://phabricator.wikimedia.org/T298005 [11:04:17] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.prepare_upgrade (exit_code=0) for cluster tools upgrade from 1.22.17 to 1.23.17
(T298005) [11:04:51] Cloud-VPS, cloud-services-team: cloud/instance-puppet.git updater is broken - https://phabricator.wikimedia.org/T349195 (taavi) [11:06:39] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-control-4 from 1.22.17 to 1.23.17 [11:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [11:15:59] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-control-4 from 1.22.17 to 1.23.17 [11:16:06] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-control-5 from 1.22.17 to 1.23.17 [11:22:01] Toolforge, cloud-services-team, Kubernetes: Remove TTLAfterFinished from config - https://phabricator.wikimedia.org/T349197 (taavi) [11:23:08] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-control-5 from 1.22.17 to 1.23.17 [11:25:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-control-6 from 1.22.17 to 1.23.17 [11:29:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-control-6 from 1.22.17 to 1.23.17 [11:30:27] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-30 from 1.22.17 to 1.23.17 [11:31:51] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-30 from 1.22.17 to 1.23.17 [11:32:49] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-31 from 1.22.17 to 1.23.17 [11:34:12] !log
taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-31 from 1.22.17 to 1.23.17 [11:34:21] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-32 from 1.22.17 to 1.23.17 [11:35:42] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-32 from 1.22.17 to 1.23.17 [11:37:02] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-33 from 1.22.17 to 1.23.17 [11:37:27] (OpenstackAPIResponse) firing: (2) Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [11:37:36] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-51 from 1.22.17 to 1.23.17 [11:38:04] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-71 from 1.22.17 to 1.23.17 [11:38:28] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-33 from 1.22.17 to 1.23.17 [11:38:29] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-34 from 1.22.17 to 1.23.17 [11:38:55] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-51 from 1.22.17 to 1.23.17 [11:38:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-52 from 1.22.17 to 1.23.17 [11:39:36] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-71 from 1.22.17 to 1.23.17 
[11:39:37] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-72 from 1.22.17 to 1.23.17 [11:39:53] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-34 from 1.22.17 to 1.23.17 [11:39:54] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-35 from 1.22.17 to 1.23.17 [11:40:24] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-52 from 1.22.17 to 1.23.17 [11:40:26] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-53 from 1.22.17 to 1.23.17 [11:41:02] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-72 from 1.22.17 to 1.23.17 [11:41:04] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-73 from 1.22.17 to 1.23.17 [11:41:20] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-35 from 1.22.17 to 1.23.17 [11:41:21] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-36 from 1.22.17 to 1.23.17 [11:41:45] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-53 from 1.22.17 to 1.23.17 [11:41:46] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-54 from 1.22.17 to 1.23.17 [11:42:30] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-73 from 1.22.17 to 1.23.17 [11:42:31] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node 
tools-k8s-worker-74 from 1.22.17 to 1.23.17 [11:42:45] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-36 from 1.22.17 to 1.23.17 [11:42:46] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-37 from 1.22.17 to 1.23.17 [11:43:05] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-54 from 1.22.17 to 1.23.17 [11:43:06] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-55 from 1.22.17 to 1.23.17 [11:43:35] Cloud-VPS, cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by fnegri@cumin1001 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm completed: - cloudbackup... [11:44:00] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-74 from 1.22.17 to 1.23.17 [11:44:03] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-75 from 1.22.17 to 1.23.17 [11:44:08] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-37 from 1.22.17 to 1.23.17 [11:44:09] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-38 from 1.22.17 to 1.23.17 [11:44:30] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-55 from 1.22.17 to 1.23.17 [11:44:31] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-56 from 1.22.17 to 1.23.17 [11:45:38] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook
wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-38 from 1.22.17 to 1.23.17 [11:45:39] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-39 from 1.22.17 to 1.23.17 [11:45:39] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-75 from 1.22.17 to 1.23.17 [11:45:40] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-76 from 1.22.17 to 1.23.17 [11:45:53] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-56 from 1.22.17 to 1.23.17 [11:45:54] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-57 from 1.22.17 to 1.23.17 [11:46:57] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-39 from 1.22.17 to 1.23.17 [11:46:58] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-40 from 1.22.17 to 1.23.17 [11:47:05] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-76 from 1.22.17 to 1.23.17 [11:47:06] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-77 from 1.22.17 to 1.23.17 [11:47:15] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-57 from 1.22.17 to 1.23.17 [11:47:16] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-58 from 1.22.17 to 1.23.17 [11:48:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-40 from 1.22.17 to 1.23.17 [11:48:22] !log 
taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-41 from 1.22.17 to 1.23.17 [11:48:30] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-77 from 1.22.17 to 1.23.17 [11:48:31] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-78 from 1.22.17 to 1.23.17 [11:48:33] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-58 from 1.22.17 to 1.23.17 [11:48:34] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-59 from 1.22.17 to 1.23.17 [11:49:44] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-41 from 1.22.17 to 1.23.17 [11:49:45] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-42 from 1.22.17 to 1.23.17 [11:49:55] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-78 from 1.22.17 to 1.23.17 [11:49:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-79 from 1.22.17 to 1.23.17 [11:49:57] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-59 from 1.22.17 to 1.23.17 [11:49:58] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-60 from 1.22.17 to 1.23.17 [11:50:07] Toolforge (Toolforge iteration 01), cloud-services-team (FY2023/2024-Q1), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project: [buildservice] Create GET /build/latest endpoint in the buildservice API - https://phabricator.wikimedia.org/T345675 (dcaro) [11:50:42] Toolforge
(Toolforge iteration 01), cloud-services-team (FY2023/2024-Q1), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project: [builds-cli] Use the API to retrieve the latest build - https://phabricator.wikimedia.org/T348866 (dcaro) In progress→Resolved [11:50:49] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-ingress-4 from 1.22.17 to 1.23.17 [11:51:03] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-42 from 1.22.17 to 1.23.17 [11:51:05] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-43 from 1.22.17 to 1.23.17 [11:51:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-60 from 1.22.17 to 1.23.17 [11:51:22] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-61 from 1.22.17 to 1.23.17 [11:51:24] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-79 from 1.22.17 to 1.23.17 [11:51:25] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-80 from 1.22.17 to 1.23.17 [11:51:59] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-ingress-4 from 1.22.17 to 1.23.17 [11:52:24] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-43 from 1.22.17 to 1.23.17 [11:52:26] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-44 from 1.22.17 to 1.23.17 [11:52:47] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-61 from 1.22.17 to 1.23.17
[11:52:48] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-62 from 1.22.17 to 1.23.17 [11:52:49] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-80 from 1.22.17 to 1.23.17 [11:52:50] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-81 from 1.22.17 to 1.23.17 [11:53:11] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-ingress-5 from 1.22.17 to 1.23.17 [11:53:59] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-44 from 1.22.17 to 1.23.17 [11:54:00] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-45 from 1.22.17 to 1.23.17 [11:54:09] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-81 from 1.22.17 to 1.23.17 [11:54:10] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-82 from 1.22.17 to 1.23.17 [11:54:23] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-ingress-5 from 1.22.17 to 1.23.17 [11:54:25] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-62 from 1.22.17 to 1.23.17 [11:54:26] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-64 from 1.22.17 to 1.23.17 [11:55:19] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-ingress-6 from 1.22.17 to 1.23.17 [11:55:27] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node 
tools-k8s-worker-45 from 1.22.17 to 1.23.17 [11:55:28] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-46 from 1.22.17 to 1.23.17 [11:55:31] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-82 from 1.22.17 to 1.23.17 [11:55:32] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-83 from 1.22.17 to 1.23.17 [11:55:51] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-64 from 1.22.17 to 1.23.17 [11:55:52] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-65 from 1.22.17 to 1.23.17 [11:56:35] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-ingress-6 from 1.22.17 to 1.23.17 [11:56:50] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-83 from 1.22.17 to 1.23.17 [11:56:51] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-84 from 1.22.17 to 1.23.17 [11:56:53] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-46 from 1.22.17 to 1.23.17 [11:56:54] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-47 from 1.22.17 to 1.23.17 [11:57:17] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-65 from 1.22.17 to 1.23.17 [11:57:18] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-66 from 1.22.17 to 1.23.17 [11:58:12] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook 
wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-84 from 1.22.17 to 1.23.17 [11:58:13] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-85 from 1.22.17 to 1.23.17 [11:58:13] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 5.351% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace [11:58:19] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-47 from 1.22.17 to 1.23.17 [11:58:20] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-48 from 1.22.17 to 1.23.17 [11:58:55] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-66 from 1.22.17 to 1.23.17 [11:58:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-67 from 1.22.17 to 1.23.17 [11:59:33] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-85 from 1.22.17 to 1.23.17 [11:59:34] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-86 from 1.22.17 to 1.23.17 [11:59:43] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-48 from 1.22.17 to 1.23.17 [11:59:45] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-49 from 1.22.17 to 1.23.17 [12:00:27] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-67 from 1.22.17 to 1.23.17 [12:00:28] !log 
taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-68 from 1.22.17 to 1.23.17 [12:01:00] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-86 from 1.22.17 to 1.23.17 [12:01:01] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-87 from 1.22.17 to 1.23.17 [12:01:08] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-49 from 1.22.17 to 1.23.17 [12:01:10] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-50 from 1.22.17 to 1.23.17 [12:02:03] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-68 from 1.22.17 to 1.23.17 [12:02:04] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-69 from 1.22.17 to 1.23.17 [12:02:22] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-87 from 1.22.17 to 1.23.17 [12:02:23] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-88 from 1.22.17 to 1.23.17 [12:03:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-50 from 1.22.17 to 1.23.17 [12:03:30] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-69 from 1.22.17 to 1.23.17 [12:03:31] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node tools-k8s-worker-70 from 1.22.17 to 1.23.17 [12:03:50] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node 
tools-k8s-worker-88 from 1.22.17 to 1.23.17 [12:05:03] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node tools-k8s-worker-70 from 1.22.17 to 1.23.17 [12:08:27] Toolforge, cloud-services-team: Upgrade Toolforge Kubernetes to version 1.24 - https://phabricator.wikimedia.org/T307651 (taavi) [12:09:33] Toolforge (Toolforge iteration 01), cloud-services-team (FY2023/2024-Q1), Goal: Upgrade Toolforge Kubernetes to version 1.23 - https://phabricator.wikimedia.org/T298005 (taavi) [12:14:54] Toolforge (Toolforge iteration 01), Patch-For-Review: [tbs][builder] Add shellcheck to pre-commit - https://phabricator.wikimedia.org/T348961 (CodeReviewBot) sstefanova opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/118 builds-builder: bump to 0.0.78-20231018... [12:15:03] Toolforge (Toolforge iteration 01), Patch-For-Review: [tbs][builder] Refactor task yaml template - https://phabricator.wikimedia.org/T348750 (CodeReviewBot) sstefanova opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/118 builds-builder: bump to 0.0.78-202310180... [12:17:57] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder [12:18:13] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder [12:20:33] (SystemdUnitDownForLong) firing: The systemd unit wmf_auto_restart_virtlogd.service on node cloudvirt1051 has been failing for more than two hours.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1051 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDownForLong [12:20:39] 10cloud-services-team: SystemdUnitDownForLong cloudvirt1051:9100 Unit wmf_auto_restart_virtlogd.service on node cloudvirt1051 has been down for long. - https://phabricator.wikimedia.org/T349202 (10phaultfinder) [12:21:27] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder [12:21:47] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder [12:27:23] 10Toolforge (Toolforge iteration 01), 10Patch-For-Review: Upgrade harbor - https://phabricator.wikimedia.org/T346241 (10CodeReviewBot) sstefanova merged https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/85 harbor: upgrade to 2.9.0 [12:29:08] 10Cloud-VPS, 10cloud-services-team (FY2023/2024-Q1): [openstack] Upgrade codfw hosts to bookworm - https://phabricator.wikimedia.org/T345810 (10fnegri) I just reimaged `cloudbackup1002-dev` because I realized I had reimaged `cloudbackup1001-dev` but forgot about `1002`. [12:29:18] 10Toolforge (Toolforge iteration 01), 10Patch-For-Review: [tbs][builder] Add shellcheck to pre-commit - https://phabricator.wikimedia.org/T348961 (10CodeReviewBot) sstefanova merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/118 builds-builder: bump to 0.0.78-20231018... [12:29:31] 10Toolforge (Toolforge iteration 01), 10Patch-For-Review: [tbs][builder] Refactor task yaml template - https://phabricator.wikimedia.org/T348750 (10CodeReviewBot) sstefanova merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/118 builds-builder: bump to 0.0.78-202310180... 
[12:30:39] Toolforge (Toolforge iteration 01): [tbs][builder] Add shellcheck to pre-commit - https://phabricator.wikimedia.org/T348961 (Slst2020) In progress→Resolved
[12:31:00] PAWS: jupyterlab 4.0.6 - https://phabricator.wikimedia.org/T347108 (rook) Open→Resolved a:rook
[12:31:03] PAWS: jupyterlab 4.0.6 - https://phabricator.wikimedia.org/T347108 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/335
[12:31:11] vivian-rook closed https://github.com/toolforge/paws/pull/335
[12:31:25] PAWS: jupyterlab to 4.0.7 - https://phabricator.wikimedia.org/T349203 (rook)
[12:37:06] Toolforge (Toolforge iteration 01), Documentation, Kubernetes: [buildservice] Add docs on how to run a ruby based tool using buildpacks - https://phabricator.wikimedia.org/T347402 (Slst2020) a:Slst2020
[12:49:14] cloud-services-team (Hardware), SRE, ops-codfw, User-dcaro: cloud: prepare codfw for expansion (racks, switches, ceph) - https://phabricator.wikimedia.org/T346661 (Papaul) @nskaggs you are correct even 1 additional rack isnt't possible at this time. Sorry about that.
[12:57:44] Tool-Pageviews, Data-Engineering, Data Products (Sprint 02): Mediarequests returning "file not found" for filenames with specific characters - https://phabricator.wikimedia.org/T347899 (Sfaci) Ok! No worries! Just waiting for some sample data to test some edge cases before pushing a fix for all this....
[13:04:31] Toolforge, cloud-services-team, Kubernetes: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656 (taavi) p:Low→Medium
[13:05:34] Toolforge, cloud-services-team, Kubernetes: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656 (taavi)
[13:05:36] Toolforge, cloud-services-team, Kubernetes: Migrate Toolforge Kubernetes hosts to Debian Bullseye or later - https://phabricator.wikimedia.org/T311908 (taavi)
[13:06:03] Toolforge, cloud-services-team, Kubernetes: Upgrade Toolforge K8s haproxies to Bookworm - https://phabricator.wikimedia.org/T349206 (taavi)
[13:06:44] Toolforge, cloud-services-team, Kubernetes: Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T349207 (taavi)
[13:15:18] (Abandoned) Majavah: toolforge: k8s: worker: upgrade: add SAL messages [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/953581 (https://phabricator.wikimedia.org/T343869) (owner: Arturo Borrero Gonzalez)
[13:33:12] Cloud-VPS, cloud-services-team: cloud/instance-puppet.git updater is broken - https://phabricator.wikimedia.org/T349195 (taavi) ` Oct 18 13:30:56 enc-2 puppet-enc-git-worker[3228677]: 2023-10-18 13:30:56.875 3228677 CRITICAL puppet-enc-git-worker [-] Unhandled error: pymysql.err.Error: Already closed Oct...
[13:37:09] Cloud-VPS, cloud-services-team: cloud/instance-puppet.git updater is broken - https://phabricator.wikimedia.org/T349195 (taavi) a:taavi
[13:58:09] Toolforge (Toolforge iteration 01), Patch-For-Review: Upgrade harbor - https://phabricator.wikimedia.org/T346241 (Slst2020)
[13:58:51] Toolforge (Toolforge iteration 01), Patch-For-Review: Upgrade harbor from 2.5 to 2.9 - https://phabricator.wikimedia.org/T346241 (taavi)
[14:03:19] Toolforge (Toolforge iteration 01), Patch-For-Review: Upgrade harbor from 2.5 to 2.9 - https://phabricator.wikimedia.org/T346241 (Slst2020) harbor 2.9 requires postgres >=12.0. We're using 12.7 via trove so all good.
[14:07:03] PROBLEM - nova-compute proc minimum on cloudvirt1058 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[14:08:27] RECOVERY - nova-compute proc minimum on cloudvirt1058 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[14:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[14:25:50] Cloud-VPS, cloud-services-team, DC-Ops, SRE, ops-eqiad: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643 (Jclark-ctr) @dcaro so i submitted the logs and here is Dells Response. The only errors showing in the System Event Log (SEL) ar...
[14:34:23] Cloud-VPS, cloud-services-team, DC-Ops, SRE, ops-eqiad: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643 (dcaro) just in case you need it: > What OS are you running on the server? Debian Bullseye (11): 5.10.0-19-amd64 #1 SMP Debian 5.10.1...
[14:36:22] RECOVERY - ensure kvm processes are running on cloudvirt1051 is OK: PROCS OK: 1 process with regex args qemu-system-x86_64 https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[15:16:20] Tool-bub2, Internet-Archive, Outreach-Programs-Projects, Outreachy (Round 27): Integrate Wikimedia Ecosystem within BUB2 tool - https://phabricator.wikimedia.org/T346386 (Ibinaboadiela) >>! In T346386#9233547, @Evonloch wrote: > Hello, I am an outreachy applicant. I am interested in this project....
[15:25:42] Cloud Services Proposals: Decision Request - Incident Response Process - https://phabricator.wikimedia.org/T348887 (nskaggs) One thing to keep in mind is even SRE doesn't take every incident through the entire process, which would include a retro. So we could choose to do similar and despite having a process...
[15:27:02] Toolforge, cloud-services-team, Sustainability (Incident Followup): do not use :latest for toolforge infrastructure components - https://phabricator.wikimedia.org/T320476 (taavi)
[15:27:30] Cloud Services Proposals, Toolforge: Cloud services enhancement proposal: Toolforge Kubernetes component workflow improvements - https://phabricator.wikimedia.org/T320667 (taavi) Open→Resolved a:taavi This is more or less done.
[15:33:44] cloud-services-team (Hardware), DC-Ops, SRE, ops-eqiad: Q3:rack/setup/install cloudcephosd10(3[5-9]|40) - https://phabricator.wikimedia.org/T324998 (nskaggs) Can someone provide an update on what's happening with these machines? Where they indeed sent back? Do we have replacement hardware?
[15:37:27] (OpenstackAPIResponse) firing: (2) Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[15:46:02] Cloud-VPS, cloud-services-team, Data-Platform-SRE, SRE, ops-eqiad: Move cloudvirt-wdqs hosts - https://phabricator.wikimedia.org/T346948 (taavi) @jclark-ctr these need a single NIC connected to the `cloud-hosts` as the primary VLAN, and `cloud-instances` and `cloud-private` VLANs trunked (we...
[15:58:13] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 5.839% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[16:05:38] (PS1) Vgutierrez: ssl: Add dummy digicert-2023 unified keys [labs/private] - https://gerrit.wikimedia.org/r/966894 (https://phabricator.wikimedia.org/T341119)
[16:06:26] (CR) Vgutierrez: [V: +2 C: +2] ssl: Add dummy digicert-2023 unified keys [labs/private] - https://gerrit.wikimedia.org/r/966894 (https://phabricator.wikimedia.org/T341119) (owner: Vgutierrez)
[17:13:13] (DiskSpace) resolved: Disk space cloudbackup1004:9100:/ 5.888% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[18:23:26] cloud-services-team (Hardware), DC-Ops, SRE, ops-eqiad: Q3:rack/setup/install cloudcephosd10(3[5-9]|40) - https://phabricator.wikimedia.org/T324998 (RobH) >>! In T324998#9262325, @nskaggs wrote: > Can someone provide an update on what's happening with these machines? Where they indeed sent back?...
[18:28:34] Tool-bub2, Outreach-Programs-Projects, Outreachy (Round 27): Change UploadedItems.js component to stateless functional components - https://phabricator.wikimedia.org/T348416 (wassan.anmol117) Open→Resolved
[19:37:27] (OpenstackAPIResponse) firing: (2) Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[20:14:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[23:14:50] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[23:37:28] (OpenstackAPIResponse) firing: (2) Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse