[00:08:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:13:28] RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:16:29] FIRING: InstanceDown: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:21:29] RESOLVED: InstanceDown: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[01:19:56] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:37:15] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[02:00:38] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[02:06:36] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29687 bytes in 4.395 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[02:50:04] Toolforge (Toolforge iteration 14): [harbor] 2024-07-24 Tools harbor db out of space - https://phabricator.wikimedia.org/T370843#10063323 (Raymond_Ndibe) >>! In T370843#10062154, @dcaro wrote: >>>! In T370843#10062104, @Raymond_Ndibe wrote: >>>>! In T370843#10062061, @dcaro wrote: >>> by looking at https://g...
[02:52:53] Toolforge (Toolforge iteration 14): [harbor] 2024-07-24 Tools harbor db out of space - https://phabricator.wikimedia.org/T370843#10063324 (Raymond_Ndibe) In the next few minutes we'll know if this worked or not. I still don't know how it all began in the first place and therefore can't confidently say that i...
[03:22:30] Toolforge (Toolforge iteration 14): [harbor] 2024-07-24 Tools harbor db out of space - https://phabricator.wikimedia.org/T370843#10063328 (Raymond_Ndibe) >>! In T370843#10063324, @Raymond_Ndibe wrote: > In the next few minutes we'll know if this worked or not. I still don't know how it all began in the first...
[03:26:09] Toolforge (Toolforge iteration 14): [harbor] 2024-07-24 Tools harbor db out of space - https://phabricator.wikimedia.org/T370843#10063329 (Raymond_Ndibe) We should monitor this for a few more days/week(s) before we can confidently close the task as resolved.
[04:10:27] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[04:12:37] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[04:17:00] RESOLVED: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[04:59:41] RESOLVED: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:36:39] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-5:30000 has failed probes (http_this_tool_does_not_exist_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-test-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[06:41:39] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-5:30000 has failed probes (http_this_tool_does_not_exist_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-test-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[06:48:54] Tools: Flickr2 Commons is currently down - https://phabricator.wikimedia.org/T372451#10063391 (Bugreporter)
[07:54:34] !log dcaro@urcuchillay deploymentpreps3 START - Cookbook wmcs.vps.create_project for project deploymentpreps3 in eqiad1 (T372353)
[07:54:35] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[07:54:36] T372353: Request creation of deployment_prep_s3 VPS project - https://phabricator.wikimedia.org/T372353
[07:55:11] !log dcaro@urcuchillay deploymentpreps3 END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project deploymentpreps3 in eqiad1 (T372353)
[07:55:11] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:08:36] !log dcaro@urcuchillay deploymentpreps3 START - Cookbook wmcs.vps.add_user_to_project for user 'BryanDavis' in role 'member' (T372353)
[08:08:38] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:08:38] T372353: Request creation of deployment_prep_s3 VPS project - https://phabricator.wikimedia.org/T372353
[08:08:40] !log dcaro@urcuchillay deploymentpreps3 END (FAIL) - Cookbook wmcs.vps.add_user_to_project (exit_code=99) for user 'BryanDavis' in role 'member' (T372353)
[08:08:41] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:11:37] !log dcaro@urcuchillay deploymentpreps3 START - Cookbook wmcs.vps.add_user_to_project for user 'BryanDavis' in role 'member' (T372353)
[08:11:39] !log dcaro@urcuchillay deploymentpreps3 END (FAIL) - Cookbook wmcs.vps.add_user_to_project (exit_code=99) for user 'BryanDavis' in role 'member' (T372353)
[08:12:49] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:12:49] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:18:04] !log dcaro@urcuchillay deploymentpreps3 START - Cookbook wmcs.vps.add_user_to_project for user 'BryanDavis' in role 'member' (T372353)
[08:18:13] !log dcaro@urcuchillay deploymentpreps3 END (PASS) - Cookbook wmcs.vps.add_user_to_project (exit_code=0) for user 'BryanDavis' in role 'member' (T372353)
[08:18:26] !log dcaro@urcuchillay deploymentpreps3 START - Cookbook wmcs.vps.add_user_to_project for user 'Thcipriani' in role 'member' (T372353)
[08:18:31] !log dcaro@urcuchillay deploymentpreps3 END (PASS) - Cookbook wmcs.vps.add_user_to_project (exit_code=0) for user 'Thcipriani' in role 'member' (T372353)
[08:22:28] cloud-services-team, Observability-Metrics, SRE Observability (FY2024/2025-Q3): Remove librenms -> graphite integration, replace with gnmi - https://phabricator.wikimedia.org/T372457 (fgiunchedi) NEW
[08:29:38] cloud-services-team, SRE Observability: Remove librenms -> graphite integration, replace with gnmi - https://phabricator.wikimedia.org/T372457#10063489 (fgiunchedi)
[08:32:23] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:32:23] T372353: Request creation of deployment_prep_s3 VPS project - https://phabricator.wikimedia.org/T372353
[08:32:24] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:32:26] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:33:58] wmbot~dcaro@urcuchillay: Unknown project "deploymentpreps3"
[08:34:14] (PS1) David Caro: openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062672
[08:34:56] (CR) CI reject: [V:-1] openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062672 (owner: David Caro)
[08:35:12] Cloud-VPS (Project-requests), Beta-Cluster-Infrastructure: Request creation of deployment_prep_s3 VPS project - https://phabricator.wikimedia.org/T372353#10063507 (dcaro) I updated both the project description and the task template to mention the requirements, not sure if that is the best place, but...
[08:35:13] Cloud-VPS (Project-requests), Beta-Cluster-Infrastructure: Request creation of deployment_prep_s3 VPS project - https://phabricator.wikimedia.org/T372353#10063500 (dcaro) In progress→Resolved Done :)
[08:35:40] (PS1) David Caro: openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062674
[08:35:56] (Abandoned) David Caro: openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062672 (owner: David Caro)
[09:42:29] Toolforge (Toolforge iteration 14): [harbor] 2024-07-24 Tools harbor db out of space - https://phabricator.wikimedia.org/T370843#10063675 (dcaro) a:dcaro→Raymond_Ndibe
[09:43:12] Toolforge (Toolforge iteration 14): [harbor] 2024-07-24 Tools harbor db out of space - https://phabricator.wikimedia.org/T370843#10063677 (dcaro) Open→In progress
[09:46:17] (approved) dcaro: [builds-cli] remove _display_messages [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/69 (owner: raymond-ndibe)
[09:46:19] (update) dcaro: [builds-cli] remove _display_messages [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/69 (owner: raymond-ndibe)
[09:50:04] (update) dcaro: [jobs-api] custom resource definition deployment templates [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/101 (https://phabricator.wikimedia.org/T359650) (owner: raymond-ndibe)
[09:50:30] (approved) dcaro: [toolforge-weld] add custom resources version to k8sclient [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/51 (https://phabricator.wikimedia.org/T359650) (owner: raymond-ndibe)
[09:50:46] (update) dcaro: [toolforge-weld] add custom resources version to k8sclient [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/51 (https://phabricator.wikimedia.org/T359650) (owner: raymond-ndibe)
[09:55:21] (approved) dcaro: [jobs-api] better error handling for services [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/113 (https://phabricator.wikimedia.org/T359804) (owner: raymond-ndibe)
[09:55:23] (update) dcaro: [jobs-api] better error handling for services [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/113 (https://phabricator.wikimedia.org/T359804) (owner: raymond-ndibe)
[09:58:01] Tool-Pageviews: Validate projects on entry in Pageviews instead of bundling the allowlist - https://phabricator.wikimedia.org/T371997#10063701 (Amire80) Thank you for working on this, @MusicAnimal. The procedure for creating a wiki is at https://wikitech.wikimedia.org/wiki/Add_a_wiki . Please edit that page...
[10:06:41] (open) dcaro: toolforge_deploy_mr: make all apt actions non-interactive [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/182
[10:13:22] (open) dcaro: toolforge_get_versions: remove special case for jobs-cli [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/480
[10:13:39] (approved) dcaro: toolforge_get_versions: remove special case for jobs-cli [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/480
[10:13:42] (merge) dcaro: toolforge_get_versions: remove special case for jobs-cli [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/480
[10:13:43] (update) dcaro: toolforge_get_versions: remove special case for jobs-cli [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/480
[10:31:42] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 14): [sct.backend] Create trove database - https://phabricator.wikimedia.org/T370317#10063758 (dcaro) Open→In progress
[12:29:13] (PS1) Krinkle: Add diagnostic to JsonException [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062703
[12:29:44] (CR) Krinkle: [C:+2] Add diagnostic to JsonException [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062703 (owner: Krinkle)
[12:49:16] (Merged) jenkins-bot: Add diagnostic to JsonException [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062703 (owner: Krinkle)
[13:02:56] Tool-Global-user-contributions, Stewards-and-global-tools, Epic, Temporary accounts (Create/update essential tools/anti-abuse management): [Epic] Implement global user contributions feature - https://phabricator.wikimedia.org/T337089#10064054 (KColeman-WMF)
[13:03:12] Tool-Global-user-contributions, Stewards-and-global-tools, Epic, Temporary accounts (Create/update essential tools/anti-abuse management): [Epic] Implement global user contributions feature - https://phabricator.wikimedia.org/T337089#10064058 (KColeman-WMF)
[13:34:39] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: Upgrade cloud-vps openstack to version 'Caracal' - https://phabricator.wikimedia.org/T369044#10064123 (Andrew) p:Triage→Medium
[13:34:57] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Migrate eqiad1 hypervisors to Neutron OVS agent - https://phabricator.wikimedia.org/T364457#10064124 (Andrew) p:Triage→High
[14:14:49] cloud-services-team, Data-Persistence, Grafana, SRE Observability (FY2024/2025-Q1): Grafana MySQL charts can be inconsistent when zooming out - https://phabricator.wikimedia.org/T371485#10064209 (lmata)
[14:16:47] cloud-services-team, SRE Observability (FY2024/2025-Q2): Remove librenms -> graphite integration, replace with gnmi - https://phabricator.wikimedia.org/T372457#10064224 (lmata)
[15:05:21] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[15:05:27] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=97)
[15:05:31] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[15:06:03] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=97)
[16:44:42] (update) raymond-ndibe: [builds-cli] remove _display_messages [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/69
[17:15:54] (update) raymond-ndibe: [toolforge-weld] move _display_message into toolforge weld [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/46
[17:17:56] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[17:25:27] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[17:26:01] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[17:33:33] Cloud-VPS (Project-requests), Beta-Cluster-Infrastructure: Request creation of deploymentpreps3 VPS project - https://phabricator.wikimedia.org/T372353#10064943 (bd808)
[18:19:52] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[18:50:48] Toolforge (Toolforge iteration 14): Toolforge Aptfile not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T365633#10065108 (tchin) Hmm I guess my problem is different and just lies in the non-standard way packages are installed. `lang=bash heroku@38b1190a0d63:/workspace$ find /layers/f...
[19:09:30] Toolforge (Toolforge iteration 14): Toolforge Aptfile not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T365633#10065155 (derenrich) we're in a hackweek right now. i'll try this again next week
[19:28:48] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[19:30:07] (PS1) Krinkle: Upon edit() fail, skip one instead of exit/skip all [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062749
[19:30:16] (CR) Krinkle: [C:+2] Upon edit() fail, skip one instead of exit/skip all [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062749 (owner: Krinkle)
[19:30:35] (CR) CI reject: [V:-1] Upon edit() fail, skip one instead of exit/skip all [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062749 (owner: Krinkle)
[19:31:41] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[19:34:18] (PS2) Krinkle: Upon edit() fail, skip one instead of exit/skip all [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062749
[19:34:23] (CR) Krinkle: [C:+2] Upon edit() fail, skip one instead of exit/skip all [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062749 (owner: Krinkle)
[19:34:45] (Merged) jenkins-bot: Upon edit() fail, skip one instead of exit/skip all [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062749 (owner: Krinkle)
[21:33:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[21:38:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[21:42:35] (PS1) Krinkle: Include timestamp in error log entries [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062759
[21:47:27] (CR) Krinkle: [C:+2] Include timestamp in error log entries [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062759 (owner: Krinkle)
[21:47:50] (Merged) jenkins-bot: Include timestamp in error log entries [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1062759 (owner: Krinkle)
[21:57:30] (open) dcaro: Add deployment steps info [toolforge-repos/sample-complex-app-frontend] (show_task_status) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/3
[21:59:15] (update) dcaro: Add deployment steps info [toolforge-repos/sample-complex-app-frontend] (show_task_status) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/3 (https://phabricator.wikimedia.org/T370317)
[21:59:39] (update) dcaro: Add deployment steps info [toolforge-repos/sample-complex-app-frontend] (show_task_status) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/3 (https://phabricator.wikimedia.org/T370317)
[22:00:20] (update) dcaro: Add deployment steps info [toolforge-repos/sample-complex-app-frontend] (show_task_status) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/3 (https://phabricator.wikimedia.org/T370317)
[22:00:36] (update) dcaro: Add deployment steps info [toolforge-repos/sample-complex-app-frontend] (show_task_status) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/3 (https://phabricator.wikimedia.org/T372478)
[22:10:05] Cloud-VPS, Beta-Cluster-Infrastructure: OpenTofu fails to provision a Magnum managed k8s cluster in deployment-prep - https://phabricator.wikimedia.org/T372365#10065531 (bd808)
[22:11:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[22:13:58] (update) dcaro: Add deployment steps info [toolforge-repos/sample-complex-app-frontend] (show_task_status) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/3 (https://phabricator.wikimedia.org/T372478)
[22:14:58] Toolforge (Toolforge iteration 14): Toolforge Aptfile not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T365633#10065533 (dcaro) @tchin can you open a new task with the code/packages that you are seeing issues with? I'll follow up on that one as it seems to be different
[22:16:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[23:03:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[23:22:03] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[23:22:10] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0)
[23:22:29] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[23:22:38] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0)
[23:26:33] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[23:26:42] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0)
[23:27:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[23:27:54] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0)
[23:28:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning