[00:30:02] (open) samwilson: Add CommunityRequests extension and update all submodules [toolforge-repos/wishlist-test] - https://gitlab.wikimedia.org/toolforge-repos/wishlist-test/-/merge_requests/5 (https://phabricator.wikimedia.org/T371098)
[00:32:18] (merge) samwilson: Add CommunityRequests extension and update all submodules [toolforge-repos/wishlist-test] - https://gitlab.wikimedia.org/toolforge-repos/wishlist-test/-/merge_requests/5 (https://phabricator.wikimedia.org/T371098)
[00:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:31:26] (CR) Slavina Stefanova: [C:+1] [wmcs-cookbook] update toolsbeta-test-k8s-control vms [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1071052 (https://phabricator.wikimedia.org/T359641) (owner: Raymond Ndibe)
[05:36:52] Toolforge: [infra, k8s, webservice] remove deprecated kubectl --wait flag before k8s 1.29 upgrade - https://phabricator.wikimedia.org/T373866#10124104 (Slst2020) a:LucasWerkmeister
[05:38:53] Toolforge: [builds-builder] Cache .m2 folder (local maven repository) between builds - https://phabricator.wikimedia.org/T350307#10124109 (Slst2020) >>! In T350307#10120054, @Don-vip wrote: > @dcaro @Slst2020 have you enabled the cache already? I switched to Toolforge Build Service tonight in order to get Ja...
[06:06:33] (open) sstefanova: d/changelog: bump to 0.103.11 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/56 (https://phabricator.wikimedia.org/T373866)
[06:06:37] (update) sstefanova: d/changelog: bump to 0.103.11 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/56 (https://phabricator.wikimedia.org/T373866)
[06:07:15] (approved) sstefanova: d/changelog: bump to 0.103.11 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/56 (https://phabricator.wikimedia.org/T373866)
[06:07:30] (merge) sstefanova: d/changelog: bump to 0.103.11 [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/56 (https://phabricator.wikimedia.org/T373866)
[06:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:24:26] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component tools-webservice
[06:24:28] !log sstefanova@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component tools-webservice
[06:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:45:12] Toolforge: [infra, k8s, webservice] remove deprecated kubectl --wait flag before k8s 1.29 upgrade - https://phabricator.wikimedia.org/T373866#10124159 (Slst2020) In progress→Resolved
[07:10:47] Toolforge: [k8s, infra] update pause image to 3.6 - https://phabricator.wikimedia.org/T374193 (Slst2020) NEW
[07:14:18] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.image.copy_to_registry
[07:14:19] !log sstefanova@cloudcumin1001 tools Updating container image docker-registry.tools.wmflabs.org/pause:3.6
[07:14:24] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.image.copy_to_registry (exit_code=0)
[07:18:00] Toolforge: [k8s, infra] update pause image to 3.6 - https://phabricator.wikimedia.org/T374193#10124244 (Slst2020) ` sstefanova@cloudcumin1001:~$ sudo cookbook wmcs.toolforge.k8s.image.copy_to_registry --origin-image registry.k8s.io/pause:3.6 --dest-image-name pause --dest-image-version 3.6 ` https://docker-...
[07:27:05] !log dcaro@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.drain_node (T373986)
[07:27:11] T373986: cloudsw1-c8-eqiad is unstable - https://phabricator.wikimedia.org/T373986
[07:34:03] cloud-services-team (FY2024/2025-Q1-Q2): Drain C8 rack - https://phabricator.wikimedia.org/T374043#10124268 (dcaro)
[07:56:06] Tools, Infrastructure-Foundations: Requested offboarding-to-volunteer of HTriedman // Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10124315 (SLyngshede-WMF) In progress→Resolved
[08:12:13] (update) raymond-ndibe: [jobs-api] better error handling for services [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/113 (https://phabricator.wikimedia.org/T359804)
[08:13:09] (update) raymond-ndibe: [jobs-api] better error handling for services [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/113 (https://phabricator.wikimedia.org/T359804)
[08:21:33] (approved) raymond-ndibe: [jobs-api] better error handling for services [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/113 (https://phabricator.wikimedia.org/T359804)
[08:21:38] (merge) raymond-ndibe: [jobs-api] better error handling for services [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/113 (https://phabricator.wikimedia.org/T359804)
[08:24:27] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: jobs-api: bump to 0.0.335-20240906082148-3b35f02b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/518 (https://phabricator.wikimedia.org/T359804)
[08:25:59] !log raymondndibe@wmf3402 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:26:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:28:54] !log raymondndibe@wmf3402 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component jobs-api
[08:28:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:29:57] !log raymondndibe@wmf3402 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:29:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:34:48] !log raymondndibe@wmf3402 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[08:34:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[08:36:58] !log raymondndibe@wmf3402 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:37:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[08:38:34] !log raymondndibe@wmf3402 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component jobs-api
[08:38:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[08:42:17] !log raymondndibe@wmf3402 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:42:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[08:47:47] !log raymondndibe@wmf3402 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[08:47:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[08:49:08] (approved) raymond-ndibe: jobs-api: bump to 0.0.335-20240906082148-3b35f02b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/518 (https://phabricator.wikimedia.org/T359804) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[08:49:26] (merge) raymond-ndibe: jobs-api: bump to 0.0.335-20240906082148-3b35f02b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/518 (https://phabricator.wikimedia.org/T359804) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[08:49:36] (update) raymond-ndibe: [toolforge-deploy] upgrade cert-manager [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/517 (https://phabricator.wikimedia.org/T359641)
[08:54:58] (update) raymond-ndibe: [toolforge-deploy] upgrade cert-manager [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/517 (https://phabricator.wikimedia.org/T359641)
[08:55:47] !log raymondndibe@wmf3402 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component cert-manager
[08:55:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:00:36] !log raymondndibe@wmf3402 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component cert-manager
[09:00:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:10:54] !log raymondndibe@wmf3402 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component cert-manager
[09:10:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:13:21] !log raymondndibe@wmf3402 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component cert-manager
[09:13:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:14:45] !log raymondndibe@wmf3402 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component cert-manager
[09:14:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[09:16:59] !log raymondndibe@wmf3402 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component cert-manager
[09:17:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[11:13:15] !log dcaro@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.drain_node (exit_code=0) (T373986)
[11:13:20] T373986: cloudsw1-c8-eqiad is unstable - https://phabricator.wikimedia.org/T373986
[13:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:30:53] Cloud-VPS (Debian Buster Deprecation), linkwatcher: Cloud VPS "linkwatcher" project Buster deprecation - https://phabricator.wikimedia.org/T367536#10125697 (fnegri) @Beetstra do you still plan to work on the upgrade? The VMs are now also alerting with the following messages: * //Puppet CA certificate co...
[13:46:04] !log dcaro@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.drain_node (T373986)
[13:46:10] T373986: cloudsw1-c8-eqiad is unstable - https://phabricator.wikimedia.org/T373986
[13:50:07] Data-Services, Data-Platform-SRE, DBA: Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125808 (Ladsgroup)
[13:50:36] cloud-services-team (FY2024/2025-Q1-Q2): Drain C8 rack - https://phabricator.wikimedia.org/T374043#10125810 (dcaro)
[13:51:30] !log fnegri@cloudcumin1001 linkwatcher START - Cookbook wmcs.vps.refresh_puppet_certs on coibot.linkwatcher.eqiad1.wikimedia.cloud (T367536)
[13:51:33] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125811 (BTullis) a:BTullis
[13:51:34] T367536: Cloud VPS "linkwatcher" project Buster deprecation - https://phabricator.wikimedia.org/T367536
[13:51:53] !log fnegri@cloudcumin1001 linkwatcher END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on coibot.linkwatcher.eqiad1.wikimedia.cloud (T367536)
[13:57:24] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125847 (BTullis) Interesting. The `sre.wikireplicas.add-wiki` cookbook failed when running `/usr/local/sbin/maintain-views` on `an-redacteddb1001`...
[14:07:52] !log fnegri@cloudcumin1001 linkwatcher START - Cookbook wmcs.vps.refresh_puppet_certs on coibot.linkwatcher.eqiad1.wikimedia.cloud (T367536)
[14:07:55] T367536: Cloud VPS "linkwatcher" project Buster deprecation - https://phabricator.wikimedia.org/T367536
[14:08:15] !log fnegri@cloudcumin1001 linkwatcher END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on coibot.linkwatcher.eqiad1.wikimedia.cloud (T367536)
[14:13:35] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125931 (BTullis) That database has definitely not been created on `an-redacteddb1001`. ` root@an-redacteddb1001:s5[(none)]> use bdrwiki_p; ERROR 1...
[14:16:15] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125938 (Ladsgroup) `bdrwiki_p` just provides the view, the actual database is `bdrwiki` which should be there so the replication and the data is r...
[14:19:00] !log fnegri@cloudcumin1001 linkwatcher START - Cookbook wmcs.vps.refresh_puppet_certs on liwa3.linkwatcher.eqiad1.wikimedia.cloud (T367536)
[14:19:03] T367536: Cloud VPS "linkwatcher" project Buster deprecation - https://phabricator.wikimedia.org/T367536
[14:19:27] !log fnegri@cloudcumin1001 linkwatcher END (FAIL) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=99) on liwa3.linkwatcher.eqiad1.wikimedia.cloud (T367536)
[14:23:37] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125982 (BTullis) >>! In T371759#10125938, @Ladsgroup wrote: > `bdrwiki_p` just provides the view, the actual database is `bdrwiki` which should be...
[14:25:49] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10125986 (BTullis) ` root@an-redacteddb1001:s5[bdrwiki]> create database bdrwiki_p; Query OK, 1 row affected (0.001 sec) root@an-redacteddb1001:s5[...
[14:27:47] Cloud-VPS (Debian Buster Deprecation), linkwatcher: Cloud VPS "linkwatcher" project Buster deprecation - https://phabricator.wikimedia.org/T367536#10126004 (fnegri) The cookbook failed but I renewed the certs manually, the alerts are no longer firing.
[15:10:08] cloud-services-team, Cloud-VPS: openstack: consider moving resource creation at project creation time to a templating system - https://phabricator.wikimedia.org/T374253 (aborrero) NEW
[15:10:23] cloud-services-team, Cloud-VPS: openstack: consider moving resource creation at project creation time to a templating system - https://phabricator.wikimedia.org/T374253#10126202 (aborrero) p:Triage→Low
[15:29:21] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10126272 (BTullis) In progress→Resolved OK, I'll mark this ticket as resolved. >>! In T371759#10125982, @BTullis wrote: > I wonder if w...
[16:03:27] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[16:12:58] Tools, Infrastructure-Foundations: Requested offboarding-to-volunteer of HTriedman // Transfer ownership of SpinachBot from HTriedman (WMF) to HTriedman - https://phabricator.wikimedia.org/T371644#10126494 (Dzahn) Would be nice to learn what was done to get this to resolved. Did it involve running th...
[16:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:40:31] cloud-services-team (FY2024/2025-Q1-Q2), Data-Services, Data-Persistence: [wikireplicas] Update Admin docs - https://phabricator.wikimedia.org/T365717#10126628 (fnegri) I reviewed and updated all the admin docs related to Wiki Replicas. They're all under [Category:Wiki_Replica_admin](https://wikitech...
[16:52:40] cloud-services-team (FY2024/2025-Q1-Q2), Quarry, User-notice: Support queries against Quarry's own database and ToolsDB - https://phabricator.wikimedia.org/T151158#10126672 (UOzurumba)
[16:58:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[17:00:42] cloud-services-team (FY2024/2025-Q1-Q2), Quarry, User-notice: Support queries against Quarry's own database and ToolsDB - https://phabricator.wikimedia.org/T151158#10126678 (UOzurumba) Hello @fnegri I think this improvement is worth announcing in [[ https://meta.wikimedia.org/wiki/Tech/News | Tec...
[17:58:37] !log dcaro@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.drain_node (exit_code=0) (T373986)
[17:58:42] T373986: cloudsw1-c8-eqiad is unstable - https://phabricator.wikimedia.org/T373986
[18:11:41] Data-Services, DBA, Data-Platform-SRE (2024.09.06 - 2024.09.27): Prepare and check storage layer for bdrwiki - https://phabricator.wikimedia.org/T371759#10126812 (Ladsgroup) Arnaud would be the right person to say whether that is being cookbooked or not.
[18:17:37] !log dcaro@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.drain_node (T373986)
[18:17:43] T373986: cloudsw1-c8-eqiad is unstable - https://phabricator.wikimedia.org/T373986
[20:27:12] cloud-services-team: Update wmcloud.org MX records - https://phabricator.wikimedia.org/T374278#10127145 (jhathaway)
[20:31:00] cloud-services-team: Update wmcloud.org MX records - https://phabricator.wikimedia.org/T374278#10127185 (jhathaway)
[21:05:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[21:25:43] wikitech.wikimedia.org, MW-on-K8s, serviceops, Patch-For-Review: mediawiki-config: consolidate labswiki - https://phabricator.wikimedia.org/T371374#10127333 (Krinkle) > * wgMFAutodetectMobileView > > `lang=php > 'wgMFAutodetectMobileView' => [ > 'default' => false, > 'wikitech' => true, // Not...
[22:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:00:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:18:16] !log dcaro@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.drain_node (exit_code=99) (T373986)
[23:18:22] T373986: cloudsw1-c8-eqiad is unstable - https://phabricator.wikimedia.org/T373986