[00:04:31] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0)
[00:04:33] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[00:04:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[00:04:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[00:21:05] (PS1) Samwilson: Check for composer name before adding it to the output [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729)
[01:26:49] Toolforge: Toolforge build service: Can't process an image larger than 128 Mpx using ImageMagick - https://phabricator.wikimedia.org/T370610#10005446 (tstarling) @aborrero tells me that the Debian maintainer is willing to remove the resource limits from policy.xml in the Debian package. That would be great t...
[01:27:39] (update) raymond-ndibe: helpers: add toolforge_redeploy_components.sh [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/166 (owner: aborrero)
[01:53:37] FIRING: [2x] PowerSupplyFailure: Power Supply - PS Redundancy - issue on cloudbackup2003:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=cloudbackup2003 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupplyFailure
[01:53:45] cloud-services-team: PowerSupplyFailure - https://phabricator.wikimedia.org/T370732 (phaultfinder) NEW
[02:30:17] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=0)
[02:30:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[02:30:29] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.depool_and_destroy
[02:30:32] Logged the message at
https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[03:03:34] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0)
[03:03:36] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[03:03:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[03:03:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[03:10:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[03:15:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[03:15:56] FIRING: SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[03:18:56] FIRING: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:32:21] (approved) raymond-ndibe: api: remove deprecated endpoints [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/106 (https://phabricator.wikimedia.org/T365014) (owner: sstefanova)
[03:32:49] (update) raymond-ndibe: api: remove deprecated endpoints [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/106 (https://phabricator.wikimedia.org/T365014) (owner: sstefanova)
[05:05:33] (update) sstefanova: api: remove deprecated endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/42 (https://phabricator.wikimedia.org/T365014)
[05:10:56] FIRING: SystemdUnitDown: The systemd unit opentofu-infra-diff.service on node cloudcontrol1007 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:11:07] cloud-services-team: SystemdUnitDown Unit opentofu-infra-diff.service on node cloudcontrol1007 has been down for long.
- https://phabricator.wikimedia.org/T370742 (phaultfinder) NEW
[05:12:44] (update) sstefanova: api: drop deprecated endpoints [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/108 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[05:13:16] (update) sstefanova: api: drop deprecated endpoints [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/108 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[05:17:53] Tool-extjsonuploader, Patch-For-Review: Validate Composer names against Packagist - https://phabricator.wikimedia.org/T370729#10005657 (Samwilson) There are quite a few: ` [2024-07-23 00:22:39+00] AdminLinks: Composer name 'mediawiki/admin-links' not found on Packagist. [2024-07-23 00:22:40+00] Advanced...
[05:29:17] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=0)
[05:29:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[05:29:29] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.depool_and_destroy
[05:29:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[05:53:52] FIRING: [2x] PowerSupplyFailure: Power Supply - PS Redundancy - issue on cloudbackup2003:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=cloudbackup2003 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupplyFailure
[05:59:04] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0)
[05:59:06] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[05:59:08] Logged the message at
https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[05:59:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[06:06:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[06:11:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[06:33:30] (open) sstefanova: ingress-admission: fix local values [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/440
[06:36:01] (merge) dsavuljesku: Add wrapper script for manual execution [toolforge-repos/cr-grants-team-metasync] - https://gitlab.wikimedia.org/toolforge-repos/cr-grants-team-metasync/-/merge_requests/1 (owner: rvogel)
[07:18:56] FIRING: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:37:50] (approved) dcaro: ingress-admission: fix local values [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/440 (owner: sstefanova)
[07:52:44] (open) dsavuljesku: Draft: Bugfixing [toolforge-repos/cr-grants-team-metasync] -
https://gitlab.wikimedia.org/toolforge-repos/cr-grants-team-metasync/-/merge_requests/2
[08:00:36] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component envvars-api
[08:00:44] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component envvars-api
[08:00:55] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component envvars-api
[08:01:06] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component envvars-api
[08:02:33] (approved) dcaro: envvars-api: bump to 0.0.57-20240722141446-921d24d0 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/439 (https://phabricator.wikimedia.org/T367181) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[08:02:37] (merge) dcaro: envvars-api: bump to 0.0.57-20240722141446-921d24d0 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/439 (https://phabricator.wikimedia.org/T367181) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[08:05:58] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance tools-sgebastion-10 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[08:24:17] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=0)
[08:24:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[08:24:29] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.depool_and_destroy
[08:24:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[08:25:56] RESOLVED: SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[08:25:56] RESOLVED: SystemdUnitDown: The systemd unit opentofu-infra-diff.service on node cloudcontrol1007 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[08:32:06] Toolforge: Toolforge build service: Can't process an image larger than 128 Mpx using ImageMagick - https://phabricator.wikimedia.org/T370610#10005875 (dcaro) >>! In T370610#10005444, @tstarling wrote: > I'm not sure what the other options are. A custom stack (runimage)? Or fork the [[https://github.com/herok...
[08:54:28] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0)
[08:54:30] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[08:54:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[08:54:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:14:34] (open) dcaro: docs: add hint on how to access tools webservices [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/174
[09:39:09] FIRING: CephSlowOps: Ceph cluster in eqiad has 178 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[09:39:22] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes -
https://phabricator.wikimedia.org/T370752 (phaultfinder) NEW
[09:39:44] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99)
[09:39:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:42:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[09:42:31] FIRING: ToolsToolsDBWritableState: There should be exactly one writable MariaDB instance instead of -1 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsToolsDBWritableState - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBWritableState
[09:46:57] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T370752#10006058 (dcaro) It was osd.93, running on cloudcephosd1012:/dev/sdh device (an old one, not part of the ones showing error counters going up).
[09:47:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[09:47:31] RESOLVED: ToolsToolsDBWritableState: There should be exactly one writable MariaDB instance instead of -1 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsToolsDBWritableState - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBWritableState
[09:53:52] FIRING: [2x] PowerSupplyFailure: Power Supply - PS Redundancy - issue on cloudbackup2003:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=cloudbackup2003 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupplyFailure
[09:54:39] RESOLVED: CephSlowOps: Ceph cluster in eqiad has 60 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[09:56:52] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T370752#10006088 (dcaro) The first slow op in the logs for that osd: ` Jul 23 09:35:39 cloudcephosd1012 ceph-osd[22074]: 2024-07-23T09:35:39.356+0000 7f55d5842700 0...
[10:06:28] FIRING: InstanceDown: Project tools instance tools-db-3 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[10:07:09] FIRING: CephSlowOps: Ceph cluster in eqiad has 24 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[10:07:15] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/21
[10:07:16] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T370752#10006137 (phaultfinder)
[10:07:34] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/21
[10:08:23] (merge) aborrero: data/: introduce eqiad1-r network and subnet information [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/21 (https://phabricator.wikimedia.org/T370037)
[10:08:41] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[10:09:09] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[10:11:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[10:11:28] RESOLVED: InstanceDown: Project tools instance tools-db-3 is
down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[10:15:31] FIRING: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-3 is lagging behind the primary, the current lag is 35478 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[10:16:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[10:23:07] (update) sstefanova: ingress-admission: fix local values [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/440
[10:25:39] RESOLVED: CephSlowOps: Ceph cluster in eqiad has 34 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[10:28:04] Data-Services: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2024-07-23 - https://phabricator.wikimedia.org/T370760 (fnegri) NEW
[10:30:22] cloud-services-team, Data-Services: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2024-07-23 - https://phabricator.wikimedia.org/T370760#10006239 (fnegri) Open→In progress a:fnegri
[10:32:30] Tool-spacemedia, Toolforge (Toolforge iteration 13), video2commons: Enable SonarCloud usage for GitHub Toolforge projects - https://phabricator.wikimedia.org/T369267#10006244 (Don-vip) >>! In T369267#10002632, @dcaro wrote: > I think I've done that, but the way to change the branch was to delete the...
[10:32:33] cloud-services-team, Data-Services: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2024-07-23 - https://phabricator.wikimedia.org/T370760#10006245 (fnegri) I acked the alert for 24 hours, hopefully it will catch up by then.
[10:32:54] cloud-services-team, Data-Services: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2024-07-23 - https://phabricator.wikimedia.org/T370760#10006250 (fnegri)
[10:33:33] Data-Services: [toolsdb] Replica is frequently lagging behind the primary - https://phabricator.wikimedia.org/T357624#10006256 (fnegri)
[10:33:51] Data-Services: [toolsdb] Replica is frequently lagging behind the primary - https://phabricator.wikimedia.org/T357624#10006253 (fnegri)
[10:37:54] (approved) sstefanova: docs: add hint on how to access tools webservices [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/174 (owner: dcaro)
[10:37:59] (update) sstefanova: docs: add hint on how to access tools webservices [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/174 (owner: dcaro)
[10:38:54] (merge) sstefanova: ingress-admission: fix local values [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/440
[10:39:05] Tool-spacemedia, Toolforge (Toolforge iteration 13), video2commons: Enable SonarCloud usage for GitHub Toolforge projects - https://phabricator.wikimedia.org/T369267#10006294 (dcaro) >>! In T369267#10006244, @Don-vip wrote: >>>! In T369267#10002632, @dcaro wrote: >> I think I've done that, but th...
[10:39:43] Toolforge (Toolforge iteration 13), Patch-For-Review: [envvars-api] Remove authentication and use api-gateway provided headers - https://phabricator.wikimedia.org/T367181#10006298 (dcaro) In progress→Resolved
[10:41:22] Tool-spacemedia, Toolforge (Toolforge iteration 13), video2commons: Enable SonarCloud usage for GitHub Toolforge projects - https://phabricator.wikimedia.org/T369267#10006296 (dcaro) In progress→Resolved
[10:41:43] (update) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/ingress-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/7 (https://phabricator.wikimedia.org/T370046)
[10:41:48] (update) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/ingress-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/7 (https://phabricator.wikimedia.org/T370046)
[10:47:04] (open) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[10:47:20] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:47:28] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:48:03] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[10:48:08] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for
https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:48:13] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:48:40] (merge) dcaro: docs: add hint on how to access tools webservices [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/174
[10:49:22] (update) sstefanova: api: drop deprecated endpoints [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/108 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[10:50:22] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:50:22] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[10:50:30] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:51:01] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:51:02] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[10:51:06] (update) sstefanova: api: remove deprecated endpoints [repos/cloud/toolforge/envvars-api] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/42 (https://phabricator.wikimedia.org/T365014)
[10:51:07] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:52:22] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:52:24] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[10:52:32] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:55:24] (approved) fnegri: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037) (owner: aborrero)
[10:57:21] Toolforge: Toolforge build service: Can't process an image larger than 128 Mpx using ImageMagick - https://phabricator.wikimedia.org/T370610#10006323 (dcaro) From that pull request (now closed), they suggest using: https://help.heroku.com/RFDJQSG3/how-can-i-override-imagemagick-settings-in-a-policy-xml-file...
[10:57:22] (update) sstefanova: wmcs-k8s-metrics: bump kube-state-metrics version [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/434 (https://phabricator.wikimedia.org/T370046)
[10:58:32] (approved) dcaro: api: remove deprecated endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/42 (https://phabricator.wikimedia.org/T365014) (owner: sstefanova)
[10:58:34] (update) dcaro: api: remove deprecated endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/42 (https://phabricator.wikimedia.org/T365014) (owner: sstefanova)
[10:59:34] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[10:59:36] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[10:59:44] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[11:18:56] FIRING: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:21:56] FIRING: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[11:26:56] RESOLVED: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[11:35:31] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for main branch
[11:35:46] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for main branch
[11:48:32] (open) sstefanova: functional-tests/webservice: add test case for lima-kilo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/441
[11:51:31] (update) sstefanova: api: drop deprecated endpoints [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/108 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[11:51:36] (merge) sstefanova: api: drop deprecated endpoints [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/108 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[11:52:19] (merge) sstefanova: api: remove deprecated endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/42 (https://phabricator.wikimedia.org/T365014)
[11:52:34] (update) sstefanova: api: remove deprecated endpoints [repos/cloud/toolforge/builds-api] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/106 (https://phabricator.wikimedia.org/T365014)
[11:52:55] (merge) sstefanova: api: remove deprecated endpoints [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/106 (https://phabricator.wikimedia.org/T365014)
[11:54:11] (update) project_1317_bot_df3177307bed93c3f34e421e26c86e38: jobs-api: bump to 0.0.323-20240723115142-863de5d7 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/442 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[11:54:16] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: jobs-api: bump to 0.0.323-20240723115142-863de5d7 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/442 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014)
[11:57:13] (update) project_1317_bot_df3177307bed93c3f34e421e26c86e38: envvars-api: bump to 0.0.58-20240723115225-033f7657 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/443 (https://phabricator.wikimedia.org/T365014)
[11:57:16] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: envvars-api: bump to 0.0.58-20240723115225-033f7657 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/443 (https://phabricator.wikimedia.org/T365014)
[11:58:20] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api
[11:58:30] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api
[12:00:00] (open)
project_1317_bot_df3177307bed93c3f34e421e26c86e38: builds-api: bump to 0.0.166-20240723115306-1e901808 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/444 (https://phabricator.wikimedia.org/T365014)
[12:08:45] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api
[12:08:56] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api
[12:10:00] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component envvars-api
[12:10:10] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component envvars-api
[12:13:26] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Remove or replace deployment-sessionstore04.deployment-prep.eqiad1.wikimedia.cloud (Buster deprecation) - https://phabricator.wikimedia.org/T370461#10006532 (Southparkfan) I didn't get a response in `-sre`, but A...
[12:14:49] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component envvars-api
[12:15:00] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component envvars-api
[12:15:48] (approved) dcaro: functional-tests/webservice: add test case for lima-kilo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/441 (owner: sstefanova)
[12:16:21] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api
[12:16:31] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api
[12:19:14] (update) sstefanova: jobs-api: bump to 0.0.323-20240723115142-863de5d7 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/442 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:19:15] (approved) sstefanova: jobs-api: bump to 0.0.323-20240723115142-863de5d7 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/442 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:19:21] (merge) sstefanova: jobs-api: bump to 0.0.323-20240723115142-863de5d7 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/442 (https://phabricator.wikimedia.org/T363346 https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:20:04] (update) sstefanova: envvars-api: bump to 0.0.58-20240723115225-033f7657
[repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/443 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:20:11] Toolforge: Elasticsearch credential request for wikitermbase - https://phabricator.wikimedia.org/T368376#10006552 (ForzaGreen) Hello, is there any update on this request ? please let me know if you need any information or actions from my side. Thanks,
[12:20:17] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api
[12:20:28] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api
[12:20:49] (update) sstefanova: envvars-api: bump to 0.0.58-20240723115225-033f7657 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/443 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:21:08] (approved) sstefanova: envvars-api: bump to 0.0.58-20240723115225-033f7657 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/443 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:21:20] (merge) sstefanova: envvars-api: bump to 0.0.58-20240723115225-033f7657 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/443 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:22:16] (open) dcaro: builds: override default imagemagick config [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/56
(https://phabricator.wikimedia.org/T370610)
[12:25:03] (update) sstefanova: builds-api: bump to 0.0.166-20240723115306-1e901808 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/444 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:27:03] (update) sstefanova: builds-api: bump to 0.0.166-20240723115306-1e901808 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/444 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:27:10] (approved) sstefanova: builds-api: bump to 0.0.166-20240723115306-1e901808 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/444 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:27:14] (merge) sstefanova: builds-api: bump to 0.0.166-20240723115306-1e901808 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/444 (https://phabricator.wikimedia.org/T365014) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:28:29] (update) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/10 (https://phabricator.wikimedia.org/T370046)
[12:31:01] Toolforge (Toolforge iteration 13), Patch-For-Review: [jobs-api, jobs-cli] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363346#10006573 (Slst2020)
[12:32:41] Toolforge (Toolforge iteration 13), Patch-For-Review: [jobs-api,builds-api,envvars-api] consolidate api paths - https://phabricator.wikimedia.org/T365014#10006577 (Slst2020)
[12:33:46]
(CR) Sebastian Berlin (WMSE): Use tasks for updating images when a campaign is updated (1 comment) [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1055949 (owner: Sebastian Berlin (WMSE))
[12:34:15] Toolforge (Toolforge iteration 13), Patch-For-Review: [jobs-api, jobs-cli] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363346#10006575 (Slst2020) In progress→Resolved
[12:34:25] (PS2) Sebastian Berlin (WMSE): Use tasks for updating images when a campaign is updated [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1055949 (https://phabricator.wikimedia.org/T367397)
[12:35:53] Toolforge (Toolforge iteration 13), Patch-For-Review: [jobs-api,builds-api,envvars-api] consolidate api paths - https://phabricator.wikimedia.org/T365014#10006585 (Slst2020) In progress→Resolved
[12:36:47] (approved) dcaro: build: upgrade k8s dependencies [repos/cloud/toolforge/ingress-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/7 (https://phabricator.wikimedia.org/T370046) (owner: sstefanova)
[12:36:49] (update) dcaro: build: upgrade k8s dependencies [repos/cloud/toolforge/ingress-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/7 (https://phabricator.wikimedia.org/T370046) (owner: sstefanova)
[12:37:13] (approved) dcaro: build: upgrade k8s dependencies [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/10 (https://phabricator.wikimedia.org/T370046) (owner: sstefanova)
[12:37:15] (update) dcaro: build: upgrade k8s dependencies [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/10 (https://phabricator.wikimedia.org/T370046) (owner: sstefanova)
[12:37:49] (approved) dcaro: build: upgrade k8s dependencies
[repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/14 (https://phabricator.wikimedia.org/T370046) (owner: sstefanova)
[12:37:51] (update) dcaro: build: upgrade k8s dependencies [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/14 (https://phabricator.wikimedia.org/T370046) (owner: sstefanova)
[12:38:04] (close) sstefanova: jobs: add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417)
[12:39:22] (close) sstefanova: Draft: dev: add test script for harbor [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/20
[12:44:10] (update) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/ingress-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/7 (https://phabricator.wikimedia.org/T370046)
[12:44:21] (merge) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/ingress-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/7 (https://phabricator.wikimedia.org/T370046)
[12:45:00] (merge) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/10 (https://phabricator.wikimedia.org/T370046)
[12:45:17] Toolforge, Kubernetes: toolforge-jobs and packbuild images - https://phabricator.wikimedia.org/T369786#10006614 (dcaro) >>! In T369786#9999914, @Hawkeye7 wrote: > That's exactly what I did. I used envvars list to hold the password: > > ` tools.milhistbot@tools-bastion-13:~$ toolforge envvars list > nam...
[12:45:25] (merge) sstefanova: build: upgrade k8s dependencies [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/14 (https://phabricator.wikimedia.org/T370046)
[12:47:17] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for main branch
[12:47:24] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: ingress-admission: bump to 0.0.47-20240723124431-37ffef74 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/445 (https://phabricator.wikimedia.org/T370046)
[12:47:25] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[12:47:35] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for main branch
[12:48:01] (update) project_1317_bot_df3177307bed93c3f34e421e26c86e38: registry-admission: bump to 0.0.47-20240723124511-4fbbf982 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/447 (https://phabricator.wikimedia.org/T370046)
[12:48:02] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: volume-admission: bump to 0.0.52-20240723124535-01d0aa11 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/446 (https://phabricator.wikimedia.org/T370046)
[12:48:04] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: registry-admission: bump to 0.0.47-20240723124511-4fbbf982 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/447 (https://phabricator.wikimedia.org/T370046)
[12:51:04]
vivian-rook closed https://github.com/toolforge/paws/pull/433
[12:53:07] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component registry-admission
[12:53:16] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component registry-admission
[12:56:29] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[12:56:34] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[12:57:15] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[12:59:58] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[13:00:27] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[13:00:34] !log aborrero@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[13:01:03] (update) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[13:01:05] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for
https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[13:01:24] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22
[13:02:03] (merge) aborrero: tofu-infra: refactor providers into its own file [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/22 (https://phabricator.wikimedia.org/T370037)
[13:02:15] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[13:02:45] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[13:06:06] (open) sstefanova: functional-tests/direct-api: update direct api endpoints [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/448 (https://phabricator.wikimedia.org/T365014)
[13:11:27] (CR) Gergő Tisza: Check for composer name before adding it to the output (1 comment) [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729) (owner: Samwilson)
[13:15:25] (PS2) Samwilson: Check for composer name before adding it to the output [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729)
[13:15:35] (CR) Samwilson: Check for composer name before adding it to the output (1 comment) [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729) (owner: Samwilson)
[13:19:08] (update) sstefanova: functional-tests/webservice: add test case for lima-kilo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/441
[13:20:03] (merge)
sstefanova: functional-tests/webservice: add test case for lima-kilo [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/441
[13:46:49] (update) sstefanova: functional-tests/direct-api: update direct api endpoints [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/448 (https://phabricator.wikimedia.org/T365014)
[13:48:05] (approved) dcaro: functional-tests/direct-api: update direct api endpoints [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/448 (https://phabricator.wikimedia.org/T365014) (owner: sstefanova)
[13:49:27] (update) sstefanova: functional-tests/direct-api: update direct api endpoints [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/448 (https://phabricator.wikimedia.org/T365014)
[13:49:35] (merge) sstefanova: functional-tests/direct-api: update direct api endpoints [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/448 (https://phabricator.wikimedia.org/T365014)
[13:49:48] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component registry-admission
[13:49:58] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component registry-admission
[13:53:52] FIRING: [2x] PowerSupplyFailure: Power Supply - PS Redundancy - issue on cloudbackup2003:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=cloudbackup2003 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupplyFailure
[13:53:58] (update) sstefanova: registry-admission: bump to 0.0.47-20240723124511-4fbbf982 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/447 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[13:53:58] (update) sstefanova: registry-admission: bump to 0.0.47-20240723124511-4fbbf982 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/447 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[13:56:33] (approved) sstefanova: registry-admission: bump to 0.0.47-20240723124511-4fbbf982 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/447 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[13:56:33] Toolforge (Toolforge iteration 13): [lima-kilo] add ingress-admission - https://phabricator.wikimedia.org/T370774 (Slst2020) NEW
[13:56:37] (merge) sstefanova: registry-admission: bump to 0.0.47-20240723124511-4fbbf982 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/447 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[13:58:29] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install cloudcephosd10[35-38] - https://phabricator.wikimedia.org/T363344#10006880 (Jclark-ctr) a:Jclark-ctr→cmooney
[13:58:51] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install new cloudcephmon hosts - https://phabricator.wikimedia.org/T364870#10006894 (Jclark-ctr) a:Papaul
[14:00:33] (update) sstefanova: volume-admission: bump to 0.0.52-20240723124535-01d0aa11
[repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/446 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[14:01:49] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component volume-admission
[14:01:59] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component volume-admission
[14:14:10] (CR) Gergő Tisza: [C:+2] Check for composer name before adding it to the output [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729) (owner: Samwilson)
[14:14:42] (Merged) jenkins-bot: Check for composer name before adding it to the output [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729) (owner: Samwilson)
[14:16:32] (CR) Gergő Tisza: [C:+2] Check for composer name before adding it to the output (1 comment) [labs/tools/extjsonuploader] - https://gerrit.wikimedia.org/r/1056050 (https://phabricator.wikimedia.org/T370729) (owner: Samwilson)
[14:58:42] RESOLVED: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:04:36] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component volume-admission
[15:04:46] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component volume-admission
[15:08:48] cloud-services-team, DC-Ops, ops-codfw: PowerSupplyFailure - https://phabricator.wikimedia.org/T370732#10007131 (fnegri) The same error
happened last month (T368211), and was fixed by @Jhancock.wm by reseating the cable.
[15:09:21] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:11:18] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:12:16] cloud-services-team, DC-Ops, ops-codfw: PowerSupplyFailure - https://phabricator.wikimedia.org/T370732#10007161 (fnegri) The alert went back to green 1 minute after posting the comment above :)
[15:12:43] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:13:19] (update) sstefanova: volume-admission: bump to 0.0.52-20240723124535-01d0aa11 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/446 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[15:13:43] (approved) sstefanova: volume-admission: bump to 0.0.52-20240723124535-01d0aa11 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/446 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[15:13:47] (merge) sstefanova: volume-admission: bump to 0.0.52-20240723124535-01d0aa11 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/446 (https://phabricator.wikimedia.org/T370046) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[15:14:38] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:20:21] cloud-services-team, DC-Ops, ops-codfw: PowerSupplyFailure - https://phabricator.wikimedia.org/T370732#10007219 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[15:22:27] !log andrew@cloudcumin1001 deployment-prep START -
Cookbook wmcs.openstack.rebuild_dbinstance
[15:24:27] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:24:36] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:25:22] RESOLVED: [2x] PowerSupplyFailure: Power Supply - PS Redundancy - issue on cloudbackup2003:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=cloudbackup2003 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupplyFailure
[15:26:33] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:26:47] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:28:45] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:29:05] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:31:03] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:31:24] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:33:22] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:33:34] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:35:32] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:36:11] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:38:10] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:39:23] !log andrew@cloudcumin1001 deployment-prep
START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:41:26] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:41:48] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:43:46] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:43:54] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:45:52] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:46:01] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:47:58] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:48:06] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:50:05] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:50:16] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:52:14] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:52:27] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:54:23] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[15:55:07] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[15:55:35] cloud-services-team, Cloud-VPS, Data-Services, Goal: Update all trove VMs to a modern guest image - https://phabricator.wikimedia.org/T369723#10007469 (Andrew)
[15:56:45] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services: [wikireplicas] frequent replag spikes in clouddb hosts -
https://phabricator.wikimedia.org/T367778#10007475 (fnegri) Replication lag on clouddb1019 (s4) remained at 0 until 11:25 UTC today, then it started increasing again. I checked the processlis...
[15:57:05] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[16:04:00] (open) aborrero: tofu-infra: define neutron routers [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/23 (https://phabricator.wikimedia.org/T370037)
[16:04:06] (update) aborrero: Draft: tofu-infra: define neutron routers [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/23 (https://phabricator.wikimedia.org/T370037)
[16:04:59] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[16:05:02] !log andrew@cloudcumin1001 deployment-prep END (FAIL) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=99)
[16:07:17] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[16:09:15] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[16:09:59] !log andrew@cloudcumin1001 deployment-prep START - Cookbook wmcs.openstack.rebuild_dbinstance
[16:11:54] !log andrew@cloudcumin1001 deployment-prep END (PASS) - Cookbook wmcs.openstack.rebuild_dbinstance (exit_code=0)
[16:17:50] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install new cloudcephmon hosts - https://phabricator.wikimedia.org/T364870#10007580 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin1002 for host cloudcephmon1004.eqiad.wmnet with OS bullseye
[16:18:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks -
https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:28:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:55:30] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Remove or replace deployment-sessionstore04.deployment-prep.eqiad1.wikimedia.cloud (Buster deprecation) - https://phabricator.wikimedia.org/T370461#10007736 (Southparkfan) Couldn't upgrade Buster to 4.x, because...
[17:03:05] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install new cloudcephmon hosts - https://phabricator.wikimedia.org/T364870#10007769 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin1002 for host cloudcephmon1004.eqiad.wmnet with OS bullseye ex...
[17:12:51] Toolforge-standards-committee, Tools, Privacy: api-docs.toolforge.org loads loads 3rd party content - https://phabricator.wikimedia.org/T370532#10007797 (dcaro) Fixed! {F56621002} Thanks a lot for the task :)
[17:13:54] (CR) Eugene233: [C:+1] "Looks good." [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1055949 (https://phabricator.wikimedia.org/T367397) (owner: Sebastian Berlin (WMSE))
[17:15:30] (CR) Eugene233: [C:+1] "Added a couple of people who could check further."
[labs/tools/Isa] - 10https://gerrit.wikimedia.org/r/1055949 (https://phabricator.wikimedia.org/T367397) (owner: 10Sebastian Berlin (WMSE)) [17:16:43] 10Toolforge (Toolforge iteration 13), 06Toolforge-standards-committee, 10Tools, 07Privacy: api-docs.toolforge.org loads loads 3rd party content - https://phabricator.wikimedia.org/T370532#10007811 (10dcaro) [17:16:51] 10Toolforge (Toolforge iteration 13), 06Toolforge-standards-committee, 10Tools, 07Privacy: api-docs.toolforge.org loads loads 3rd party content - https://phabricator.wikimedia.org/T370532#10007813 (10dcaro) a:03dcaro [17:16:56] 10Toolforge (Toolforge iteration 13), 06Toolforge-standards-committee, 10Tools, 07Privacy: api-docs.toolforge.org loads loads 3rd party content - https://phabricator.wikimedia.org/T370532#10007814 (10dcaro) p:05Triage→03Medium [17:17:55] 10Toolforge (Toolforge iteration 13), 06Toolforge-standards-committee, 10Tools, 07Privacy: api-docs.toolforge.org loads loads 3rd party content - https://phabricator.wikimedia.org/T370532#10007816 (10dcaro) 05Open→03Resolved [17:19:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [17:29:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [17:37:42] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: Q4:rack/setup/install new cloudcephmon hosts - https://phabricator.wikimedia.org/T364870#10007870 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin1002 for host cloudcephmon1004.eqiad.wmnet with OS 
bullseye [18:34:22] (03CR) 10Krinkle: frontend: Rewrite codesearch-beta, make feature-complete, fix bugs (034 comments) [labs/codesearch] - 10https://gerrit.wikimedia.org/r/804785 (https://phabricator.wikimedia.org/T263354) (owner: 10Krinkle) [18:40:19] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: Q4:rack/setup/install new cloudcephmon hosts - https://phabricator.wikimedia.org/T364870#10008171 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin1002 for host cloudcephmon1004.eqiad.wmnet with OS bullseye ex... [19:40:58] 06cloud-services-team, 10Toolforge: Elasticsearch credential request for wikitermbase - https://phabricator.wikimedia.org/T368376#10008345 (10bd808) [19:49:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [19:57:53] 06cloud-services-team, 10Toolforge: Elasticsearch credential request for wikitermbase - https://phabricator.wikimedia.org/T368376#10008448 (10bd808) My apologies @ForzaGreen. As often happens roles and responsibilities have shifted without all things on-wiki being updated. I have now flagged your request as so... 
[19:59:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:07:01] (update) raymond-ndibe: [jobs-api] refactor before moving jobs load to backend [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/103 (https://phabricator.wikimedia.org/T359804 https://phabricator.wikimedia.org/T366209)
[20:07:41] (update) raymond-ndibe: [jobs-api] refactor before moving jobs load to backend [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/103 (https://phabricator.wikimedia.org/T359804 https://phabricator.wikimedia.org/T366209)
[20:08:52] (update) raymond-ndibe: [jobs-api] refactor before moving jobs load to backend [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/103 (https://phabricator.wikimedia.org/T359804 https://phabricator.wikimedia.org/T366209)
[21:13:24] (PS1) Krinkle: frontend: Server-side rendering (take 2) [labs/codesearch] - https://gerrit.wikimedia.org/r/1056248
[21:23:22] Cloud-VPS (Debian Buster Deprecation), Community-Tech (Darwin's Fox (July 15-26, 2024)): Cloud VPS "eventmetrics" project Buster deprecation - https://phabricator.wikimedia.org/T367530#10008714 (MusikAnimal) This is now done. I'd like to wait a day or so to monitor for problems before deleting the old VM.
[21:23:44] Cloud-VPS (Quota-requests): Subdomain for DUCT project - https://phabricator.wikimedia.org/T370826 (SDunlap) NEW
[22:35:29] VPS-project-Codesearch: Let codesearch-frontend reques to local Hound instances directly - https://phabricator.wikimedia.org/T361899#10008964 (Krinkle)
[22:40:45] (PS2) Krinkle: frontend: Server-side rendering (take 2) [labs/codesearch] - https://gerrit.wikimedia.org/r/1056248
[22:45:35] cloud-services-team, Cloud-VPS (Quota-requests): Subdomain for DUCT project - https://phabricator.wikimedia.org/T370826#10008977 (bd808) +1 https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Web_proxy#Enable_per-project_subdomain_delegation
[23:27:56] FIRING: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[23:32:56] RESOLVED: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[23:40:56] FIRING: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[23:45:56] FIRING: [2x] SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[23:46:19] (PS3) Krinkle: frontend: Server-side rendering (take 2) [labs/codesearch] - https://gerrit.wikimedia.org/r/1056248
[23:48:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:50:56] RESOLVED: [2x] SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[23:58:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks