[00:10:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:15:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:16:28] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:21:28] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[03:40:26] Tool-gitlab-account-approval: Add error logging - https://phabricator.wikimedia.org/T361079 (bd808) NEW
[03:40:39] Tool-gitlab-account-approval: Add error logging - https://phabricator.wikimedia.org/T361079#9664258 (bd808) p:Triage→Medium
[03:42:37] Toolforge (Toolforge iteration 07), Patch-For-Review: [jobs-cli,jobs-api] Allow using file logs with build service images - https://phabricator.wikimedia.org/T353537#9664243 (bd808) https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Job_logs still says "Jobs using build service...
[03:46:41] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:51:41] (CloudVPSDesignateLeaks) firing: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:56:41] (CloudVPSDesignateLeaks) resolved: (5) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[04:04:24] (CR) Abijeet Patro: Localisation updates from https://translatewiki.net. (1 comment) [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1013987 (owner: L10n-bot)
[04:04:48] (Abandoned) Abijeet Patro: Localisation updates from https://translatewiki.net. [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1013987 (owner: L10n-bot)
[04:04:54] (CR) Abijeet Patro: Localisation updates from https://translatewiki.net. (1 comment) [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1013987 (owner: L10n-bot)
[06:09:06] Wikibugs, Software-Licensing: Relicense Wikibugs from MIT to GPL-3.0-or-later after approval by all substantive contributors - https://phabricator.wikimedia.org/T360718#9664395 (hashar)
[09:21:20] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[10:06:35] Toolforge (Toolforge iteration 07), Patch-For-Review: [jobs-cli,jobs-api] Allow using file logs with build service images - https://phabricator.wikimedia.org/T353537#9664547 (dcaro) >>! In T353537#9664243, @bd808 wrote: > https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Job_lo...
[10:07:39] (CR) FNegri: [C:+1] "LGTM" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014470 (owner: Majavah)
[10:11:00] (CR) FNegri: "I'm testing this with test-cookbook and there seems to be something wrong with the parser:" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:11:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[10:13:56] (CR) Majavah: [C:+2] wmcs_libs: Generalize the batch runner pattern [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014470 (owner: Majavah)
[10:14:22] (CR) Majavah: [C:+2] nfs: add_server: Use the ENC wmcs_libs library [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014513 (owner: Majavah)
[10:14:27] (CR) Majavah: [C:+2] nfs: migrate_service: Use the ENC wmcs_libs library [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014520 (owner: Majavah)
[10:14:31] (CR) Majavah: [C:+2] nfs: migrate_service: Remove overzealous validation of service fqdn [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014521 (owner: Andrew Bogott)
[10:14:35] (CR) Majavah: [C:+2] nfs: add_server: Allow formatting newly created volumes [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014544 (owner: Majavah)
[10:14:35] (TfInfraTestApplyFailed) resolved: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[10:14:59] (Merged) jenkins-bot: wmcs_libs: Generalize the batch runner pattern [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014470 (owner: Majavah)
[10:15:03] (Merged) jenkins-bot: nfs: add_server: Use the ENC wmcs_libs library [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014513 (owner: Majavah)
[10:15:07] (Merged) jenkins-bot: nfs: migrate_service: Use the ENC wmcs_libs library [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014520 (owner: Majavah)
[10:15:11] (Merged) jenkins-bot: nfs: migrate_service: Remove overzealous validation of service fqdn [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014521 (owner: Andrew Bogott)
[10:15:15] (Merged) jenkins-bot: nfs: add_server: Allow formatting newly created volumes [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014544 (owner: Majavah)
[10:15:33] (PS5) Majavah: openstack: cloudcontrol: Convert reboot cookbook to batch base [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471
[10:15:43] (CR) Majavah: openstack: cloudcontrol: Convert reboot cookbook to batch base (1 comment) [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:15:59] (CR) CI reject: [V:-1] openstack: cloudcontrol: Convert reboot cookbook to batch base [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:16:19] (PS6) Majavah: openstack: cloudcontrol: Convert reboot cookbook to batch base [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471
[10:16:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[10:16:33] (CR) FNegri: [C:+1] "Now it's working!" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:17:28] (CR) jenkins-bot: openstack: cloudcontrol: Convert reboot cookbook to batch base [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:20:41] (CR) Majavah: [C:+2] openstack: cloudcontrol: Convert reboot cookbook to batch base [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:23:46] (Merged) jenkins-bot: openstack: cloudcontrol: Convert reboot cookbook to batch base [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1014471 (owner: Majavah)
[10:52:56] Toolforge, Documentation: Move content out of Kubernetes doc and into web and jobs framework docs - https://phabricator.wikimedia.org/T347888#9664820 (dcaro) I moved some of the content to the other pages, I left only kubernetes specific things in there for users that want to know more about it. Also rem...
[10:53:00] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[11:01:19] cloud-services-team, Patch-For-Review: [wmcs][alerting] Allow volunteer admins silencing alerts from cloudvps/toolforge/paws/quarry - https://phabricator.wikimedia.org/T320973#9664847 (dcaro) Open→In progress
[11:03:31] (ToolsToolsDBReplicationError) firing: ToolsDB replication is broken on tools-db-3 (errno 1236) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationError
[11:05:20] cloud-services-team (FY2023/2024-Q3-Q4), Patch-For-Review: [wmcs][alerting] Allow volunteer admins silencing alerts from cloudvps/toolforge/paws/quarry - https://phabricator.wikimedia.org/T320973#9664852 (dcaro)
[11:08:31] (ToolsToolsDBReplicationMissing) firing: ToolsDB replication is not running on tools-db-3 (errno 0) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationMissing
[11:31:39] Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T361091 (Keith_D) NEW
[11:44:35] Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T361091#9664939 (TheresNoTime) a:TheresNoTime Hi @Keith_D, could you give it another go and let me know if its working now?
[12:13:28] (InstanceDown) firing: Project cloudinfra instance cloudinfra-cloudvps-puppetserver-1 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:19:48] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.vps.remove_instance for instance toolserver-proxy-01
[12:20:38] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.remove_instance (exit_code=0) for instance toolserver-proxy-01
[12:20:53] cloud-services-team, Toolforge (Toolforge iteration 07), Patch-For-Review: Upgrade Toolforge legacy URL redirectors to Debian Bullseye or later - https://phabricator.wikimedia.org/T311909#9665045 (taavi) In progress→Resolved
[12:23:28] (InstanceDown) resolved: Project cloudinfra instance cloudinfra-cloudvps-puppetserver-1 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:23:45] (ProbeDown) firing: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:23:58] (InstanceDown) firing: Project cloudinfra instance cloudinfra-cloudvps-puppetserver-1 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:27:51] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.vps.remove_instance for instance toolsbeta-nfs-2
[12:27:57] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.vps.remove_instance (exit_code=0) for instance toolsbeta-nfs-2
[12:28:40] Toolforge: Upgrade toolsbeta-nfs to Debian Bullseye/Bookworm - https://phabricator.wikimedia.org/T360419#9665108 (taavi) Open→Resolved
[12:28:45] (ProbeDown) resolved: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:28:58] (InstanceDown) resolved: Project cloudinfra instance cloudinfra-cloudvps-puppetserver-1 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:29:15] (ProbeDown) firing: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:34:00] (ProbeDown) firing: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:34:15] (ProbeDown) firing: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:35:07] cloud-services-team (FY2023/2024-Q3-Q4), Patch-For-Review: [wmcs][alerting] Allow silencing alerts metricsinfra alerts on alerts.wikimedia.org - https://phabricator.wikimedia.org/T320973#9665149 (taavi)
[12:39:00] (ProbeDown) resolved: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:40:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on toolsbeta-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[13:10:41] (CloudVPSDesignateLeaks) firing: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:15:41] (CloudVPSDesignateLeaks) firing: (5) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:38:51] Toolforge (Toolforge iteration 07): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9665335 (MBH) Okay, I probably forgot that it's needed to set up the tunnel in PuTTY settings, and it seemed to me that MySQL Workbench creates tunnel for...
[13:38:55] (Abandoned) Majavah: vps: Add cookbook to migrate data from Puppet 5 to Puppet 7 [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/977219 (https://phabricator.wikimedia.org/T351454) (owner: Majavah)
[13:39:02] cloud-services-team, Cloud-VPS, Patch-For-Review, Puppet (Puppet 7.0): Write script or cookbook to migrate data from a Puppet 5 puppetmaster to a Puppet 7 puppetserver - https://phabricator.wikimedia.org/T351454#9665339 (taavi) Open→Invalid
[13:47:08] cloud-services-team, Toolforge: Upgrade Toolforge Puppet infrastructure to Debian Bullseye or later - https://phabricator.wikimedia.org/T311912#9665381 (taavi) Open→Resolved a:Andrew
[13:47:34] cloud-services-team, Toolforge, Puppet (Puppet 7.0): Migrate Toolforge to Puppet 7 - https://phabricator.wikimedia.org/T351494#9665376 (taavi) Open→Resolved a:Andrew
[13:48:23] Toolforge (Toolforge iteration 07): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9665405 (dcaro) > loaded key file, opened connection, entered name and password, successfully entered to Toolforge - but bot still can't connect to DB. W...
[13:49:26] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, Spicerack, SRE-tools, Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9665410 (bking) Sure, I'm happy to create a new package. Curator itself...
[13:50:41] (CloudVPSDesignateLeaks) firing: (5) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:52:01] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, Spicerack, SRE-tools: create and deploy new Elastic Curator deb package - https://phabricator.wikimedia.org/T361105 (bking) NEW
[13:52:14] cloud-services-team (FY2023/2024-Q3-Q4), Data-Platform-SRE, Infrastructure-Foundations, Spicerack, SRE-tools: create and deploy new Elastic Curator deb package - https://phabricator.wikimedia.org/T361105#9665431 (bking)
[13:53:45] Toolforge (Toolforge iteration 07): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9665439 (MBH) The error looks like the same: `The connection has not been established because the destination computer rejected the connection request 127....
[13:54:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 2 deleted instances on tools-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[14:09:14] cloud-services-team, Toolforge: Upgrade Toolforge apt repository (tools-services hosts) to Debian Bullseye or later - https://phabricator.wikimedia.org/T311914#9665524 (taavi) a:taavi
[14:09:28] (PuppetSyncFailure) firing: Failed to update Puppet repository /srv/git/operations/puppet on instance metricsinfra-puppetserver-1 in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetSyncFailure
[14:14:28] (PuppetStaleCertificates) resolved: Found non-revoked Puppet certificates for 2 deleted instances on tools-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[14:19:28] (PuppetSyncFailure) resolved: Failed to update Puppet repository /srv/git/operations/puppet on instance metricsinfra-puppetserver-1 in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetSyncFailure
[14:40:41] (CloudVPSDesignateLeaks) firing: (5) Detected 5 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:53:15] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[15:18:37] Toolforge, Software-Licensing: [builds-api] builds-api is missing a software license - https://phabricator.wikimedia.org/T361007#9665851 (dcaro) p:Triage→High
[15:23:37] cloud-services-team (FY2023/2024-Q3-Q4), Data-Platform-SRE, Infrastructure-Foundations, Spicerack, SRE-tools: create and deploy new Elastic Curator deb package - https://phabricator.wikimedia.org/T361105#9665882 (Gehel) p:Triage→High
[15:50:41] (CloudVPSDesignateLeaks) firing: (5) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:55:41] (CloudVPSDesignateLeaks) firing: (5) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:00:41] (CloudVPSDesignateLeaks) resolved: (5) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:03:28] Toolforge: [jobs-cli,jobs-api] quota shows different units for limit and usage - https://phabricator.wikimedia.org/T361120 (dcaro) NEW
[16:03:29] Toolforge: [jobs-cli,jobs-api] quota shows different units for limit and usage - https://phabricator.wikimedia.org/T361120#9666063 (dcaro) p:Triage→Medium
[16:05:23] Toolforge (Toolforge iteration 07): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9666070 (dcaro) > computer rejected the connection request 127.0.0.1:3306 You are using the default port there, that's the issue, so you have two options:...
[16:39:01] (ToolsToolsDBReplicationError) resolved: ToolsDB replication is broken on tools-db-3 (errno 1236) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationError
[16:39:01] (ToolsToolsDBReplicationMissing) resolved: ToolsDB replication is not running on tools-db-3 (errno 0) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationMissing
[16:40:41] (CloudVPSDesignateLeaks) firing: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:45:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:50:11] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE, Patch-For-Review: Q#:rack/setup/install (2) cloudbackup hosts - https://phabricator.wikimedia.org/T356216#9666319 (Papaul) @Jhancock.wm this is what 2003 is showing on console ` ┌───────────────────────┤ [!!] Partition disks ├──────────...
[16:50:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:55:41] (CloudVPSDesignateLeaks) resolved: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:21:55] Cloud-VPS: Add check in nova-fullstack to ensure Puppet certificates are being revoked on instance deletion - https://phabricator.wikimedia.org/T361142 (taavi) NEW
[18:22:15] cloud-services-team, Cloud-VPS: Add check in nova-fullstack to ensure Puppet certificates are being revoked on instance deletion - https://phabricator.wikimedia.org/T361142#9666660 (taavi)
[18:45:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:53:15] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[18:55:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:13:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:18:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:23:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:28:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:40:10] (PS1) Dzahn: delete ticket.discovery.wmnet dummy key, migrated to cfssl [labs/private] - https://gerrit.wikimedia.org/r/1015111 (https://phabricator.wikimedia.org/T360413)
[20:41:01] (PS2) Dzahn: delete ticket.discovery.wmnet dummy key, migrated to cfssl [labs/private] - https://gerrit.wikimedia.org/r/1015111 (https://phabricator.wikimedia.org/T360413)
[20:45:34] (CR) Dzahn: [V:+2 C:+2] delete ticket.discovery.wmnet dummy key, migrated to cfssl [labs/private] - https://gerrit.wikimedia.org/r/1015111 (https://phabricator.wikimedia.org/T360413) (owner: Dzahn)
[21:12:09] Toolforge, Documentation: Consolidate information about tool memory, resources, and quota into one doc - https://phabricator.wikimedia.org/T347887#9667177 (bd808) https://wikitech.wikimedia.org/wiki/Help:Toolforge/Kubernetes#Quotas_and_Resources is marked for moving to a new doc and I would like to under...
[21:45:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:55:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:39:31] Toolforge, Patch-For-Review: [jobs-api,jobs-cli] Support services in jobs - https://phabricator.wikimedia.org/T348758#9667402 (CodeReviewBot) raymond-ndibe opened https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/18 [jobs-cli] support services in jobs
[22:39:36] Toolforge, Patch-For-Review: [jobs-api,jobs-cli] Support services in jobs - https://phabricator.wikimedia.org/T348758#9667404 (CodeReviewBot) raymond-ndibe opened https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 [jobs-api] support services in jobs
[22:53:16] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[23:00:22] (HAProxyBackendUnavailable) firing: HAProxy service nova-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[23:10:41] (CloudVPSDesignateLeaks) firing: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:15:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:20:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:25:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:40:45] Wikibugs: Wikibugs testing task - https://phabricator.wikimedia.org/T90594#9667557 (bd808) test
[23:55:41] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks