[00:08:28] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:18:28] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[01:38:15] Grid-Engine-to-K8s-Migration, Tools, All-and-every-Wikisource: Migrate phetools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319965#9635358 (Soda) >>! In T319965#9633459, @Tpt wrote: > @Soda Amazing! Thank you! A user request from French Wikisource: Would it...
[01:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:48:28] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-control-7 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[02:05:39] Grid-Engine-to-K8s-Migration: Migrate dibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319676#9635363 (MBH) https://github.com/Saisengen/dmitry89-tools
[02:14:41] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9635364 (MBH) Yeah, this tool works now.
[03:31:49] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 5.489% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[04:48:28] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-control-7 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[05:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:08:04] Grid-Engine-to-K8s-Migration: Migrate wikisaurusbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320164#9635399 (MBH) New errors on `autopurge-daily`: ` WARNING: API error protectedpage: This page has been protected to prevent editing or other actions. Tra...
[06:10:35] Grid-Engine-to-K8s-Migration: Migrate wikisaurusbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320164#9635400 (MBH) `validation-stats.err`: ` WARNING: /workspace/scripts/../facenapalmscripts/validstats.py:78: DeprecationWarning: datetime.datetime.utcnow(...
[06:14:25] Grid-Engine-to-K8s-Migration: Migrate wikisaurusbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320164#9635401 (MBH) This bots has also a strange issue: they write messages about correct working into error stream, see contents of files `techtasks.err`, `a...
[06:36:28] (PuppetAgentFailure) firing: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[06:51:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:31:49] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 4.994% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:53:28] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-control-7 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[09:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:53:28] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-control-7 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:23:28] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-control-7 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:28:28] (PuppetAgentStaleLastRun) resolved: (2) Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-control-7 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:31:49] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 4.781% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[11:56:01] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[13:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:19:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:24:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:31:49] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 4.54% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[15:56:15] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[17:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:43:48] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[19:00:16] Toolforge (Toolforge iteration 07): Upgrade Toolforge image builder to Bookworm - https://phabricator.wikimedia.org/T358483#9635624 (taavi) In progress→Resolved
[19:00:24] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS (Debian Buster Deprecation), Toolforge, Epic, Goal: Toolforge: migrate to Debian Bullseye or later - https://phabricator.wikimedia.org/T311897#9635625 (taavi)
[19:00:32] cloud-services-team, Toolforge: Upgrade Toolforge Puppet infrastructure to Debian Bullseye or later - https://phabricator.wikimedia.org/T311912#9635627 (taavi)
[19:00:40] cloud-services-team, Toolforge, Puppet (Puppet 7.0): Migrate Toolforge to Puppet 7 - https://phabricator.wikimedia.org/T351494#9635626 (taavi)
[19:01:45] (PS4) DannyS712: releases: Bump Codesniffer to 43.0.0 [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/993504 (https://phabricator.wikimedia.org/T353909)
[19:01:53] (CR) DannyS712: "was merged" [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/993504 (https://phabricator.wikimedia.org/T353909) (owner: DannyS712)
[19:02:09] (CR) Majavah: [C:+2] releases: Bump Codesniffer to 43.0.0 [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/993504 (https://phabricator.wikimedia.org/T353909) (owner: DannyS712)
[19:02:17] (Merged) jenkins-bot: releases: Bump Codesniffer to 43.0.0 [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/993504 (https://phabricator.wikimedia.org/T353909) (owner: DannyS712)
[19:31:49] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 4.126% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[19:56:15] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[21:46:57] (CloudVPSDesignateLeaks) firing: (5) Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:43:48] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[22:51:35] PROBLEM - Disk space on cloudbackup1004 is CRITICAL: DISK CRITICAL - free space: / 934 MB (3% inode=93%): /tmp 934 MB (3% inode=93%): /var/tmp 934 MB (3% inode=93%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=cloudbackup1004&var-datasource=eqiad+prometheus/ops
[23:25:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:30:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:31:49] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 3.117% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[23:56:15] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse