[00:11:25] FIRING: TfInfraTestApplyFailed: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[01:12:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 21 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:17:03] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[02:41:48] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[02:46:48] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[03:01:48] RESOLVED: [2x] PuppetConstantChange: Puppet performing a change on every puppet run on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[04:15:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-35 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[05:12:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 21 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:12:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 21 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:14:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-35 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[13:12:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 21 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:53:34] Quarry: Queries take forever to run - https://phabricator.wikimedia.org/T367654 (GTrang) NEW
[13:54:04] Quarry: Queries take forever to run - https://phabricator.wikimedia.org/T367654#9896074 (GTrang)
[13:56:17] cloud-services-team (FY2023/2024-Q3-Q4), Quarry: [bug] Access denied for user 'quarry'@'172.16.2.72' (using password: NO) - https://phabricator.wikimedia.org/T365374#9896076 (GTrang) >>! In T365374#9890676, @Liz wrote: > I'm not getting that message any more but some queries are just running endlessl...
[14:48:59] Tools: supercount @ toolforge is timing out - https://phabricator.wikimedia.org/T366584#9896118 (Titore) Open→Resolved a:Titore Looks like it's back up and running.
[14:49:31] FIRING: ToolsToolsDBReplicationMissing: ToolsDB replication is not running on tools-db-1 (errno 0) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationMissing
[14:49:31] FIRING: ToolsToolsDBReplicationError: ToolsDB replication is broken on tools-db-3 (errno 1595) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationError
[15:22:03] Quarry: Queries take forever to run - https://phabricator.wikimedia.org/T367654#9896126 (SD0001) →Duplicate dup:T367464
[15:22:24] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896128 (SD0001)
[15:32:41] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "math" project Buster deprecation - https://phabricator.wikimedia.org/T367540#9896134 (Physikerwelt) Open→Resolved a:Physikerwelt It was shut down for a while and has now been deleted.
[16:21:56] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896165 (Liz) Now, there is a little grey box saying "Explain" at the bottom of the page under Query Status (see [[ https://quarry.wmcloud.org/history/83084/901583/875059 | this Quarry query ]] ) that just produces a tabl...
[17:12:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 23 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:27:40] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896220 (Novem_Linguae) >>! In T367464#9896165, @Liz wrote: > Now, there is a little grey box saying "Explain" at the bottom of the page under Query Status (see [[ https://quarry.wmcloud.org/history/83084/901583/875059 |...
[17:32:46] Toolforge: Cannot output "Ερευνητής Αλήθειας" - https://phabricator.wikimedia.org/T367663 (Soda) NEW
[18:43:12] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "mailman" project Buster deprecation - https://phabricator.wikimedia.org/T367538#9896289 (Ladsgroup) Open→Resolved a:Ladsgroup I just deleted the VM.
[19:04:11] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896329 (SD0001) >>! In T367464#9896220, @Novem_Linguae wrote: >>>! In T367464#9896165, @Liz wrote: >> Now, there is a little grey box saying "Explain" at the bottom of the page under Query Status (see [[ https://quarry.w...
[19:06:15] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896332 (SD0001) It's surprising that this issue still occurs with pre-ping enabled – Trove connections from the pool are now validated to ensure liveness before being used. Perhaps I don't understand SQLAlchemy well enou...
[19:15:22] Quarry, Patch-Needs-Improvement: EXPLAIN is broken because new analytics wiki replica cluster contains multiple servers - https://phabricator.wikimedia.org/T205214#9896350 (SD0001) Open→Resolved a:SD0001 The EXPLAIN capability is back up now! There's a UI issue though - as this feature...
[19:26:47] Tool-citationanalyzer: Citation Analyzer is currently unavailable - https://phabricator.wikimedia.org/T367489#9896356 (Aklapper) Open→Invalid Closing as invalid. For future reference, please fill out ALL sections in the bug report template, instead of deleting them. Thanks.
[19:33:26] Quarry, Patch-Needs-Improvement: EXPLAIN is broken because new analytics wiki replica cluster contains multiple servers - https://phabricator.wikimedia.org/T205214#9896369 (github-toolforge-bot) siddharthvp opened https://github.com/toolforge/quarry/pull/56
[19:33:35] siddharthvp opened https://github.com/toolforge/quarry/pull/56
[19:57:44] Tools: Investigate AdvertiseDetector Outage - https://phabricator.wikimedia.org/T367292#9896387 (Peachey88)
[19:59:03] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896388 (Wurgl) For my query: https://quarry.wmcloud.org/query/83617 When I wait for about 20 Minutes and click on "Explain": Nothing happens When I start the query an click on "Explain" immediate after starting, I see s...
[20:25:26] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896408 (Liz) But hitting "Explain" fills up the part of the page with a table where the query results should be plus it is in code that only a developer or coder would understand, not us mere editors. So, it's not help...
[20:36:20] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896411 (Wurgl) Liz: Explain tells you in which order the query is processed and which index (if there is some) is used. This is an output of the database itself and you may find some (more or less) helpful information on...
[20:38:57] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "pki" project Buster deprecation - https://phabricator.wikimedia.org/T367546#9896413 (Andrew) Open→Resolved a:Andrew These VMs have been upgraded in place, according to T363829
[20:51:11] Quarry: [bug] Quarry queries not completing - https://phabricator.wikimedia.org/T367464#9896436 (Novem_Linguae) Results will go under the explain. I forget if the explain disappears. The explain is useful to the sql query writer, because you can use it to see why your query is slow. For example you can see...
[21:04:00] Quarry, Patch-Needs-Improvement: EXPLAIN is broken because new analytics wiki replica cluster contains multiple servers - https://phabricator.wikimedia.org/T205214#9896453 (github-toolforge-bot) siddharthvp closed https://github.com/toolforge/quarry/pull/56
[21:04:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-35 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[21:04:07] siddharthvp closed https://github.com/toolforge/quarry/pull/56
[21:04:33] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-35 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[21:09:18] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-35 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[21:12:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 27 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:37:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance tf-bastion on project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:38:28] FIRING: [2x] PuppetAgentNoResources: No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:38:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance etcd-discovery-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:39:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:40:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance metricsinfra-puppetserver-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:44:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance project-proxy-puppetserver-1 on project project-proxy - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:49:28] FIRING: [2x] PuppetAgentNoResources: No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:52:28] RESOLVED: PuppetAgentNoResources: No Puppet resources found on instance tf-bastion on project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:53:28] RESOLVED: [2x] PuppetAgentNoResources: No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:53:28] FIRING: [4x] PuppetAgentNoResources: No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:58:43] FIRING: [3x] PuppetAgentNoResources: No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:59:28] FIRING: [2x] PuppetAgentNoResources: No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:00:28] RESOLVED: PuppetAgentNoResources: No Puppet resources found on instance metricsinfra-puppetserver-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:04:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance gitlab-runners-puppetserver-01 on project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:04:28] RESOLVED: PuppetAgentNoResources: No Puppet resources found on instance project-proxy-puppetserver-1 on project project-proxy - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:04:28] RESOLVED: [2x] PuppetAgentNoResources: No Puppet resources found on instance bastion on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:08:28] RESOLVED: [4x] PuppetAgentNoResources: No Puppet resources found on instance cloudinfra-internal-puppetserver-1 on project cloudinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:08:43] FIRING: [4x] PuppetAgentNoResources: No Puppet resources found on instance cvn-apache10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:13:43] FIRING: [2x] PuppetAgentNoResources: No Puppet resources found on instance cvn-app10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:23:43] RESOLVED: [2x] PuppetAgentNoResources: No Puppet resources found on instance cvn-app10 on project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:24:28] RESOLVED: PuppetAgentNoResources: No Puppet resources found on instance gitlab-runners-puppetserver-01 on project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources