[00:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[00:17:03] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:19:23] (HAProxyBackendUnavailable) firing: HAProxy service nova-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[00:22:03] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:36:24] (PS1) Jforrester: Add bluespice/mw-config and bluespice/mw-config/overrides [labs/codesearch] - https://gerrit.wikimedia.org/r/990331 (https://phabricator.wikimedia.org/T354852)
[00:39:15] (PS2) Jforrester: Add bluespice/mw-config and bluespice/mw-config/overrides [labs/codesearch] - https://gerrit.wikimedia.org/r/990331 (https://phabricator.wikimedia.org/T354852)
[00:39:40] VPS-project-Codesearch, Patch-For-Review: Add bluespice/mw-config/overrides to CodeSearch - https://phabricator.wikimedia.org/T354852 (Jdforrester-WMF) a:Jdforrester-WMF
[01:29:22] (HAProxyBackendUnavailable) resolved: HAProxy service nova-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[01:38:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[02:02:12] (SystemdUnitDown) firing: The systemd unit purge_vm_rbd_images.service on node cloudcontrol1005 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1005 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[02:05:41] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[02:08:37] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 26899 bytes in 0.276 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[03:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[04:01:57] (SystemdUnitDown) resolved: The service unit purge_vm_rbd_images.service is in failed status on host cloudcontrol1005. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1005 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:01:59] (SystemdUnitDown) resolved: The systemd unit purge_vm_rbd_images.service on node cloudcontrol1005 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1005 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:27:13] Tools: Tool:Panoviewer - Grid Engine web service cannot be reached. - https://phabricator.wikimedia.org/T354949 (Peachey88) Adding Maintainers per https://toolsadmin.wikimedia.org/tools/id/panoviewer (although it's missing a toolsinfo to provide more information such as correct issue tracker location).
[04:38:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[06:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[07:43:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[09:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[09:17:05] Tool-bub2: Uploads failing for Panjab Digital Library - https://phabricator.wikimedia.org/T354351 (wassan.anmol117) Fixed this with the latest [[ https://github.com/coderwassananmol/BUB2/commit/71e9aa533310169658fb7dacb6e5ec5bc0582c79 | commit ]]
[10:43:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[11:44:29] Tool-bub2: Uploads failing for Panjab Digital Library - https://phabricator.wikimedia.org/T354351 (wassan.anmol117) a:wassan.anmol117
[11:44:43] Tool-bub2: Uploads failing for Panjab Digital Library - https://phabricator.wikimedia.org/T354351 (wassan.anmol117) Open→Resolved
[12:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[12:44:03] (CR) Ladsgroup: [C: +2] Add bluespice/mw-config and bluespice/mw-config/overrides [labs/codesearch] - https://gerrit.wikimedia.org/r/990331 (https://phabricator.wikimedia.org/T354852) (owner: Jforrester)
[12:44:57] (Merged) jenkins-bot: Add bluespice/mw-config and bluespice/mw-config/overrides [labs/codesearch] - https://gerrit.wikimedia.org/r/990331 (https://phabricator.wikimedia.org/T354852) (owner: Jforrester)
[12:50:04] VPS-project-Codesearch, Patch-For-Review: Add bluespice/mw-config/overrides to CodeSearch - https://phabricator.wikimedia.org/T354852 (Ladsgroup) Open→Resolved It'll be deployed in a day or two.
[13:43:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[14:35:03] (InstanceDown) firing: Project tools instance tools-k8s-worker-84 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:40:23] (ToolforgeKubernetesNodeNotReady) firing: Kubernetes node tools-k8s-worker-84 is not ready - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[15:10:03] (InstanceDown) resolved: Project tools instance tools-k8s-worker-84 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:10:23] (ToolforgeKubernetesNodeNotReady) resolved: Kubernetes node tools-k8s-worker-84 is not ready - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[15:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[15:20:22] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[15:25:22] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[16:43:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[18:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[18:52:25] Tool-wikiloves: Include Wiki Loves Bangla in the Wikiloves tool - https://phabricator.wikimedia.org/T355006 (Bodhisattwa)
[19:48:03] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[21:14:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[22:48:04] (PuppetAgentNoResources) firing: No Puppet resources found on instance toolsbeta-bastion-6 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[22:57:07] Grid-Engine-to-K8s-Migration: Migrate request from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320003 (FNDE) Hi all and @Tkarcher, Please find below a summary of the current status of the migration regarding the `request` tool: - [x] PHP webservice - [x] ENV vars - [x]...
[23:00:05] Grid-Engine-to-K8s-Migration: Migrate request from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320003 (FNDE) @komla could you move this ticket to Backlog?
[23:49:31] Tools: Tool:Panoviewer - Grid Engine web service cannot be reached. - https://phabricator.wikimedia.org/T354949 (Sdkb) p:Triage→Unbreak!