[00:00:09] Toolforge Build Service, cloud-services-team, Documentation: Document how to update heroku-builder:22 container image - https://phabricator.wikimedia.org/T353566 (dcaro) I added some hints in the other task, all you need to do is: podman pull heroku/builder:22 podman login tools-harbor.wmcloud.org (...
[00:15:01] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) I agree we should upgrade asap, but I'd also like to find a t...
[00:15:38] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri)
[00:15:41] Data-Services: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206 (fnegri)
[00:16:21] Data-Services: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206 (fnegri) p:Medium→High
[00:16:25] Data-Services: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206 (fnegri)
[00:16:28] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) p:High→Unbreak!
[00:31:19] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[00:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:36:19] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[02:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[02:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[03:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[05:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[05:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[06:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[07:13:19] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[07:18:19] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:14:01] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) I'm gonna try using jemalloc as suggested in [this stackoverf...
[08:14:18] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) a:dcaro→fnegri
[08:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[08:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[09:06:37] Tool-hitaden, Toolforge Build Service: [buildservice,nodejs] nodejs buildpack does not take envvars into account - https://phabricator.wikimedia.org/T353557 (dcaro) I think I know what happened, heroku deprecated the heroku/builder-classic:22, that uses the old, non-cnb buildpacks, and the supported buil...
[09:18:27] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, Patch-For-Review, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) I think jemalloc did the trick! Memory...
[09:18:29] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, Patch-For-Review, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) p:Unbreak!→High
[09:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[10:18:41] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, Patch-For-Review, User-dcaro: [toolsdb] MariaDB process is killed by OOM killer (December 2023) - https://phabricator.wikimedia.org/T353093 (fnegri) Open→In progress
[10:29:59] Toolforge: [build-serviece,clojure] Current supported heroku builder does not yet include clojure support - https://phabricator.wikimedia.org/T353575 (dcaro)
[11:13:38] (CephSlowOps) firing: Ceph cluster in eqiad has 2 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[11:14:20] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T352570 (phaultfinder)
[11:18:37] (CephSlowOps) resolved: Ceph cluster in eqiad has 8 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps
[11:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[11:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[12:41:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:56:19] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[14:01:20] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[14:37:59] (Abandoned) Reputation22: Update backend architecture for huge traffic [labs/tools/VideoCutTool] - https://gerrit.wikimedia.org/r/942508 (https://phabricator.wikimedia.org/T342810) (owner: Reputation22)
[14:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[14:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[14:57:11] Tool-hitaden, Toolforge Build Service: [buildservice,nodejs] nodejs buildpack does not take envvars into account - https://phabricator.wikimedia.org/T353557 (Lofhi) It seems so complicated. The `npm cli` or `npm install` choice is only depending of the major version of npm. https://github.com/heroku/buil...
[15:04:45] Tool-hitaden, Toolforge Build Service: [buildservice,nodejs] nodejs buildpack does not take envvars into account - https://phabricator.wikimedia.org/T353557 (Lofhi) Something I don't understand is: if the Build Service is already using Cloud Native Buildpacks with Builder:22, why using `pnpm` is not supp...
[15:10:31] Tool-hitaden, Toolforge Build Service: [buildservice,nodejs] nodejs buildpack does not take envvars into account - https://phabricator.wikimedia.org/T353557 (Lofhi) > This buildpack's bin/detect will only pass if a pnpm-lock.json exists in the project root. This is done to prevent the buildpack from prov...
[15:12:59] Tool-hitaden, Toolforge Build Service: [buildservice,nodejs] nodejs buildpack does not take envvars into account - https://phabricator.wikimedia.org/T353557 (Lofhi) It seems that the documentation is just wrong and the YAML lockfile is indeed looked up. Running out of explanations on my side! https://git...
[15:41:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:16:54] (PS1) Majavah: shared: lighttpd: fix override file path [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/983520
[16:17:59] (PS2) Majavah: shared: lighttpd: fix override file path [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/983520 (https://phabricator.wikimedia.org/T293552)
[17:32:51] Toolforge Build Service: `build delete` gives a confusing error message on a non-existent build - https://phabricator.wikimedia.org/T353583 (taavi)
[17:33:13] Toolforge Build Service: `build delete` gives a confusing error message on a non-existent build - https://phabricator.wikimedia.org/T353583 (taavi)
[17:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[17:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[18:08:50] Toolforge Build Service, Patch-For-Review: `build delete` gives a confusing error message on a non-existent build - https://phabricator.wikimedia.org/T353583 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/68 auth: Clarify error message
[18:12:34] (SystemdUnitDown) firing: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[18:17:33] (SystemdUnitDown) resolved: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[18:36:46] Tool-hitaden, Toolforge Build Service: [buildservice,nodejs] nodejs buildpack does not take envvars into account - https://phabricator.wikimedia.org/T353557 (Lofhi) Successfully deployed the (broken) webservice from a built image: https://hitaden.toolforge.org/ So the envvars buildpack support removed is...
[18:41:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[18:58:01] Grid-Engine-to-K8s-Migration: Migrate qic from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319982 (Mike_Peel) Oh, me again. On it...
[19:11:21] Tool-hitaden, Codex, Design-Systems-Team: Lookup: unconsistensy between "value" and "label" Vue props and usage of FormData interface on a form - https://phabricator.wikimedia.org/T353116 (Lofhi)
[19:12:33] (SystemdUnitDown) firing: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[19:14:12] Tool-tool-watch: Sort tools based on Title | ToolWatch - https://phabricator.wikimedia.org/T353579 (Gopavasanth)
[19:16:18] Tool-tool-watch: Improve the functionality of Toolwatch - https://phabricator.wikimedia.org/T353573 (Gopavasanth)
[19:16:31] Tool-tool-watch, Technical-Tool-Request: Tool Request: ToolForge Health Dashboard Tool (ToolWatch) - https://phabricator.wikimedia.org/T341379 (Gopavasanth)
[19:17:33] (SystemdUnitDown) resolved: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[19:38:18] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (Mike_Peel) So I must be missing something, but I was running this at /data/project/pibot/wikicode: ` jsub query_run.sh ` which worked nicely. However, the new system se...
[19:40:39] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (taavi) >>! In T319967#9411221, @Mike_Peel wrote: > /bin/sh: 1: /data/project/pibot/query_run.sh: not found You seem to be missing a directory in the file name: `lang=shel...
[19:46:17] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (Mike_Peel) Ah, well spotted, thanks! That moves the problem on a stage, where it now says: ` /data/project/pibot/wikicode/query_run.sh: 1: python3: not found `
[19:57:36] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (taavi) It seems like you're using the `bullseye` image which only has the base Debian 11 setup and is mostly intended for compiled languages. Try using a [[ https://wikit...
[19:59:10] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (Mike_Peel) Can't I use my local environment, which loads when I log in?
[20:00:37] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (taavi) The "virtual" in virtual environments means that it would still need a Python installation on the container image.
[20:06:15] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (Mike_Peel) The setup works fine with jsub, how do I just use the same environment with this new setup?
[20:34:11] (CR) BryanDavis: [C: +2] "Nice find." [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/983520 (https://phabricator.wikimedia.org/T293552) (owner: Majavah)
[20:34:48] (Merged) jenkins-bot: shared: lighttpd: fix override file path [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/983520 (https://phabricator.wikimedia.org/T293552) (owner: Majavah)
[20:40:19] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[20:44:28] Cloud-VPS, cloud-services-team, CAS-SSO, Infrastructure-Foundations: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (CCicalese_WMF)
[20:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[20:45:19] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[20:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[20:56:23] Grid-Engine-to-K8s-Migration: Migrate pibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319967 (taavi) You would need to use a Python-specific image and a venv created in a container as documented on https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python.
[20:58:03] Cloud-VPS, cloud-services-team: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (taavi) How are you planning to expose the service on a non-default port?
[20:59:57] wikitech.wikimedia.org, Parsoid: Wikitech job runner often OOMs on parsoidCachePrewarm - https://phabricator.wikimedia.org/T353587 (taavi)
[21:05:23] Cloud-VPS, cloud-services-team: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (CCicalese_WMF) Open→Stalled Excellent question. That won't work. Let me figure out an alternative. Thanks for pointing that out.
[21:08:10] wikitech.wikimedia.org, Parsoid: Wikitech job runner often OOMs on parsoidCachePrewarm - https://phabricator.wikimedia.org/T353587 (ssastry) Do you have access to the titles where this is happening? That would help the performance debugging.
[21:23:31] Tool-tool-watch: Sort tools based on Title | ToolWatch - https://phabricator.wikimedia.org/T353579 (Aklapper) >>! In T353579#9411198, @Gopavasanth wrote: > can you help us create a project tag #toolwatch in Phabricator Please see https://www.mediawiki.org/wiki/Phabricator/Creating_and_renaming_projects link...
[21:40:54] Cloud-VPS, cloud-services-team: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (CCicalese_WMF) Stalled→Open
[21:41:09] Cloud-VPS, cloud-services-team: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (CCicalese_WMF) I updated the requested redirect URL to "https://catalyst.wmcloud.org/auth".
[21:41:17] Tools, WMDE-TechWish-Maintenance, Release-Engineering-Team (Quid Pro Crow 🦃): Delete technischewuensche tool code repository in Diffusion - https://phabricator.wikimedia.org/T349847 (Aklapper) Open→Resolved Thanks! Per https://wikitech.wikimedia.org/wiki/Phabricator#Remove_a_repo , ` aklappe...
[21:46:04] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[21:46:33] Cloud-VPS, cloud-services-team: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (taavi) Open→Resolved a:taavi You should now be able to use OAuth with a redirect URL on that domain. `lang=diff - service_id: https://cat...
[21:56:55] Cloud-VPS, cloud-services-team: Update redirect URL for 'catalyst' OpenID Connect client on idp.wmcloud.org - https://phabricator.wikimedia.org/T353586 (CCicalese_WMF) Thank you!
[22:00:56] (ToolsGridQueueProblem) firing: (3) Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem
[22:01:26] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.grid.cleanup_queue_errors
[22:01:29] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.grid.cleanup_queue_errors (exit_code=0)
[22:05:56] (ToolsGridQueueProblem) resolved: (3) Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem
[23:37:04] Tools: montage tool violates the Toolforge database connection handling policy - https://phabricator.wikimedia.org/T353554 (mahmoud) Hey there! Sorry for the inconvenience, going to have to plead ignorance on this one. Just curious, how long has this policy been around? As for the behavior, we are using SQL...
[23:45:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[23:45:34] Tools: montage tool violates the Toolforge database connection handling policy - https://phabricator.wikimedia.org/T353554 (mahmoud) Just tried a restart while monitoring the following query: ` [s53490__montage]> show status where variable_name = 'Threads_connected'; +-------------------+-------+ | Variabl...
[23:50:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed