[00:01:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[00:01:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[00:10:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:11:49] (TfInfraTestDestroyFailed) resolved: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[00:11:49] (TfInfraTestApplyFailed) resolved: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[00:15:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:20:05] Data-Services, cloud-services-team, Data-Platform, Patch-For-Review: Add global_edit_count to wikireplicas - https://phabricator.wikimedia.org/T344108 (lbowmaker)
[00:42:14] Grid-Engine-to-K8s-Migration: Migrate zoomviewer from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320210 (tstarling) It's fast when I take my network out of the loop by doing requests with `ab` on the server. Also, it's 2x faster (13s -> 7s) when I disable HTTP/2 in my...
[01:12:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:17:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:24:04] Cloud-VPS, MediaWiki-Vagrant, Patch-For-Review: Update Vagrant puppet role to work on Bookworm. - https://phabricator.wikimedia.org/T356551 (bd808) >>! In T356551#9532560, @taavi wrote: > With the [[ https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1049999 | non-free licensing status of modern Vagr...
[01:40:01] (NovafullstackSustainedFailures) firing: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[02:24:09] (CR) Eugene233: "So does that mean we can abandon this fix?" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991814 (owner: Josefanthony)
[03:36:28] Grid-Engine-to-K8s-Migration: Migrate zoomviewer from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320210 (tstarling) Open→Resolved Performance seems unrelated so can be discussed elsewhere. I migrated prune.sh to a scheduled job. I removed the rest of the cronta...
[03:38:24] (PS1) Eugene233: Backslashes in some ISA messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002422 (https://phabricator.wikimedia.org/T299863)
[03:38:53] (Abandoned) Eugene233: Backslashes in some ISA messages [labs/tools/Isa] (m2c) - https://gerrit.wikimedia.org/r/990674 (https://phabricator.wikimedia.org/T299863) (owner: Eugene233)
[03:40:57] (CR) Eugene233: [C: +1] Backslashes in some ISA messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002422 (https://phabricator.wikimedia.org/T299863) (owner: Eugene233)
[04:01:56] (CR) Amire80: [C: +1] Backslashes in some ISA messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002422 (https://phabricator.wikimedia.org/T299863) (owner: Eugene233)
[04:09:59] Data-Services, cloud-services-team (Kanban): 2021-06-15: Tools NFS share cleanup - https://phabricator.wikimedia.org/T284964 (tstarling)
[04:10:32] Tools: zoomviewer taking up a lot of NFS space -- please clean up - https://phabricator.wikimedia.org/T285018 (tstarling) Open→Resolved a:tstarling I reduced the expiry time to 30 days. Also, I fixed a bug causing originals to be deleted less than 1 day after download. Previously, pyramids wer...
[04:30:48] (PS2) Eugene233: Correct mistypes in messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/954937
[04:35:05] Tools, Commons: ZoomViewer produces a 503 error - https://phabricator.wikimedia.org/T343796 (tstarling) In the new job system, vips was killed when it ran with the default limit of 500MB, but it completed when I ran it manually with a 6GB memory limit. So I'll change the memory limit in the source. I g...
[04:41:35] (CR) Eugene233: [C: +2] Backslashes in some ISA messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002422 (https://phabricator.wikimedia.org/T299863) (owner: Eugene233)
[04:42:01] (Merged) jenkins-bot: Backslashes in some ISA messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002422 (https://phabricator.wikimedia.org/T299863) (owner: Eugene233)
[04:54:07] (PS1) Eugene233: Update README with client side build instructions [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002423 (https://phabricator.wikimedia.org/T355467)
[04:55:00] (Abandoned) Eugene233: Update README with client side build instructions [labs/tools/Isa] (m2c) - https://gerrit.wikimedia.org/r/991810 (https://phabricator.wikimedia.org/T355467) (owner: Eugene233)
[05:16:20] (CR) Eugene233: "Is this fix related/same as" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/998432 (owner: Juniorbesong)
[05:17:01] (CR) Amire80: Update README with client side build instructions (1 comment) [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1002423 (https://phabricator.wikimedia.org/T355467) (owner: Eugene233)
[05:17:42] (CloudVPSDesignateLeaks) firing: (2) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:20:36] (CR) Eugene233: worked on footer appears at the bottom (1 comment) [labs/tools/Isa] - https://gerrit.wikimedia.org/r/997831 (owner: Josefanthony)
[05:21:22] (CR) Eugene233: "review" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/997831 (owner: Josefanthony)
[05:40:16] (NovafullstackSustainedFailures) firing: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[06:30:48] Tools, Commons: ZoomViewer produces a 503 error - https://phabricator.wikimedia.org/T343796 (tstarling) Open→Resolved a:tstarling Deployed, purged cache, refreshed. Worked very nicely this time, no errors. I increased the CPU limit to 1 core (from 0.5).
[07:44:08] Toolforge (Quota-requests): Request increased memory quota for wd-shex-infer Toolforge tool - https://phabricator.wikimedia.org/T357209 (Raymond_Ndibe) @dcaro do we have a way to automatically handle requests like this?
[07:46:13] Toolforge (Software install/update): Add a container for Swift - https://phabricator.wikimedia.org/T354815 (Raymond_Ndibe) >>! In T354815#9502250, @dcaro wrote: > There's a relatively well maintained swift buildpack too: https://github.com/vapor-community/heroku-buildpack @dcaro do you think this is somethi...
[07:48:34] Grid-Engine-to-K8s-Migration: Migrate phetools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319965 (Xover) >>! In T319965#9534722, @Soda wrote: > I'm looking into migrating some of the usable aspects (statistics and match + split) of phetools into seperate standalone...
[08:00:13] Toolforge (Toolforge iteration 05): [toolforge-cd] discuss the possibility of removing tests from merge request ci/cd pipelines - https://phabricator.wikimedia.org/T353740 (Raymond_Ndibe) Ok. My initial thought was that we can just make a few changes to the gitlab-ci repo and have it only run tests once, but...
[08:01:17] Toolforge: [toolforge-cd] discuss the possibility of removing tests from merge request ci/cd pipelines - https://phabricator.wikimedia.org/T353740 (Raymond_Ndibe)
[08:02:45] Toolforge: [toolforge-cd] discuss the possibility of eliminating test repetitions from merge request ci/cd pipelines - https://phabricator.wikimedia.org/T353740 (Raymond_Ndibe)
[08:06:48] Toolforge: [toolforge-cd] discuss the possibility of eliminating test repetitions from merge request ci/cd pipelines - https://phabricator.wikimedia.org/T353740 (Raymond_Ndibe) a:Raymond_Ndibe→None
[08:06:52] Toolforge, Tools: zoomviewer runs IO intensive operations locally on tools-webgrid-lighttpd* hosts - https://phabricator.wikimedia.org/T186222 (tstarling) Open→Resolved a:dschwen Thanks for fixing this 6 years ago
[08:36:34] Tools: http (not https) image URLs in zoomviewer manifest files - https://phabricator.wikimedia.org/T287890 (tstarling) Open→Resolved a:tstarling I fixed it.
[09:17:42] (CloudVPSDesignateLeaks) firing: (2) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:25:51] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_node for host tools-k8s-worker-64
[09:26:34] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host tools-k8s-worker-64
[09:26:41] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_node for a worker-nfs role in the tools cluster
[09:29:18] Toolforge: [toolforge-cd] minimize test repetitions from merge request ci/cd pipelines - https://phabricator.wikimedia.org/T353740 (dcaro)
[09:29:40] Toolforge: [toolforge-cd] minimize test repetitions from merge request ci/cd pipelines - https://phabricator.wikimedia.org/T353740 (dcaro) p:Medium→Low
[09:33:13] Grid-Engine-to-K8s-Migration: Migrate phetools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319965 (dcaro) >>! In T319965#9536707, @Xover wrote: >>>! In T319965#9534722, @Soda wrote: >> I'm looking into migrating some of the usable aspects (statistics and match + spl...
[09:36:19] !log taavi@cloudcumin1001 tools Added a new k8s worker-nfs tools-k8s-worker-nfs-21.tools.eqiad1.wikimedia.cloud to the cluster
[09:36:19] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker-nfs role in the tools cluster
[09:38:42] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_node for host toolsbeta-test-k8s-worker-7
[09:39:14] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host toolsbeta-test-k8s-worker-7
[09:39:36] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_node for host toolsbeta-test-k8s-worker-8
[09:40:07] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host toolsbeta-test-k8s-worker-8
[09:40:33] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_node for a worker-nfs role in the toolsbeta cluster
[09:41:37] (ProbeDown) firing: (2) Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:44:15] Toolforge (Quota-requests): Request increased memory quota for wd-shex-infer Toolforge tool - https://phabricator.wikimedia.org/T357209 (dcaro) >>! In T357209#9536704, @Raymond_Ndibe wrote: > @dcaro do we have a way to automatically handle requests like this? Not yet no, feel free to try to create a cookboo...
[09:46:28] (InstanceDown) firing: Project toolsbeta instance toolsbeta-test-k8s-worker-8 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[09:49:42] Toolforge Jobs framework, User-aborrero: toolforge-jobs job emails should have information on why events happened - https://phabricator.wikimedia.org/T306310 (aborrero)
[09:50:13] !log taavi@cloudcumin1001 toolsbeta Added a new k8s worker-nfs toolsbeta-test-k8s-worker-nfs-4.toolsbeta.eqiad1.wikimedia.cloud to the cluster
[09:50:13] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker-nfs role in the toolsbeta cluster
[09:51:28] (InstanceDown) resolved: Project toolsbeta instance toolsbeta-test-k8s-worker-8 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[09:52:12] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_node for a ingress role in the toolsbeta cluster
[09:57:43] Grid-Engine-to-K8s-Migration: Migrate phetools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319965 (Soda) >>! In T319965#9536707, @Xover wrote: >>>! In T319965#9534722, @Soda wrote: >> I'm looking into migrating some of the usable aspects (statistics and match + spli...
[09:58:16] Toolforge Jobs framework, User-aborrero: toolforge-jobs: fix pkg_resources deprecation warning - https://phabricator.wikimedia.org/T357387 (aborrero)
[09:59:56] !log taavi@cloudcumin1001 toolsbeta Added a new k8s ingress toolsbeta-test-k8s-ingress-7.toolsbeta.eqiad1.wikimedia.cloud to the cluster
[09:59:56] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a ingress role in the toolsbeta cluster
[10:03:23] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_node for host toolsbeta-test-k8s-ingress-3
[10:03:53] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host toolsbeta-test-k8s-ingress-3
[10:04:16] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_node for a ingress role in the toolsbeta cluster
[10:05:09] Toolforge Jobs framework: toolforge jobs current image alias - https://phabricator.wikimedia.org/T357388 (tstarling)
[10:08:30] Toolforge Jobs framework, User-aborrero: toolforge jobs current image alias - https://phabricator.wikimedia.org/T357388 (aborrero) p:Triage→Medium
[10:09:06] Toolforge (Software install/update), Toolforge Build Service: Add a container for Swift - https://phabricator.wikimedia.org/T354815 (dcaro) p:Triage→Low >>! In T354815#9536706, @Raymond_Ndibe wrote: >>>! In T354815#9502250, @dcaro wrote: >> There's a relatively well maintained swift buildpack too...
[10:09:55] Toolforge: rm'ing a specific file on NFS hangs on (dev|login).toolforge.org - https://phabricator.wikimedia.org/T357340 (Count_Count) Open→Resolved a:Count_Count Not sure what happened but for some reason the file is gone now.
[10:11:20] !log taavi@cloudcumin1001 toolsbeta Added a new k8s ingress toolsbeta-test-k8s-ingress-8.toolsbeta.eqiad1.wikimedia.cloud to the cluster
[10:11:20] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a ingress role in the toolsbeta cluster
[10:11:24] Toolforge Build Service: Add a container for Swift - https://phabricator.wikimedia.org/T354815 (taavi)
[10:17:58] Grid-Engine-to-K8s-Migration: Migrate women-in-red from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320183 (dcaro) >>! In T320183#9535032, @Ragesoss wrote: > @dcaro I just disabled the cron. \o/ thanks! Feel free to close this task whenever you are sure everything works...
[11:28:48] Grid-Engine-to-K8s-Migration: Migrate navlink-recommendation from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319920 (dcaro) @Ashwinpp, Hi, you are listed as a maintainer of this tool with @Nirzar (see https://toolsadmin.wikimedia.org/tools/id/navlink-recommendation), ca...
[11:31:37] Grid-Engine-to-K8s-Migration: Migrate noclaims from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319927 (dcaro) Hi @Multichill, have you been able to take a look at this? It seems it's still running on the cron on the grid. Let me know if you are having issues/find bugs/e...
[11:37:46] Grid-Engine-to-K8s-Migration: Migrate projektneuheiten-feed from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319978 (dcaro) Pinged the user (hopefully the correct one xd) on their talk page: https://de.wikipedia.org/wiki/Benutzer_Diskussion:Manuel_Bieling#Migrate_projekt...
[11:40:09] Grid-Engine-to-K8s-Migration: Migrate render from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320001 (dcaro) @daniel @kai.nissen can you take a look at this tool? (you are listed as maintainers too https://toolsadmin.wikimedia.org/tools/id/render) It will be turned off t...
[11:40:53] Grid-Engine-to-K8s-Migration: Migrate render-tests from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320002 (dcaro) @daniel @kai.nissen can you take a look at this tool too? (you are listed as maintainers too https://toolsadmin.wikimedia.org/tools/id/render) It will be tu...
[11:51:23] Grid-Engine-to-K8s-Migration: Migrate ruwn-misc from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320024 (dcaro) @korzhimanov Hi! You are listed as maintainer of this tool also, can you give it a look? The tool will be prevented from running on the Grid tomorrow if you do...
[11:56:37] Grid-Engine-to-K8s-Migration: Migrate shrinitools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320037 (dcaro) Thanks @Tshrinivasan, I'll close this task for now, if you want to delete the tool please mark it for deletion using https://wikitech.wikimedia.org/wiki/Help...
[11:57:03] Grid-Engine-to-K8s-Migration: Migrate ruwn-misc from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320024 (korzhimanov) Thanks! Will take a look a bit later today
[11:58:14] Grid-Engine-to-K8s-Migration: Migrate status from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320058 (dcaro) I think this is done yes, nothing has run for a while, @Platonides can you close the task as resolved if that's the case? Thanks!
[12:01:38] Grid-Engine-to-K8s-Migration: Migrate superyetkin from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320070 (dcaro) Hi @Superyetkin, how is the migration going? Is there something blocking you/bugs/etc. that I can help with?
[12:14:53] Grid-Engine-to-K8s-Migration: Migrate enwikt-translations from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319724 (dcaro) Sorry for the delay, I missed the comment. > I don't know how to translate this into building an image, but I guess the first question is how to bui...
[12:17:41] (CR) Eugene233: "Review" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/995350 (owner: AgnesAbah)
[12:18:50] (CR) Eugene233: "Is this a test fix? It seems like it aims at testing the process?" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/995349 (owner: AgnesAbah)
[12:22:14] Grid-Engine-to-K8s-Migration, urbanecmbot, Patch-For-Review, User-Urbanecm: Migrate urbanecmbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320108 (dcaro) @Urbanecm Hi! I don't see any processes on the grid anymore, if you finished migrating, can you clo...
[12:22:53] Grid-Engine-to-K8s-Migration, User-Dereckson: Migrate wikidata-nolabels from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320152 (dcaro) Hi @Dereckson, we are going to be stopping any Grid processes for any tools that did not migrate yet to kubernetes tomorrow, I see...
[12:23:45] (CR) Eugene233: Reviewed and improved the code comments on the Isa tool (2 comments) [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991970 (owner: Afimaame)
[12:25:27] (CR) Eugene233: "There seems to be a merge conflict with this fix can you do a rebase and submit again?" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991948 (owner: Ikeadeoyin)
[12:25:32] Grid-Engine-to-K8s-Migration: Migrate wmds-archive from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320179 (dcaro) Hi @Tgr, I see no processes running on the grid anymore, were you able to migrate your tool? If so, can you resolve this task? If not, is there anything th...
[12:27:01] Grid-Engine-to-K8s-Migration: Migrate render from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320001 (daniel) >>! In T320001#9537226, @dcaro wrote: > @daniel @kai.nissen can you take a look at this tool? (you are listed as maintainers too https://toolsadmin.wikimedia.org...
[12:27:41] Grid-Engine-to-K8s-Migration, urbanecmbot, Patch-For-Review, User-Urbanecm: Migrate urbanecmbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320108 (Urbanecm) Hi @dcaro, thanks for asking. I'm having troubles with migrating the webservice. I have a mix of...
[12:33:55] Grid-Engine-to-K8s-Migration, Chinese-Sites: Migrate zhwiki from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320205 (dcaro) Hi @liangent, I see this tool is still running on the Grid, it will be stopped tomorrow (just the grid processes, the tool and all the data are...
[12:35:43] Grid-Engine-to-K8s-Migration: Migrate wordpile from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320184 (dcaro) @Asaf Hi! Any updates on the migration of this tool? It will be stopped from running on the Grid tomorrow unless you ask for an extension (tops 1 month). It se...
[12:40:36] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991970 (owner: Afimaame)
[12:40:51] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/995349 (owner: AgnesAbah)
[12:41:00] (CR) CI reject: [V: -1] Reviewed and improved the code comments on the Isa tool [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991970 (owner: Afimaame)
[12:41:06] (PS1) Majavah: inventory: Fix cloudcontrol1006 hostname [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002971
[12:41:13] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991948 (owner: Ikeadeoyin)
[12:41:21] (CR) CI reject: [V: -1] fix werkzeug.url error in T355466 [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991948 (owner: Ikeadeoyin)
[12:41:29] (CR) Majavah: [C: +2] inventory: Fix cloudcontrol1006 hostname [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002971 (owner: Majavah)
[12:41:33] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/995350 (owner: AgnesAbah)
[12:41:37] (ProbeDown) firing: (2) Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:41:45] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/997831 (owner: Josefanthony)
[12:41:58] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/998432 (owner: Juniorbesong)
[12:42:04] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991814 (owner: Josefanthony)
[12:42:07] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883 (dcaro) >>! In T319883#9475600, @Ghuron wrote: > Thanks for the detailed responses, but I feel that one piece is still missing in the puzzle. As you can see, for instance [[...
[12:42:14] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/994376 (owner: Josefanthony)
[12:42:21] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/995348 (owner: AgnesAbah)
[12:42:23] (CR) CI reject: [V: -1] BUG: T320500 modified isa/campaigns/image_updater.py [labs/tools/Isa] - https://gerrit.wikimedia.org/r/998432 (owner: Juniorbesong)
[12:42:25] (CR) CI reject: [V: -1] campaign table is broken but fixed [labs/tools/Isa] - https://gerrit.wikimedia.org/r/994376 (owner: Josefanthony)
[12:43:33] Grid-Engine-to-K8s-Migration: Migrate phetools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319965 (dcaro) Thanks @Soda, your work is greatly appreciated, we can extend the stopping of the tool one month to allow you to keep moving parts to different codebases/tools.
[12:44:36] Grid-Engine-to-K8s-Migration, User-revi: Migrate revibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320006 (dcaro) Open→Resolved I don't see anything more running on the grid :), thanks, I'll close the task.
[12:45:24] (Merged) jenkins-bot: inventory: Fix cloudcontrol1006 hostname [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002971 (owner: Majavah)
[12:45:37] Grid-Engine-to-K8s-Migration, User-revi: Migrate tc-rc from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320079 (dcaro) Hi @revi, I still see a couple of jobs running for this tool on the Grid, they will be stopped tomorrow unless you ask for an extension (tops 1 mont...
[12:49:59] (PS1) Majavah: openstack: cloudcontrol: fix reboot cookbook [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002975
[12:50:37] (CR) Majavah: [C: +2] openstack: cloudcontrol: fix reboot cookbook [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002975 (owner: Majavah)
[12:53:50] (Merged) jenkins-bot: openstack: cloudcontrol: fix reboot cookbook [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002975 (owner: Majavah)
[12:56:15] Grid-Engine-to-K8s-Migration: Migrate wordpile from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320184 (Ijon) No need for an extension. The tool does not seem to be used much. I'll redeploy it on toolforge once we figure out why erex-yomi isn't working (see comment I tag...
[12:59:10] (GaleraClusterSizeMismatch) firing: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[12:59:22] (HAProxyBackendUnavailable) firing: (13) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:04:10] (GaleraClusterSizeMismatch) resolved: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[13:04:22] (HAProxyBackendUnavailable) firing: (13) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:04:24] (PS1) Majavah: openstack: cloudcontrol: do not try to run network tests [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002978
[13:07:14] (CR) CI reject: [V: -1] openstack: cloudcontrol: do not try to run network tests [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002978 (owner: Majavah)
[13:08:03] (PS2) Majavah: openstack: cloudcontrol: do not try to run network tests [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002978
[13:09:22] (HAProxyBackendUnavailable) resolved: (13) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:14:37] (HAProxyBackendUnavailable) firing: (16) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:15:22] (HAProxyServiceUnavailable) firing: (2) HAProxy service mysql has no available backends on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyServiceUnavailable
[13:15:45] cloud-services-team: HAProxyServiceUnavailable - https://phabricator.wikimedia.org/T357405 (phaultfinder)
[13:16:40] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[13:17:42] (CloudVPSDesignateLeaks) firing: (2) Detected 12 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:18:55] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[13:19:37] (HAProxyBackendUnavailable) firing: (26) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:21:40] (GaleraClusterSizeMismatch) resolved: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[13:22:49] Toolforge Build Service: Build service: Calling nontrivial Procfile commands with arguments results in confusing error (“no such file or directory”) - https://phabricator.wikimedia.org/T356016 (LucasWerkmeister) a:LucasWerkmeister→None
[13:23:00] Cloud-VPS, cloud-services-team: "HAProxy service mysql has no available backends" fires when galera primary is down - https://phabricator.wikimedia.org/T357406 (taavi)
[13:24:08] Grid-Engine-to-K8s-Migration: Migrate wd-shex-infer from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320140 (LucasWerkmeister) As the migration deadline approaches, and I’m still blocked on T357209, I request that you don’t shut down my tool tomorrow until I can actually...
[13:24:37] (HAProxyBackendUnavailable) resolved: (13) HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:25:22] (HAProxyServiceUnavailable) resolved: (2) HAProxy service mysql has no available backends on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyServiceUnavailable
[13:26:19] cloud-services-team: HAProxyServiceUnavailable - https://phabricator.wikimedia.org/T357405 (dcaro) Open→Invalid See {T357406}
[13:26:23] cloud-services-team: HAProxyServiceUnavailable - https://phabricator.wikimedia.org/T357405 (aborrero)
[13:26:28] Cloud-VPS, cloud-services-team: "HAProxy service mysql has no available backends" fires when galera primary is down - https://phabricator.wikimedia.org/T357406 (aborrero)
[13:42:22] (HAProxyBackendUnavailable) firing: (11) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:44:10] (GaleraClusterSizeMismatch) firing: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[13:47:22] (HAProxyBackendUnavailable) firing: (13) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[13:49:10] (GaleraClusterSizeMismatch) resolved: (2) Galera in eqiad1 has 2 nodes -
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[13:52:22] (HAProxyBackendUnavailable) resolved: (13) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[14:32:03] Cloud-VPS, Toolforge, cloud-services-team, Upstream: Add SSHFP dns records to bastions - https://phabricator.wikimedia.org/T132225 (taavi) I think I'd like to have toolforge.org DNSSEC signed before implementing this.
[14:34:05] Grid-Engine-to-K8s-Migration, urbanecmbot, User-Urbanecm: Migrate urbanecmbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320108 (dcaro) @Urbanecm I think there was a typo on your `.lighttpd.conf` file, an extra `=`: ` tools.urbanecmbot@tools-sgebastion-10 ~...
[14:37:32] Grid-Engine-to-K8s-Migration: Migrate wmf-sitematrix from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320180 (dcaro) @Abbe98 Is there a repository with the code I can look at?
(and play with xd)
[14:50:40] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_node for host toolsbeta-test-k8s-ingress-4
[14:51:21] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host toolsbeta-test-k8s-ingress-4
[14:51:30] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_node for host toolsbeta-test-k8s-ingress-5
[14:52:01] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host toolsbeta-test-k8s-ingress-5
[14:52:53] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_node for host toolsbeta-test-k8s-control-4
[14:53:22] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host toolsbeta-test-k8s-control-4
[14:53:57] Grid-Engine-to-K8s-Migration: Migrate superyetkin from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320070 (Superyetkin) Open→Resolved The migration is complete. Thanks.
[14:54:20] Toolforge: Automatically add required taints and labels to ingress nodes - https://phabricator.wikimedia.org/T357425 (taavi)
[14:55:19] !log fran@wmf3169 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot (T356975)
[14:55:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:58:33] PROBLEM - Host cloudvirt-wdqs1001 is DOWN: PING CRITICAL - Packet loss = 100%
[14:59:11] RECOVERY - Host cloudvirt-wdqs1001 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
[14:59:16] !log fran@wmf3169 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) (T356975)
[15:00:41] !log fran@wmf3169 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot (T356975)
[15:04:21] PAWS: Upgrade Jupyterlab - https://phabricator.wikimedia.org/T357027 (rook)
[15:04:55] !log fran@wmf3169 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) (T356975)
[15:05:50] (NeutronAgentDown) firing: Neutron neutron-linuxbridge-agent on cloudvirt-wdqs1002 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[15:06:25] !log fran@wmf3169 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot (T356975)
[15:08:10] Grid-Engine-to-K8s-Migration: Migrate huggle from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319797 (dcaro) @Petrb Hi! That's just the current url of the webservice, it's the same for kubernetes and for grid webservices (https://.toolforge.org), I can see that...
[15:10:40] !log fran@wmf3169 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) (T356975)
[15:10:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[15:17:14] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_node for host tools-k8s-worker-65
[15:17:57] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host tools-k8s-worker-65
[15:19:00] Toolforge (Toolforge iteration 05), cloud-services-team, Kubernetes, Patch-For-Review: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656 (taavi)
[15:19:14] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_node for a worker-nfs role in the tools cluster
[15:25:50] (NeutronAgentDown) resolved: Neutron neutron-linuxbridge-agent on cloudvirt-wdqs1002 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[15:28:51] Grid-Engine-to-K8s-Migration: Migrate persondata from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319962 (dcaro) This might be interesting for you too {T341919}
[15:30:16] !log taavi@cloudcumin1001 tools Added a new k8s worker-nfs tools-k8s-worker-nfs-22.tools.eqiad1.wikimedia.cloud to the cluster
[15:30:16] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker-nfs role in the tools cluster
[15:30:33] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_node for host tools-k8s-worker-66
[15:31:13] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host tools-k8s-worker-66
[15:31:21] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_node for
a worker-nfs role in the tools cluster
[15:41:37] (ProbeDown) firing: (2) Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:41:41] !log taavi@cloudcumin1001 tools Added a new k8s worker-nfs tools-k8s-worker-nfs-23.tools.eqiad1.wikimedia.cloud to the cluster
[15:41:41] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker-nfs role in the tools cluster
[15:41:45] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_node for host tools-k8s-worker-67
[15:42:27] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) for host tools-k8s-worker-67
[15:47:17] Cloud-VPS, cloud-services-team: Recurring designate record leaks - https://phabricator.wikimedia.org/T356516 (Andrew) Open→Resolved I think this is a past bug, rather than a present bug. Many older designate records don't have an associated managed_resource_id which is what designate-sink uses to...
[15:49:38] Toolforge (Toolforge iteration 05), Patch-For-Review: Support probes in kubernetes webservices - https://phabricator.wikimedia.org/T341919 (CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/24 k8s: allow passing the http probe path
[16:01:35] Toolforge (Toolforge iteration 05), User-aborrero: [toolforge API] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745 (dcaro) >>! In T354745#9537865, @aborrero wrote: >>>! In T354745#9537843, @dcaro wrote: >> That w...
[16:01:42] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1001.eqiad.wmnet}'
[16:01:43] !log taavi@runko admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1001.eqiad.wmnet}'
[16:01:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:01:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:02:04] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'P:openstack::eqiad1::nova::compute::service'
[16:02:04] !log taavi@runko admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'P:openstack::eqiad1::nova::compute::service'
[16:02:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:02:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:02:28] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'P{P:openstack::eqiad1::nova::compute::service}'
[16:02:29] !log taavi@runko admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'P{P:openstack::eqiad1::nova::compute::service}'
[16:02:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:02:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:03:18] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[16:03:19] !log taavi@runko admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[16:04:01] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[16:04:02]
!log taavi@runko admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[16:04:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:04:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:04:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:04:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:06:30] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[16:06:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:06:54] !log taavi@runko admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[16:06:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:22:52] Grid-Engine-to-K8s-Migration: Migrate ato from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319580 (Tacsipacsi) @dcaro I’m not a maintainer of this tool, but hasn’t it long been migrated away from the grid? https://grid-deprecation.toolforge.org/t/ato lists no grid jobs,...
[16:28:41] Tools, Community-Tech (2015-2017), I18n: Add i18n support to Copyvio Detector [AOI] - https://phabricator.wikimedia.org/T110124 (MusikAnimal) CopyPatrol has long had i18n support. This task is about the Copyvios tool.
[16:33:13] Toolforge Jobs framework, User-aborrero: toolforge jobs current image alias - https://phabricator.wikimedia.org/T357388 (Andrew) We discussed this at length during our toolforge council meeting. We considered two options, neither of which is very popular. * Add a default image selection for when --image...
[16:44:22] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[16:46:59] wikitech.wikimedia.org, Unstewarded-production-error, Wikimedia-production-error: UrlShortener throws DBConnectionError exception on wikitech - https://phabricator.wikimedia.org/T341470 (Krinkle)
[16:49:22] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[17:07:33] !log taavi@runko admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[17:07:36] !log taavi@runko admin END (ERROR) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=97) on hosts matched by 'P{O:wmcs::openstack::eqiad1::virt_ceph}'
[17:07:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[17:07:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[17:11:08] (CR) Arturo Borrero Gonzalez: [C: +1] "LGTM."
[cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002978 (owner: Majavah)
[17:12:55] (CR) Majavah: [C: +2] openstack: cloudcontrol: do not try to run network tests [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002978 (owner: Majavah)
[17:15:52] (PS1) Majavah: openstack: cloudvirt: add support for batch reboots [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1003047
[17:17:43] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:19:21] (CR) CI reject: [V: -1] openstack: cloudvirt: add support for batch reboots [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1003047 (owner: Majavah)
[17:20:57] (PS2) Majavah: openstack: cloudvirt: add support for batch reboots [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1003047
[17:24:23] (CR) CI reject: [V: -1] openstack: cloudvirt: add support for batch reboots [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1003047 (owner: Majavah)
[17:24:51] Grid-Engine-to-K8s-Migration: Migrate ato from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319580 (dcaro) >>! In T319580#9538647, @Tacsipacsi wrote: > @dcaro I’m not a maintainer of this tool, but hasn’t it long been migrated away from the grid? https://grid-deprecation....
[17:27:09] Grid-Engine-to-K8s-Migration, urbanecmbot, User-Urbanecm: Migrate urbanecmbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320108 (Urbanecm) Thanks @dcaro! I was figuring out how to make this work in k8s instead, and it was just a typo. The question still st...
[17:29:45] (Merged) jenkins-bot: openstack: cloudcontrol: do not try to run network tests [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1002978 (owner: Majavah)
[17:37:07] (PS3) Majavah: openstack: cloudvirt: add support for batch reboots [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1003047
[17:39:53] Toolforge (Toolforge iteration 05), Patch-For-Review: Support probes in kubernetes webservices - https://phabricator.wikimedia.org/T341919 (bd808) >>! In T341919#9017404, @LucasWerkmeister wrote: > standardize on a common default path (e.g. `/health` or `/healthz`), which would then also be filtered out...
[17:52:49] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on cloudcumin1001:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[17:56:02] Grid-Engine-to-K8s-Migration: Migrate bawolff from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319584 (Bawolff) I didn't see this task until now. However i think i have done this.
[17:57:27] (PS1) Ketulucas: Fix application instructions [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003060 (https://phabricator.wikimedia.org/T123434)
[17:58:31] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Epic: Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672 (Ladsgroup) >>! In T355672#9535008, @MusikAnimal wrote: > I didn't elaborate on IP ranges, but doing that...
[17:59:53] (CR) Ketulucas: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003060 (https://phabricator.wikimedia.org/T123434) (owner: Ketulucas)
[18:02:00] (NovafullstackSustainedFailures) firing: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[18:22:49] (PuppetConstantChange) resolved: Puppet performing a change on every puppet run on cloudcumin1001:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[18:41:37] (ProbeDown) firing: (2) Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:50:26] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Epic: Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672 (Tchanders) >>! In T355672#9535008, @MusikAnimal wrote: > I didn't elaborate on IP ranges, but doing that...
[18:56:12] Toolforge Build Service: Build service: Calling nontrivial Procfile commands with arguments results in confusing error (“no such file or directory”) - https://phabricator.wikimedia.org/T356016 (LucasWerkmeister) (Note: At the moment T320140 isn’t actually blocked on this, as I’m using Kubernetes directly. If...
[19:00:22] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[19:02:18] (PS1) Lewis Cawte: Set defaultbranch for git review [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003081
[19:05:22] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[19:08:17] (PS1) Amire80: Fix spaces in some messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003085 (https://phabricator.wikimedia.org/T357422)
[19:09:16] (PS1) Amire80: Define main as the default branch [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003086
[19:15:20] (CR) Eugene233: [C: +1] Define main as the default branch [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003086 (owner: Amire80)
[19:15:35] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Epic: Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672 (Ladsgroup) >>! In T355672#9539383, @Tchanders wrote: >>>! In T355672#9535008, @MusikAnimal wrote: >> I di...
[19:19:10] Cloud-VPS, Wikimedia-production-error: labtestwikitech down - Wikimedia\Rdbms\DBConnectionError: Cannot access the database: Connection refused (clouddb2002-dev) - https://phabricator.wikimedia.org/T357459 (brennen)
[19:21:11] Cloud-VPS, cloud-services-team, Wikimedia-production-error: labtestwikitech down - Wikimedia\Rdbms\DBConnectionError: Cannot access the database: Connection refused (clouddb2002-dev) - https://phabricator.wikimedia.org/T357459 (taavi) Open→Resolved a:taavi cloudweb2002-dev was rebooted f...
[19:21:55] Cloud-VPS, cloud-services-team, Wikimedia-production-error: labtestwikitech down - Wikimedia\Rdbms\DBConnectionError: Cannot access the database: Connection refused (clouddb2002-dev) - https://phabricator.wikimedia.org/T357459 (brennen) Thanks!
[19:22:43] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:27:43] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:32:01] (NovafullstackSustainedFailures) resolved: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[19:42:02] Cloud-VPS, cloud-services-team, User-aborrero: Some VPS instances still using ns-recursor0 - https://phabricator.wikimedia.org/T346426 (Andrew) I noticed this morning that this broke new VMs based on images built before the new resolver IP was added. To fix, I rebuilt and installed a new Bullseye bas...
[20:03:23] VPS-project-Wikistats, collaboration-services, User-RhinosF1: Add 'wikitide' to wikistats - https://phabricator.wikimedia.org/T349660 (Dzahn) @Reception123 Given that htps://wikitide.org redirects to htps://wikitide.org nowadays I think we can close this as invalid or so. Ok with you?
[20:04:56] VPS-project-Wikistats, collaboration-services, User-RhinosF1: Add 'wikitide' to wikistats - https://phabricator.wikimedia.org/T349660 (Reception123) Stalled→Invalid Miraheze has merged with WikiTide and is now ran by WikiTide Foundation so this is indeed no longer necessary. Sorry for not upd...
[20:19:15] (CR) Eugene233: [C: +2] Define main as the default branch [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003086 (owner: Amire80)
[20:19:42] (Merged) jenkins-bot: Define main as the default branch [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003086 (owner: Amire80)
[20:20:46] VPS-project-Wikistats, collaboration-services, User-RhinosF1: Add 'wikitide' to wikistats - https://phabricator.wikimedia.org/T349660 (Dzahn) Alright, thanks!
[20:24:04] (CR) Eugene233: [C: +2] Fix spaces in some messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003085 (https://phabricator.wikimedia.org/T357422) (owner: Amire80)
[20:24:30] (Merged) jenkins-bot: Fix spaces in some messages [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003085 (https://phabricator.wikimedia.org/T357422) (owner: Amire80)
[21:15:22] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[21:20:22] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[21:41:37] (ProbeDown) firing: (2) Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All -
https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[21:45:24] Wikibugs: host "tools-sgebastion-07.tools.eqiad.wmflabs" is not an admin host - https://phabricator.wikimedia.org/T262268 (bd808) Has this been taken care of as a side effect of the migration to Kubernetes or is there still something to fix here?
[21:52:22] Wikibugs: Frequent exception while trying to extract anchors from task - https://phabricator.wikimedia.org/T199007 (bd808)
[21:52:24] Wikibugs: wikibugs.wb2-phab: Could not retrieve anchor - https://phabricator.wikimedia.org/T242261 (bd808)
[21:52:41] Wikibugs: Frequent exception while trying to extract anchors from task - https://phabricator.wikimedia.org/T199007 (bd808) p:Triage→Medium
[22:00:28] (PuppetCertificateAboutToExpire) firing: Puppet CA certificate Puppet CA: paws-puppetmaster-01.paws.eqiad.wmflabs is about to expire in 27d 23h 58m 23s - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire
[22:47:06] Toolforge Jobs framework, User-aborrero: toolforge jobs current image alias - https://phabricator.wikimedia.org/T357388 (tstarling) >>! In T357388#9538726, @Andrew wrote: > The value of this depends on how much we believe that an automatic platform upgrade will or won't work. Our general suspicion is tha...
[22:51:18] Toolforge Jobs framework, User-aborrero: toolforge jobs current image aliases - https://phabricator.wikimedia.org/T357388 (tstarling)
[22:54:48] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[22:59:12] Wikibugs: Frequent exception while trying to extract anchors from task - https://phabricator.wikimedia.org/T199007 (bd808) The `get_anchors_for_task` method that is raising the error is: `lang=python def get_anchors_for_task(self, task_page): """ :param url: url to task :type url:...
[23:42:58] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883 (MBH) @dcaro could you add "mono-tf68" as possible image for webservice, as I requested in [[https://phabricator.wikimedia.org/T319883#9438992|this comment]]? Or what should...
[23:53:09] Grid-Engine-to-K8s-Migration: Migrate jembot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319828 (-jem-) Open→Resolved I have completed the migration by using external servers when needed, and I'm already running Kubernetes in my webservice, although it seems...