[00:10:19] (03PS1) 10Andrew Bogott: Huge update of .po strings to pick up a single html text change [openstack/horizon/horizon] (2023.1) - 10https://gerrit.wikimedia.org/r/982494 [00:14:34] (03CR) 10Andrew Bogott: [V: 03+2 C: 03+2] Huge update of .po strings to pick up a single html text change [openstack/horizon/horizon] (2023.1) - 10https://gerrit.wikimedia.org/r/982494 (owner: 10Andrew Bogott) [00:16:59] 10Grid-Engine-to-K8s-Migration: Migrate bub from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319610 (10Soda) >>! In T319610#9400673, @komla wrote: > @Soda it shows it's running. If it is no longer in use, kindly disable it and mark this as closed. I don't have access to the... [00:51:08] 10superset.wmcloud.org: sql backup to rotate after successful backup - https://phabricator.wikimedia.org/T352766 (10github-toolforge-bot) vivian-rook opened https://github.com/toolforge/superset-deploy/pull/13 [00:51:17] vivian-rook opened https://github.com/toolforge/superset-deploy/pull/13 [00:51:54] 10superset.wmcloud.org: sql backup to rotate after successful backup - https://phabricator.wikimedia.org/T352766 (10github-toolforge-bot) vivian-rook closed https://github.com/toolforge/superset-deploy/pull/13 [00:52:03] 10superset.wmcloud.org: sql backup to rotate after successful backup - https://phabricator.wikimedia.org/T352766 (10rook) 05Open→03Resolved [00:52:03] vivian-rook closed https://github.com/toolforge/superset-deploy/pull/13 [01:19:41] 10Grid-Engine-to-K8s-Migration: Migrate dplbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319701 (10russblau) >>! In T319701#9400759, @komla wrote: > See the usage of the pywikibot image [[ https://wikitech.wikimedia.org/wiki/Help:Toolforge/Running_Pywikibot_scripts |... 
[01:30:41] 10Grid-Engine-to-K8s-Migration: Migrate fastilybot-reports from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319741 (10Fastily) @komla @nskaggs Going to need additional time to rewrite my tool. Could I please get an extension on the shutdown? Thanks. [02:25:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [02:25:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [02:43:56] (ToolsGridQueueProblem) firing: Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem [03:47:55] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-Raymond_Ndibe: [gitlab,toolforge-deploy] Create a process to open an MR to toolforge-deploy when a new release ofa component happens - https://phabricator.wikimedia.org/T347392 (10CodeReviewBot) raymond-ndibe merged https://gitlab.wikimedia.org... [03:59:44] 10Grid-Engine-to-K8s-Migration: Migrate bub from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319610 (10komla) >>! In T319610#9401717, @Soda wrote: >>>! In T319610#9400673, @komla wrote: >> @Soda it shows it's running. If it is no longer in use, kindly disable it and mark thi... 
[05:25:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [05:25:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [05:43:56] (ToolsGridQueueProblem) firing: Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem [05:56:27] (03PS1) 10Andrew Bogott: Another attempt at reformatting keypair.html [openstack/horizon/horizon] (2023.1) - 10https://gerrit.wikimedia.org/r/982512 [06:34:49] (03CR) 10Andrew Bogott: [V: 03+2 C: 03+2] Another attempt at reformatting keypair.html [openstack/horizon/horizon] (2023.1) - 10https://gerrit.wikimedia.org/r/982512 (owner: 10Andrew Bogott) [07:27:50] 10Data-Services, 10Quarry, 10cloud-services-team (FY2023/2024-Q1-Q2): Create db user for Quarry with readonly access to public ToolsDB databases - https://phabricator.wikimedia.org/T348407 (10SD0001) Hi @fnegri, did you get a chance to get to this? Thanks! 
[08:25:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [08:25:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [08:43:56] (ToolsGridQueueProblem) firing: Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem [08:48:04] 10Grid-Engine-to-K8s-Migration: Migrate isa from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319818 (10Sebastian_Berlin-WMSE) @komla, just to clarify, does this mean you've determined that there's no longer anything running on GridEngine for the tool? Since your last post ab... [09:11:34] 10Grid-Engine-to-K8s-Migration: Migrate bub from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319610 (10Soda) >>! In T319610#9401869, @komla wrote: >>>! In T319610#9401717, @Soda wrote: >>>>! In T319610#9400673, @komla wrote: >>> @Soda it shows it's running. If it is no longe... [09:16:46] 10Grid-Engine-to-K8s-Migration: Migrate botwikiawk from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319607 (10dcaro) >>! In T319607#9394449, @Green_Cardamom wrote: > Hi, my tools here are awk and tcsh scripts mostly, that further invoke other unix tools like sort, uniq etc..... 
[09:34:20] (ProbeDown) firing: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:39:20] (ProbeDown) resolved: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:42:20] (ProbeDown) firing: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:47:09] !log toolsbeta dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component envvars-api [09:47:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [09:47:20] (ProbeDown) resolved: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:47:40] !log toolsbeta dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component envvars-api [09:47:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [09:49:14] !log tools dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component 
envvars-api [09:49:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [09:49:46] !log tools dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component envvars-api [09:49:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [09:49:59] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-Raymond_Ndibe: [apis] nginx fails to reload on config change - https://phabricator.wikimedia.org/T350928 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/152 envvars-api: bump... [09:52:20] (ProbeDown) firing: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:57:20] (ProbeDown) resolved: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:00:20] (ProbeDown) firing: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:05:20] (ProbeDown) resolved: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - 
https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:13:26] (ProbeDown) firing: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:18:26] (ProbeDown) firing: (2) Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:23:29] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10CodeReviewBot) dcaro merge... [10:28:26] (ProbeDown) resolved: Service toolserver-proxy-01:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolserver-proxy-01:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:37:04] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10CodeReviewBot) project_131... 
[10:40:57] 10Tool-ldap: Display group IDs - https://phabricator.wikimedia.org/T353311 (10taavi) [10:43:55] !log toolsbeta dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-admission (T338142) [10:43:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [10:44:00] T338142: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 [10:44:23] !log toolsbeta dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-admission (T338142) [10:44:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [10:45:10] 10Toolforge, 10Documentation: Document node.js in toolforge - https://phabricator.wikimedia.org/T188397 (10fnegri) 05Open→03Resolved a:03fnegri Documentation now exists at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Node.js [10:45:39] 10Toolforge Build Service: [tbs] Explore adding caching support - https://phabricator.wikimedia.org/T350689 (10taavi) [10:46:31] 10Toolforge Build Service, 10Cloud-Services-Origin-Team, 10Cloud-Services-Worktype-Project, 10User-dcaro: [tbs]Add storage capabilities for buildpack services - https://phabricator.wikimedia.org/T293670 (10taavi) [10:48:01] !log tools dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-admission (T338142) [10:48:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [10:48:12] 10Toolforge, 10cloud-services-team, 10Patch-Needs-Improvement, 10Python3-Porting: Upgrade various Toolforge infrastructure scripts from Python 2 to Python 3 - https://phabricator.wikimedia.org/T218427 (10taavi) [10:48:20] 10Toolforge, 10cloud-services-team, 10Patch-For-Review: Toolforge: Decomission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664 (10taavi) [10:48:31] 10Toolforge 
(Toolforge iteration 02): [tbs] Create a tutorial on how to deploy a Node.js app using Build Service - https://phabricator.wikimedia.org/T353313 (10fnegri) [10:48:33] !log tools dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-admission (T338142) [10:48:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [10:48:49] 10Toolforge, 10cloud-services-team, 10Patch-Needs-Improvement, 10Python3-Porting: Upgrade various Toolforge infrastructure scripts from Python 2 to Python 3 - https://phabricator.wikimedia.org/T218427 (10taavi) 05Open→03Resolved a:03taavi `modules/profile/files/toolforge/proxylistener.py` is the only... [10:57:06] 10Toolforge (Toolforge iteration 02): [tbs] Create a tutorial on how to deploy a Node.js app using Build Service - https://phabricator.wikimedia.org/T353313 (10dcaro) [10:58:08] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10CodeReviewBot) dcaro merge... [11:02:57] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolf... [11:03:02] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10CodeReviewBot) dcaro merge... 
[11:04:14] 10Grid-Engine-to-K8s-Migration: Migrate stockholm-mania from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320063 (10Lokal_Profil) 05Open→03Resolved Migrated [11:05:14] 10Grid-Engine-to-K8s-Migration: Migrate videoconvert from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320126 (10Lokal_Profil) 05Open→03Resolved Migrated Backend is still broken, but leaving it up for now in case someone else wants to adopt the tool [11:16:38] 10VPS-Projects, 10WMDE-TechWish-Sprint-2023-11-22: Scraper: destroy Cloud VPS runner instance - https://phabricator.wikimedia.org/T345411 (10thiemowmde) 05Open→03Resolved [11:17:00] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_grid_node for tools-sgeweblight-10-16 [11:17:27] 10Toolforge, 10cloud-services-team: Tiny swap on many grid nodes - https://phabricator.wikimedia.org/T309902 (10taavi) 05Open→03Declined [11:17:36] 10Toolforge, 10cloud-services-team (Kanban), 10Patch-For-Review: Toolforge: add Debian Buster to the grid and eliminate Debian Stretch - https://phabricator.wikimedia.org/T277653 (10taavi) [11:24:03] (InstanceDown) firing: Project tools instance tools-sgeweblight-10-16 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [11:25:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [11:29:03] (InstanceDown) resolved: Project tools instance tools-sgeweblight-10-16 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [11:30:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - 
https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [11:43:56] (ToolsGridQueueProblem) firing: Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem [11:44:21] 10Grid-Engine-to-K8s-Migration: Migrate bub from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319610 (10taavi) 05Open→03Resolved a:03taavi [11:44:28] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.grid.cleanup_queue_errors [11:44:30] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.grid.cleanup_queue_errors (exit_code=0) [11:48:56] (ToolsGridQueueProblem) resolved: Grid queue webgrid-lighttpd@tools-sgeweblight-10-21.tools.eqiad1.wikimedia.cloud is in state E - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsGridQueueProblem [12:18:58] 10Toolforge (Toolforge iteration 02), 10cloud-services-team (FY2023/2024-Q1-Q2): [tbs] Create a tutorial on how to deploy a Node.js app using Build Service - https://phabricator.wikimedia.org/T353313 (10fnegri) [12:19:19] 10Toolforge (Toolforge iteration 02), 10cloud-services-team (FY2023/2024-Q1-Q2): [tbs] Create a tutorial on how to deploy a Node.js app using Build Service - https://phabricator.wikimedia.org/T353313 (10fnegri) p:05Triage→03Medium [12:20:17] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened ht... 
[12:21:03] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10CodeReviewBot) project_131... [12:21:45] 10Data-Services, 10cloud-services-team (FY2023/2024-Q1-Q2): [toolsdb] Migrate mixnmatch db to Trove - https://phabricator.wikimedia.org/T350862 (10fnegri) [12:24:34] 10Toolforge, 10cloud-services-team (FY2023/2024-Q1-Q2), 10Documentation: Toolforge admin docs: revise new navigation menu and add category labels - https://phabricator.wikimedia.org/T345109 (10fnegri) [12:25:10] 10Data-Services, 10cloud-services-team (FY2023/2024-Q1-Q2): [toolsdb] Migrate mixnmatch db to Trove - https://phabricator.wikimedia.org/T350862 (10fnegri) p:05Triage→03Medium [12:31:59] 10Cloud-VPS, 10cloud-services-team, 10Cumin, 10Infrastructure-Foundations, 10Patch-For-Review: [cumin] [openstack] Openstack backend fails when project is not set - https://phabricator.wikimedia.org/T346453 (10fnegri) a:05fnegri→03Volans [12:39:35] 10Data-Services, 10Quarry, 10cloud-services-team (FY2023/2024-Q1-Q2): Create db user for Quarry with readonly access to public ToolsDB databases - https://phabricator.wikimedia.org/T348407 (10fnegri) @SD0001 not yet, sorry, too many other things! It's in my to-do list though! [13:15:16] 10Grid-Engine-to-K8s-Migration: Migrate nn1l2bot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319926 (10taavi) 05Open→03Declined disabled tool per https://wikitech.wikimedia.org/w/index.php?title=Special:Log&logid=958033 [13:20:13] 10Grid-Engine-to-K8s-Migration: Migrate xslack from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320192 (10taavi) 05Open→03Declined Disabled tool per https://wikitech.wikimedia.org/w/index.php?title=Special:Log&logid=927443. 
[13:26:29] !log toolsbeta dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api (T338142) [13:26:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [13:26:34] T338142: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 [13:27:01] !log toolsbeta dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api (T338142) [13:27:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [13:27:34] (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] "Looks like the space is indeed not relevant: https://gerrit.wikimedia.org/g/labs/tools/wikinity/+/master/src/templates/admin/index.html" [labs/tools/wikinity] - 10https://gerrit.wikimedia.org/r/982233 (https://phabricator.wikimedia.org/T310688) (owner: 10Nikerabbit) [13:31:27] !log tools dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api (T338142) [13:31:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [13:31:34] T338142: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 [13:32:01] !log tools dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api (T338142) [13:32:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [13:32:52] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolf... 
[13:33:07] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10CodeReviewBot) dcaro merge... [13:33:51] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolf... [13:37:22] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolf... [13:39:19] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [13:44:18] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolf... 
[13:44:19] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [13:56:28] 10Cloud Services Proposals, 10Toolforge Build Service, 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [Epic] Make Toolforge a proper platform as a service with push-to-deploy and build packs - https://phabricator.wikimedia.org/T194332 (10dcaro) [13:57:05] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10dcaro) 05In progress→03... [13:59:17] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolf... [14:01:45] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10CodeReviewBot) project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened ht... 
[14:11:00] 10Cloud Services Proposals, 10Toolforge (Toolforge iteration 02), 10cloud-services-team, 10Cloud-Services-Origin-Team, and 3 others: [toolforge-envvars.api,toolforge-build.api] Support using custom environment variables at build time - https://phabricator.wikimedia.org/T338142 (10dcaro) [14:12:02] !log toolsbeta dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder (T352774) [14:12:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [14:12:07] T352774: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 [14:12:36] !log toolsbeta dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder (T352774) [14:12:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [14:12:45] !log toolsbeta dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api (T352774) [14:12:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [14:13:14] !log toolsbeta dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api (T352774) [14:13:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [14:22:02] !log tools dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api (T352774) [14:22:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [14:22:07] T352774: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 [14:22:33] !log tools dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api (T352774) [14:22:36] 
Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [14:22:40] !log tools dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder (T352774) [14:22:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [14:23:13] !log tools dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder (T352774) [14:23:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [14:25:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [14:27:51] 10Toolforge (Software install/update): Create a kubernetes container with mono and dotnet - https://phabricator.wikimedia.org/T311466 (10dcaro) [14:29:20] 10Toolforge (Toolforge iteration 02), 10Patch-For-Review, 10User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (10dcaro) 05In progress→03Resolved [14:30:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [14:32:53] 10Grid-Engine-to-K8s-Migration: Migrate isa from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319818 (10komla) >>! In T319818#9402098, @Sebastian_Berlin-WMSE wrote: > @komla, just to clarify, does this mean you've determined that there's no longer anything running on GridEngi... 
[14:33:31] 10Grid-Engine-to-K8s-Migration: Migrate isa from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319818 (10Sebastian_Berlin-WMSE) Great, thanks! [14:45:22] 10Cloud-VPS, 10SRE, 10observability, 10Patch-For-Review, and 2 others: ossl rsyslog errors post-migration - https://phabricator.wikimedia.org/T351710 (10fgiunchedi) >>! In T351710#9390698, @fgiunchedi wrote: >>>! In T351710#9385748, @Stashbot wrote: >> {nav icon=file, name=Mentioned in SAL (#wikimedia-oper... [15:04:16] 10Cloud-VPS, 10cloud-services-team (FY2023/2024-Q1-Q2), 10Goal, 10Patch-For-Review: Support 'unmanaged' projects in cloud-vps - https://phabricator.wikimedia.org/T326818 (10Andrew) [x] bring-your-own base image This is semi-implemented. - Users can be granted the 'glanceadmin' role on a project and then u... [15:18:55] 10Toolforge, 10Patch-Needs-Improvement: Introduce static HTML webservice type on Toolforge - https://phabricator.wikimedia.org/T241817 (10dcaro) There's now support to run a static website using the build service on toolforge (ex. https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Buildpack_static_too... [15:20:21] 10Toolforge, 10Patch-Needs-Improvement: Introduce static HTML webservice type on Toolforge - https://phabricator.wikimedia.org/T241817 (10dcaro) 05Open→03Resolved a:03dcaro Please reopen if you have a usecase forcing you to not have your code in a git repository or use the build service in any other way. [15:27:22] 10Toolforge (Software install/update): Create a kubernetes container with mono and dotnet - https://phabricator.wikimedia.org/T311466 (10dcaro) I have deployed both, the dotnet buildpack and the support for build environment variables. So when you are able to get repositories up and running, feel free to test it... 
[15:29:42] Toolforge Build Service: Add Rust buildpack to Toolforge build service - https://phabricator.wikimedia.org/T337066 (dcaro) Sorry for the delay, lots of things happened xd Now we have a working way to inject specific buildpacks into the pipeline, so this should be relatively easy to get in. One issue that... [15:31:00] Cloud-VPS, cloud-services-team, Goal: Hide + disable 'key pair' tab when creating puppetized VMs - https://phabricator.wikimedia.org/T353331 (Andrew) [15:31:37] (CephSlowOps) firing: Ceph cluster in eqiad has 14 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps [15:31:42] cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T352570 (phaultfinder) [15:31:48] Cloud-VPS, cloud-services-team, Goal: Hide VM puppet tab for unpuppetized VMs - https://phabricator.wikimedia.org/T353332 (Andrew) [15:36:37] (CephClusterInWarning) firing: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [15:41:37] (CephClusterInWarning) resolved: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [15:41:37] (CephSlowOps) resolved: Ceph cluster in eqiad has 18 slow ops - 
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps [16:14:28] PROBLEM - toolschecker: check mtime mod from tools cron job on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/cron - 177 bytes in 0.008 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:18:10] RECOVERY - toolschecker: check mtime mod from tools cron job on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.138 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:22:16] Toolforge (Toolforge iteration 02), Patch-For-Review, User-dcaro: [builds-builder] Investigate how to enable mono/dotnet/c# and implement the best one to unblock us to migrate tools - https://phabricator.wikimedia.org/T352774 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolf... [16:23:43] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.scale_grid_exec [16:23:49] !log taavi@cloudcumin1001 toolsbeta END (ERROR) - Cookbook wmcs.toolforge.scale_grid_exec (exit_code=97) [16:23:57] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.scale_grid_exec [16:25:32] Striker, GitLab (Integrations), User-bd808: GitLab users with only provider=cas3 identies are not found when Striker attempts to create GitLab repostories - https://phabricator.wikimedia.org/T353176 (bd808) >>! In T353176#9401632, @Hawkeye7 wrote: > @bd808 I can confirm that it is working now. Thanks... 
[16:37:03] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance tools-sgeexec-10-23 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [16:38:59] !log taavi@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.toolforge.scale_grid_exec (exit_code=99) [16:42:03] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance tools-sgeexec-10-23 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [16:49:26] cloud-services-team, affects-Kiwix-and-openZIM: Read-only access to Wikimedia mirror of Kiwix data in dumps.wikimedia.org/kiwix/ - https://phabricator.wikimedia.org/T348226 (Benoit74) Up, anyone could at least triage this issue, please? [16:51:14] PROBLEM - toolschecker: expect a long running job on buster on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 504 Gateway Time-out - string OK not found on http://checker.tools.wmflabs.org:80/grid/continuous/buster - 340 bytes in 60.019 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:51:14] PROBLEM - toolschecker: check mtime mod from tools cron job on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/cron - 177 bytes in 0.052 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:51:46] PROBLEM - toolschecker: start a job and verify on buster on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 504 Gateway Time-out - string OK not found on http://checker.tools.wmflabs.org:80/grid/start/buster - 340 bytes in 60.023 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:57:06] Toolforge, cloud-services-team: Provide tools for disabling the grid for specific tools - https://phabricator.wikimedia.org/T353351 (Andrew) 
[17:11:26] Toolforge, cloud-services-team: read error closing "tools-sgegrid-master.tools.eqiad1.wikimedia.cloud/qmaster/1" - https://phabricator.wikimedia.org/T353352 (JJMC89) [17:21:54] tool-wscontest, good first task: Add contestant number (order) for WSContest contest page - https://phabricator.wikimedia.org/T331507 (PMenon-WMF) a:AFZL210→None Hi @AFZL210, removing you from this task because it has been stale for quite a while. Feel free to re-assign it to yourself if you are... [17:22:24] tool-wscontest, good first task: Add UTC in the WSContest contest page - https://phabricator.wikimedia.org/T331225 (PMenon-WMF) a:AFZL210→None Hi @AFZL210, removing you from this task since it has been stale for quite a while. Feel free to re-assign it to yourself if you are still working on it! [17:23:42] tool-wscontest, User-Frostly, good first task: Make all interface messages translatable - https://phabricator.wikimedia.org/T346994 (PMenon-WMF) a:Frostly→None Hi @Frostly, removing you from this task because it has been stale for a while. Feel free to re-assign it to yourself and give us an... 
[17:24:03] (PuppetAgentFailure) firing: Puppet agent failure detected on instance tools-sgeexec-10-23 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [17:30:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [17:30:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [17:43:07] cloud-services-team, Release-Engineering-Team (Priority Backlog 📥): Experiment with WMCS as a k8s provider for gitlab-cloud-runner cluster - https://phabricator.wikimedia.org/T353356 (dduvall) [17:54:34] Striker, Patch-Needs-Improvement: Set code repository URI when creating project tags - https://phabricator.wikimedia.org/T320915 (Aklapper) [17:54:41] Grid-Engine-to-K8s-Migration: Migrate dexbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319674 (Ladsgroup) nah, I just need to do some work. Some of them are non-trivial (e.g. the web req triggering a job in grid engine) I'm planning to do some work on it during th... 
[18:08:39] Cloud-VPS (Quota-requests): Please delete meet and chat VPS projects - https://phabricator.wikimedia.org/T352727 (Ladsgroup) Open→Resolved {{done}} [18:08:48] Cloud-VPS, cloud-services-team: UDP traffic throughput to instances in the "meet" Cloud VPS project not meeting expectations - https://phabricator.wikimedia.org/T268393 (Ladsgroup) Open→Declined Wikimedia Meet has been retired [18:08:59] Toolforge, cloud-services-team: read error closing "tools-sgegrid-master.tools.eqiad1.wikimedia.cloud/qmaster/1" - https://phabricator.wikimedia.org/T353352 (taavi) I originally thought this was the grid getting confused by a new node, but it seems to be something hammering NFS instead. [18:39:31] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for all workers [18:41:45] (ProbeDown) firing: (4) Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [18:44:00] RECOVERY - toolschecker: check mtime mod from tools cron job on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.138 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [18:44:03] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance tools-sgeexec-10-23 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [18:45:14] RECOVERY - toolschecker: expect a long running job on buster on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.411 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [18:45:52] RECOVERY - toolschecker: start a job and verify on buster on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 1.016 second response time 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [18:46:45] (ProbeDown) resolved: (4) Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [18:47:03] (InstanceDown) firing: Project tools instance tools-k8s-worker-99 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [18:52:03] (InstanceDown) resolved: Project tools instance tools-k8s-worker-99 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [19:07:03] (WidespreadPuppetAgentFailure) firing: Widespread puppet agent failures in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [19:12:03] (WidespreadPuppetAgentFailure) resolved: Widespread puppet agent failures in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure [19:39:45] (ProbeDown) firing: Service tools-k8s-haproxy-4:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-4:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [19:44:45] (ProbeDown) resolved: Service tools-k8s-haproxy-4:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-4:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [19:50:03] (PuppetAgentFailure) firing: Puppet agent failure detected on instance tools-sgeweblight-10-25 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [19:53:55] !log taavi@cloudcumin1001 tools END 
(PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for all workers [19:57:16] Toolforge, cloud-services-team, Community-Tech, CopyPatrol: read error closing "tools-sgegrid-master.tools.eqiad1.wikimedia.cloud/qmaster/1" - https://phabricator.wikimedia.org/T353352 (Don-vip) I got the same errors with spacemedia tool between 17:02 and 18:42 UTC. [19:59:06] Toolforge, cloud-services-team, Community-Tech, CopyPatrol: read error closing "tools-sgegrid-master.tools.eqiad1.wikimedia.cloud/qmaster/1" - https://phabricator.wikimedia.org/T353352 (taavi) Open→Resolved a:taavi https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikim... [20:10:03] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance tools-sgeweblight-10-25 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [20:18:50] Grid-Engine-to-K8s-Migration: Migrate sqid from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320056 (Mmarx) Open→In progress p:Triage→High [20:30:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [20:30:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [20:34:03] (PuppetAgentFailure) firing: Puppet agent failure detected on instance tools-sgeweblight-10-24 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [21:10:37] Grid-Engine-to-K8s-Migration, Pywikibot: Migrate pywikibot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319981 
(JJMC89) The nightly cronjobs need to be migrated from Grid Engine to the jobs framework. [21:24:55] Grid-Engine-to-K8s-Migration: Migrate redirtalkdeleter from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319998 (Andriy.v) I do not know if you have the access to my Toolforge tools, but first of all I want to show my actual situation. I have a "pywikibot-core" directory... [21:29:10] Grid-Engine-to-K8s-Migration: Migrate jembot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319828 (-jem-) I'm testing the migration now, some things need fixing but I think I can handle it; let's hope things keep going well and I'll report again (and maybe close the t... [21:29:48] Grid-Engine-to-K8s-Migration: Migrate jembot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319828 (-jem-) a:-jem- [21:39:06] Grid-Engine-to-K8s-Migration: Migrate sqid from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320056 (Mmarx) In progress→Resolved Migrated the grid cronjobs to the new kubernetes jobs. [21:44:00] Grid-Engine-to-K8s-Migration: Migrate fastilybot-reports from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319741 (nskaggs) @Fastily of course. Thank you for providing an update. If you have specific questions or need help, please do reach out. Good luck and thanks for he... 
[22:15:24] (PS1) Dwisehaupt: Fix up a typo in community_civicrm::config_nonce [labs/private] - https://gerrit.wikimedia.org/r/982924 (https://phabricator.wikimedia.org/T343486) [22:54:08] (CR) Jforrester: "check experimental" [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/809582 (owner: Kosta Harlan) [22:56:37] (CR) Jforrester: "check experimental" [labs/tools/ipchanges] - https://gerrit.wikimedia.org/r/608562 (owner: Legoktm) [23:06:51] (CR) Jforrester: "check experimental" [labs/tools/nagf] - https://gerrit.wikimedia.org/r/617293 (owner: Krinkle) [23:07:09] (CR) Jforrester: "check experimental" [labs/tools/meetingtimes] - https://gerrit.wikimedia.org/r/971743 (owner: VolkerE) [23:07:23] (CR) Jforrester: "check experimental" [labs/tools/wikiinfo] - https://gerrit.wikimedia.org/r/949071 (owner: Krinkle) [23:30:04] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [23:30:04] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [23:34:03] (PuppetAgentFailure) firing: Puppet agent failure detected on instance tools-sgeweblight-10-24 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [23:39:34] (CR) Dwisehaupt: [V: +2 C: +2] "Also got the verbal ok from jgreen but he had to head off for the night." [labs/private] - https://gerrit.wikimedia.org/r/982924 (https://phabricator.wikimedia.org/T343486) (owner: Dwisehaupt)