[00:00:08] Wikibugs: Print events in closed tasks in grey - https://phabricator.wikimedia.org/T140881#9622185 (bd808) This will require always passing the current status of the task to the message formatter. Currently we only pass status if the transaction group triggering the message included a status transition. As a...
[00:12:09] Wikibugs, User-bd808: Print events in closed tasks in grey - https://phabricator.wikimedia.org/T140881#9622209 (bd808) Open→In progress a:bd808
[00:41:28] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[00:50:56] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[00:56:28] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[01:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[01:56:40] Wikibugs, User-bd808: Print events in closed tasks in grey - https://phabricator.wikimedia.org/T140881#9622283 (bd808) >>! In T140881#9622185, @bd808 wrote: > This will require always passing the current status of the task to the message formatter. Actually it will be even easier to propagate the "isClos...
[02:12:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance toolsbeta-sgegrid-shadow on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[02:28:04] Wikibugs: wikibugs test bug - https://phabricator.wikimedia.org/T1152#9622293 (bd808) Comment on a closed task for {T140881}
[02:45:36] Wikibugs, User-bd808: Show when a task has been closed as a duplicate - https://phabricator.wikimedia.org/T128868#9622300 (bd808) Open→In progress a:bd808 >>! In T128868#9618213, @bd808 wrote: > * Do both? Seems like the way to go
[03:06:28] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[03:35:25] Wikibugs: Wikibugs testing task - https://phabricator.wikimedia.org/T90594#9622320 (bd808) test
[03:35:27] Wikibugs: Wikibugs testing task - https://phabricator.wikimedia.org/T90594#9622320 (bd808) test
[03:36:34] Wikibugs: wikibugs test bug - https://phabricator.wikimedia.org/T1152#9622321 (bd808) Another comment on a closed task
[03:38:25] Wikibugs: wikibugs test bug - https://phabricator.wikimedia.org/T1152#9622322 (bd808) test
[03:40:36] Wikibugs: wikibugs test bug - https://phabricator.wikimedia.org/T1152#9622338 (bd808)
[03:40:40] Wikibugs: Wikibugs testing task - https://phabricator.wikimedia.org/T90594#9622335 (bd808)
[03:49:10] Wikibugs, Patch-For-Review, User-bd808: Hide User-XX projects from wikibugs output - https://phabricator.wikimedia.org/T180293#9622355 (CodeReviewBot) bd808 opened https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 Enhance Phorge event rendering
[03:49:32] Wikibugs, Patch-For-Review, User-bd808: Show when a task has been closed as a duplicate - https://phabricator.wikimedia.org/T128868#9622357 (CodeReviewBot) bd808 opened
https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 Enhance Phorge event rendering
[03:51:28] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[04:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[04:50:56] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:12:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance toolsbeta-sgegrid-shadow on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[06:21:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance toolsbeta-puppetdb-03 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[06:25:28] (PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance toolsbeta-puppetdb-03 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[06:55:28] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance toolsbeta-puppetdb-03 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[07:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 -
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[07:06:28] (PuppetAgentFailure) firing: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:21:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:39:58] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[07:51:49] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1009401 (owner: AgnesAbah)
[07:52:22] (CR) CI reject: [V: -1] Bug:T359300 Wrote docstrings for functions on the ISA tool on routes.py [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1009401 (owner: AgnesAbah)
[08:12:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance toolsbeta-sgegrid-shadow on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[08:27:02] Tool-itwiki: If the section "Voci correlate" becomes empty, remove that section - https://phabricator.wikimedia.org/T338084#9622610 (valerio.bozzolan) Ouch.
The example diff was deleted :( https://it.wikipedia.org/w/index.php?title=Nicola_Palmieri&diff=prev&oldid=133718774
[08:50:57] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:09:58] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[09:24:58] (PuppetAgentFailure) firing: (2) Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[09:28:30] Tools, Wikidata, Wikidata Dev Team, wmde-wikidata-tech: [SW] [GENERAL] Deprecate connecting senses prototype - https://phabricator.wikimedia.org/T351829#9622752 (Lucas_Werkmeister_WMDE) **Prio Notes:** | Impact Area | Affected | Notes | |-------------------------|---------|...
[09:28:41] Tools, Wikidata, Wikidata Dev Team, wmde-wikidata-tech: [SW] [GENERAL] Deprecate connecting senses prototype - https://phabricator.wikimedia.org/T351829#9622755 (Lucas_Werkmeister_WMDE)
[09:42:28] (PuppetAgentNoResources) firing: (2) No Puppet resources found on instance toolsbeta-sgegrid-shadow on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[09:45:07] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Design: [Design EPIC] Global User Contributions - https://phabricator.wikimedia.org/T349901#9622854 (Johannnes89) I just clicked through the design brief. It says on slide 12 (frequency of use): „GUC is use...
[09:49:58] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance toolsbeta-puppetdb-03 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[09:54:39] Grid-Engine-to-K8s-Migration, Growth-Team, Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9622885 (taavi) Ping - is there anything blocking the migration of the remaining tasks/wikis?
[09:57:06] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9622887 (dcaro) >>! In T319883#9621336, @MBH wrote: > Thanks. But I added clusters5 and test to all.sln, replaced .csproj of both projects with code from your last message,...
[10:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[10:12:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance toolsbeta-test-k8s-worker-nfs-3 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[10:52:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:57:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All -
https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:10:22] wikitech.wikimedia.org, Content-Transform-Team-WIP, DiscussionTools, Parsoid-Read-Views (Phase 1 - DiscussionTools support): Use Parsoid for DiscussionTools on wikitech - https://phabricator.wikimedia.org/T355374#9622943 (MSantos)
[11:10:38] Grid-Engine-to-K8s-Migration, Growth-Team, Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9622945 (TheresNoTime) @MusikAnimal @JJMC89 One assumes that, following the above template, we could create (eg) `enwiki.sh` with `lang=...
[11:14:57] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.prepare_upgrade for cluster toolsbeta upgrade from 1.23 to 1.24 (T359638)
[11:14:58] !log aborrero@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.k8s.prepare_upgrade (exit_code=99) for cluster toolsbeta upgrade from 1.23 to 1.24 (T359638)
[11:15:02] T359638: toolsbeta: upgrade kubernetes to 1.24 - https://phabricator.wikimedia.org/T359638
[11:27:00] Grid-Engine-to-K8s-Migration, Growth-Team, Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9623041 (TheresNoTime) >>! In T306888#9622983, @TheresNoTime wrote: > **Question for @MusikAnimal and/or @JJMC89:** I note that these job...
[11:34:27] Toolforge-standards-committee: Adoption request for eatchabot - https://phabricator.wikimedia.org/T338555#9623064 (dcaro) Cleaned up caches and secrets: ` $ sudo -i -u tools.eatchabot $ vim passwords/* # and redacted the passwords there $ find . -iname apicache-py3 -exec rm -r {} \; $ rm -r .cache $ rm -r ....
[11:34:55] Toolforge-standards-committee: Adoption request for eatchabot - https://phabricator.wikimedia.org/T338555#9623065 (dcaro) p:Triage→High a:dcaro
[11:38:30] Toolforge-standards-committee: Adoption request for eatchabot - https://phabricator.wikimedia.org/T338555#9623074 (dcaro) Open→Resolved @mdaniels5757 sorry for the delay, I have added you as tool manager for eatchabot after doing the password/sensitive information cleanup, thanks a lot for taking car...
[11:39:41] Grid-Engine-to-K8s-Migration: Migrate eatchabot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319713#9623077 (dcaro) I approved the adoption of the tool, @mdaniels5757 you should be unblocked on that side, thanks!
[11:40:44] Tools: eatchabot using a lot of NFS storage - https://phabricator.wikimedia.org/T284968#9623079 (dcaro) Just approved the adoption request, @mdaniels5757 is now the maintainer :)
[11:40:59] Toolforge (Quota-requests): Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923 (TheresNoTime) NEW
[11:42:08] Toolforge (Quota-requests): Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623092 (taavi) a:taavi
[11:42:25] Toolforge (Quota-requests), cloud-services-team: Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623093 (taavi)
[11:45:11] Tools, cloud-services-team: cewbot k8s-20230418.fix-redirected-wikilinks-of-templates.out is unreasonably large - https://phabricator.wikimedia.org/T358555#9623099 (dcaro) Open→Resolved The file is now ~58M, I think we can close this task, the tool does not seem to generate too much output (would...
[11:45:16] Toolforge, cloud-services-team: 2024-02-27: toolforge NFS cleanup - https://phabricator.wikimedia.org/T358554#9623101 (dcaro)
[11:46:42] Toolforge (Quota-requests), cloud-services-team: Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623104 (dcaro) +1
[11:50:23] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[11:50:26] Toolforge (Quota-requests), cloud-services-team, Patch-For-Review: Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623110 (CodeReviewBot) taavi merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/221 maintain-kub...
[11:50:33] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[11:51:16] Tools: 'hoiscript' tool uses an unreasonable amount of disk space - https://phabricator.wikimedia.org/T349913#9623112 (dcaro) @Hoi I can still see the 533G PDFs, can you please clean those up? It has been several months already since they were supposed to have been processed. Thanks! ` root@tools-nfs-2:/srv...
[11:52:06] PAWS: remove unused image_name var - https://phabricator.wikimedia.org/T359924 (rook) NEW
[11:52:09] Toolforge, cloud-services-team: 2024-02-27: toolforge NFS cleanup - https://phabricator.wikimedia.org/T358554#9623113 (dcaro) Open→In progress p:Triage→Medium
[11:52:11] Toolforge (Quota-requests), cloud-services-team, Patch-For-Review: Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623108 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/221 maintain-kub...
[11:53:30] vivian-rook opened https://github.com/toolforge/paws/pull/389
[11:55:04] Toolforge (Quota-requests), cloud-services-team, Patch-For-Review: Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623126 (taavi) Open→Resolved
[11:55:06] PAWS: remove unused image_name var - https://phabricator.wikimedia.org/T359924#9623128 (github-toolforge-bot) vivian-rook opened https://github.com/toolforge/paws/pull/389
[11:57:16] Grid-Engine-to-K8s-Migration, Growth-Team, Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9623133 (taavi)
[11:57:23] Toolforge (Quota-requests), cloud-services-team, Patch-For-Review: Request increased quota for eranbot Toolforge tool - https://phabricator.wikimedia.org/T359923#9623134 (taavi)
[12:08:33] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: wmcs.toolforge.k8s.prepare_upgrade: be more flexible checking for deb package components - https://phabricator.wikimedia.org/T359927 (aborrero) NEW
[12:10:28] (InstanceDown) firing: Project tools instance tools-prometheus-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:14:24] Grid-Engine-to-K8s-Migration, Growth-Team, Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9623220 (TheresNoTime) Have moved arwiki, enwiki, eswiki, frwiki, simplewiki to jobs following the template above. `lang=sh tools.eranbo...
[12:27:27] (ProbeDown) firing: Service toolsbeta-proxy-3:443 has failed probes (http_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-proxy-3:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:34:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:37:27] (ProbeDown) resolved: Service toolsbeta-proxy-3:443 has failed probes (http_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-proxy-3:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:39:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:40:28] (InstanceDown) resolved: Project tools instance tools-prometheus-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:50:57] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on
metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[13:05:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:10:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:12:00] Toolforge, cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: Toolforge: Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623352 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/...
[13:13:35] Toolforge, Patch-For-Review: [toolforge-webservice] Remove old webservice-runner code - https://phabricator.wikimedia.org/T358320#9623353 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/31 Remove grid engine support
[13:14:39] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: Toolforge: Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623357 (taavi)
[13:16:55] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4): Archive grid engine related infrastructure tools - https://phabricator.wikimedia.org/T359934 (taavi) NEW
[13:18:51] Toolforge, Projects-Cleanup: Archive grid engine related Gerrit repositories - https://phabricator.wikimedia.org/T359935 (taavi) NEW
[13:19:28] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: [toolsbeta,infra] upgrade kubernetes to 1.24 - https://phabricator.wikimedia.org/T359638#9623389 (dcaro)
[13:19:48] Toolforge (Toolforge iteration 07), Patch-For-Review, User-aborrero: [jobs-api] introduce OpenAPI to jobs framework - https://phabricator.wikimedia.org/T356523#9623390 (dcaro)
[13:20:14] Toolforge (Toolforge iteration 07): [toolforge-cli,jobs-cli,builds-cli,envvars-cli] Explore OpenAPI SDK tooling for client consolidation - https://phabricator.wikimedia.org/T356261#9623391 (dcaro)
[13:20:41] Toolforge (Toolforge iteration 07), Patch-For-Review, User-aborrero: [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9623392 (dcaro)
[13:21:04] Toolforge (Toolforge iteration 07), cloud-services-team, User-aborrero: [cookbook] wmcs.toolforge.k8s.prepare_upgrade: be more flexible checking for deb package components -
https://phabricator.wikimedia.org/T359927#9623393 (dcaro)
[13:21:22] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623394 (dcaro)
[13:21:29] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623395 (dcaro) p:Triage→High
[13:22:51] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4): [infra] Archive grid engine related infrastructure tools - https://phabricator.wikimedia.org/T359934#9623396 (dcaro) p:Triage→High
[13:24:23] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623400 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool...
[13:25:24] (PS1) Majavah: debian: Remove unused override [labs/toollabs] - https://gerrit.wikimedia.org/r/1010516
[13:25:24] (PS1) Majavah: Remove jobutils binary package [labs/toollabs] - https://gerrit.wikimedia.org/r/1010517 (https://phabricator.wikimedia.org/T314664)
[13:25:26] (PS1) Majavah: misctools: Remove oge-crontab script [labs/toollabs] - https://gerrit.wikimedia.org/r/1010518 (https://phabricator.wikimedia.org/T314664)
[13:27:15] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9623411 (MBH) Thanks. I merged your patch, successfully built all projects, stopped and started webservice (buildservice mount=all) and now all tools is inaccessible with ER...
[13:27:37] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623423 (taavi)
[13:29:08] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623430 (taavi)
[13:29:30] Toolforge, cloud-services-team: Front proxy can keep bad routing info for webservices previously running on the grid engine - https://phabricator.wikimedia.org/T267369#9623431 (taavi) Open→Declined
[13:29:48] Toolforge, cloud-services-team: Develop or expand grid troubleshooting playbook - https://phabricator.wikimedia.org/T218139#9623432 (taavi) Open→Declined
[13:31:33] Cloud-VPS (Ubuntu Trusty Deprecation), Toolforge, cloud-services-team (Kanban), Epic: Upgrade the tools gridengine system - https://phabricator.wikimedia.org/T199271#9623441 (taavi)
[13:31:36] Toolforge, cloud-services-team (Kanban), User-bd808: NFS issue affecting Toolforge SGE master - https://phabricator.wikimedia.org/T218038#9623433 (taavi)
[13:31:53] Toolforge, cloud-services-team: Script dependency appears to not exist when script is run in job grid even though dependency is otherwise installed - https://phabricator.wikimedia.org/T197997#9623437 (taavi) Open→Declined Declining this Grid Engine related task.
[13:32:26] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9623450 (dcaro) >>! In T319883#9623411, @MBH wrote: > Thanks. I merged your patch, successfully built all projects, stopped and started webservice (buildservice mount=all) a...
[13:33:14] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9623451 (dcaro) I see, the compile script did not work as expected xd ` tools.mbh@tools-sgebastion-10:~$ toolforge build logs ... [step-build] 2024-03-12T13:06:00.986898628Z...
[13:33:23] (CR) Majavah: [C: +2] debian: Remove unused override [labs/toollabs] - https://gerrit.wikimedia.org/r/1010516 (owner: Majavah)
[13:34:42] (Merged) jenkins-bot: debian: Remove unused override [labs/toollabs] - https://gerrit.wikimedia.org/r/1010516 (owner: Majavah)
[13:35:20] Toolforge (Toolforge iteration 07), Patch-For-Review, User-aborrero: [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9623453 (Slst2020) @dcaro what is the `base.yaml` i...
[13:39:25] Toolforge (Toolforge iteration 07), Patch-For-Review, User-aborrero: [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9623462 (dcaro) >>! In T354745#9623453, @Slst2020 w...
[13:42:00] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9623484 (dcaro) @MBH this is the fix https://github.com/Saisengen/wikibots/pull/6 I rebuilt your webservice from that branch, and it's working as expected (I think, please...
[13:42:49] Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623487 (Magnus)
[13:43:33] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9623493 (dcaro) Oh, and be very very careful with the test.cs webservice, I'd recommend removing it as soon as you finish testing, as it might expose secrets.
[13:44:11] PAWS: remove unused image_name var - https://phabricator.wikimedia.org/T359924#9623498 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/389
[13:44:21] vivian-rook closed https://github.com/toolforge/paws/pull/389
[13:44:33] PAWS: remove unused image_name var - https://phabricator.wikimedia.org/T359924#9623499 (rook) Open→Resolved
[13:47:46] (PS1) Majavah: Remove Grid Engine support [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1010520
[13:48:10] (PS2) Majavah: Remove Grid Engine support [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1010520 (https://phabricator.wikimedia.org/T314664)
[13:48:34] Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623502 (dcaro) Yes, you can install other packages using https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#Installing_Apt_packages Note though that the binaries will not be installed in the usual path, so...
[13:48:51] Toolforge (Toolforge iteration 07), cloud-services-team (FY2023/2024-Q3-Q4), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623504 (taavi)
[13:50:32] Toolforge, Projects-Cleanup: [infra] Archive grid engine related Gerrit repositories - https://phabricator.wikimedia.org/T359935#9623506 (dcaro) p:Triage→High
[13:50:53] Toolforge: [envvars-cli] Enable use of `toolforge envvar` managed data on bastions - https://phabricator.wikimedia.org/T358537#9623509 (dcaro) p:Triage→Medium
[13:50:57] Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623511 (dcaro) p:Triage→Low
[13:52:05] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9623516 (MBH) Thanks, looks like it works now (I still have some questions. but will ask them later). Could you also help another ruwiki user with transferring his tool? T3...
[13:54:04] Grid-Engine-to-K8s-Migration: Migrate wikisaurusbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320164#9623519 (dcaro) @Wikisaurus2 thanks for having your code in a public repo! :) I'll give it a look and suggest some changes.
[13:56:23] Toolforge (Toolforge iteration 07): Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938 (taavi) NEW
[13:56:43] Toolforge (Toolforge iteration 07): Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938#9623552 (taavi) a:taavi
[13:56:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:57:25] Toolforge (Toolforge iteration 07): Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938#9623538 (taavi) Open→In progress
[14:01:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:01:57] Grid-Engine-to-K8s-Migration, Growth-Team, Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9623568 (TheresNoTime) a:TheresNoTime Grabbing this to get it across the line hopefully :-)
[14:03:09] Toolforge (Toolforge iteration 07), Patch-For-Review: Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938#9623579 (CodeReviewBot) taavi merged https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/19 Do not tr...
[14:03:40] 10Toolforge (Toolforge iteration 07), 13Patch-For-Review: Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938#9623580 (10taavi) ` MariaDB [(none)]> show databases like 's55567__%'; Empty set (0.006 sec) ` No databases to archive, so I just del... [14:03:58] 10Toolforge (Toolforge iteration 07), 13Patch-For-Review: Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938#9623576 (10CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/19 Do not tr... [14:06:10] 10Toolforge (Toolforge iteration 07), 13Patch-For-Review: Disable-tool is unable to archive databases for tools.totally-not-xikimania - https://phabricator.wikimedia.org/T359938#9623583 (10taavi) 05In progress→03Resolved [14:06:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance tools-puppetdb-2 on project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [14:11:02] 10Toolforge (Toolforge iteration 07), 10cloud-services-team (FY2023/2024-Q3-Q4), 05Goal, 13Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9623595 (10taavi) [14:20:10] 05Grid-Engine-to-K8s-Migration, 06Growth-Team, 10Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9623628 (10TheresNoTime) Backed up `jobs.yaml` to `jobs.backup.20240312.yaml` and made the following change to `jobs.yaml`: `lang=diff dif... [14:24:59] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623653 (10Magnus) I added an [[ https://github.com/magnusmanske/mixnmatch_rs/blob/main/Aptfile | Aptfile ]] as described and rebuild the image, but: ` tools.mix-n-match@tools-sgebastion-10:~/mixnmatch_rs$ toolforge jobs ru... 
[14:25:28] (PuppetAgentFailure) firing: Puppet agent failure detected on instance tools-puppetserver-01 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [14:34:09] 05Grid-Engine-to-K8s-Migration, 06Growth-Team, 10Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9623696 (10TheresNoTime) p:05Triage→03High the simplewiki/arwiki errors seem to have disappeared, all jobs are running, nothing is on t... [14:34:47] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623701 (10Magnus) Ah, I found `/layers/fagiani_apt/apt/usr/bin/php8.1`, I guess it's just not symlinked anywhere? [14:37:12] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623712 (10dcaro) >>! In T359937#9623701, @Magnus wrote: > Ah, I found `/layers/fagiani_apt/apt/usr/bin/php8.1`, I guess it's just not symlinked anywhere? Might be yes :/, we don't run the deb package scripts (as we are ju... [14:40:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance tools-puppetserver-01 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [14:49:49] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623782 (10Magnus) Would it be possible to add a symlink in `$PATH` somewhere? 
`php` is stable, `php8.1` maybe less so [14:50:42] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:51:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance tools-puppetdb-2 on project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [14:55:42] (CloudVPSDesignateLeaks) firing: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:56:18] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623827 (10dcaro) >>! In T359937#9623782, @Magnus wrote: > Would it be possible to add a symlink in `$PATH` somewhere? `php` is stable, `php8.1` maybe less so The issue there is that we would not know which binary to simli... 
[14:59:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:00:42] (CloudVPSDesignateLeaks) resolved: (5) Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [15:04:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:12:47] (03PS1) 10Elukey: Add fake Docker secret config for Dragonfly on ml-serve k8s [labs/private] - 10https://gerrit.wikimedia.org/r/1010534 (https://phabricator.wikimedia.org/T359416) [15:15:14] (03CR) 10Elukey: [V: 03+2 C: 03+2] Add fake Docker secret config for Dragonfly on ml-serve k8s [labs/private] - 10https://gerrit.wikimedia.org/r/1010534 (https://phabricator.wikimedia.org/T359416) (owner: 10Elukey) [15:31:53] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623974 (10Magnus) Looks like the php setup is incomplete. Running `php8.1 path/to/my/php/script` dies with `Fatal error: Uncaught Error: Class "mysqli" not found`. It looks like, for this tool, the buildservice is not fun... [15:34:04] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623987 (10taavi) >>! In T359937#9623974, @Magnus wrote: > Looks like the php setup is incomplete. 
Running `php8.1 path/to/my/php/script` dies with `Fatal error: Uncaught Error: Class "mysqli" not found`. The PHP mysql ext... [15:40:51] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623990 (10Magnus) Thanks @taavi Is there a list of packages required for a toolforge-equivalent php setup? [15:43:43] 05Grid-Engine-to-K8s-Migration: Migrate wikisaurusbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320164#9623991 (10dcaro) @Wikisaurus2 I have sent a PR https://github.com/wikisaurus/wikisaurusbot/pull/1 that should work, note the instructions in the README.md file. Wi... [15:48:33] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9623995 (10Magnus) @taavi I added `php-mysql` to the `Aptfile` and rebuild the image, same error. I undestand and applaud the drive to use Docker and k8s and buildservices etc but now I, as a volunteer, have to invest cons... [15:55:49] 05Grid-Engine-to-K8s-Migration, 06Growth-Team, 10Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9624000 (10komla) >>! In T306888#9623696, @TheresNoTime wrote: > the simplewiki/arwiki errors seem to have disappeared, all jobs are runnin... [15:59:46] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9624004 (10MBH) Remind me, how to rebuild only one tool? Will it be faster than rebuilding all tools and will other tools remain in image after that? Another question: why ht... 
[16:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [16:06:06] 10Toolforge (Toolforge iteration 07), 13Patch-For-Review, 15User-aborrero: [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9624011 (10Slst2020) @dcaro thanks! Another question:... [16:06:11] (03PS1) 10Arturo Borrero Gonzalez: toolforge.k8s.prepare_upgrade: be more flexible when checkin repository things [cloud/wmcs-cookbooks] - 10https://gerrit.wikimedia.org/r/1010558 (https://phabricator.wikimedia.org/T359927) [16:07:28] (PuppetAgentFailure) firing: Puppet agent failure detected on instance tools-puppetdb-2 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [16:13:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:17:24] 05Grid-Engine-to-K8s-Migration, 06Growth-Team, 10Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9624065 (10JJMC89) [16:18:22] 10Wikibugs: wikibugs requests.exceptions.ConnectionError - https://phabricator.wikimedia.org/T359953 (10TheresNoTime) 03NEW [16:18:41] (CloudVPSDesignateLeaks) firing: (4) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - 
https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:19:09] 10Wikibugs: wikibugs requests.exceptions.ConnectionError - https://phabricator.wikimedia.org/T359953#9624082 (10TheresNoTime) [16:22:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance tools-puppetdb-2 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [16:23:41] (CloudVPSDesignateLeaks) firing: (4) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:28:41] (CloudVPSDesignateLeaks) resolved: (4) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:35:41] 10Wikibugs: wikibugs requests.exceptions.ConnectionError - https://phabricator.wikimedia.org/T359953#9624139 (10bd808) [16:38:52] 10Cloud Services Proposals, 10Toolforge, 15User-aborrero: Decision request - Toolforge external infrastructure domain usage - https://phabricator.wikimedia.org/T306039#9624155 (10taavi) 05Open→03Resolved a:03taavi There have been no objections for the `.svc.toolforge.org` name so I'm declaring that the... [16:40:24] 10Toolforge: wikibugs requests.exceptions.ConnectionError - https://phabricator.wikimedia.org/T359953#9624162 (10bd808) This is `toolforge jobs logs -f` failing and not anything actually related to wikibugs as an application. A workaround for this general class of problem tailing logs as well as it's limitation... 
[16:41:02] 10Toolforge: `toolforge jobs logs -f` crashes after a while with internal k8s api errors - https://phabricator.wikimedia.org/T359953#9624164 (10bd808) [16:44:47] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9624197 (10MBH) And what the difference between TOOL_REPLICA and TOOL_TOOLSDB envvars? They are equal now for my tool. And (this is screenshot from building log) why some too... [16:54:11] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9624240 (10dcaro) >>! In T319883#9624004, @MBH wrote: > Remind me, how to rebuild only one tool? Will it be faster than rebuilding all tools and will other tools remain in ima... [16:54:54] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9624242 (10bd808) >>! In T319883#9624197, @MBH wrote: > And what the difference between TOOL_REPLICA and TOOL_TOOLSDB envvars? They are equal now for my tool. Future proofing... [16:57:01] 10Toolforge (Toolforge iteration 07), 13Patch-For-Review, 15User-aborrero: [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9624257 (10dcaro) >>! In T354745#9624011, @Slst2020 w... [16:58:07] 10Toolforge: `toolforge jobs logs -f` crashes after a while with internal k8s api errors - https://phabricator.wikimedia.org/T359953#9624268 (10TheresNoTime) ah, thank you @bd808! [17:00:27] 10Toolforge: Rust (buildservice) requires PHP - https://phabricator.wikimedia.org/T359937#9624277 (10dcaro) >>! In T359937#9623995, @Magnus wrote: > @taavi I added `php-mysql` to the `Aptfile` and rebuild the image, same error. > > I undestand and applaud the drive to use Docker and k8s and buildservices etc bu... 
[17:03:29] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9624285 (10MBH) > are you sending a newline at the end of your POST data? First time I didn't add it, now I add it and tool freezes too. As far as I know, all of my db connec... [17:07:15] 10Toolforge: `toolforge jobs ...` should use named loggers and always show timestamps and logger names - https://phabricator.wikimedia.org/T359963 (10bd808) 03NEW [17:11:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [17:14:21] 05Grid-Engine-to-K8s-Migration, 06Growth-Team, 10Community-Tech (CommTech-Kanban): Migrate ERANBOT project off of Grid Engine - https://phabricator.wikimedia.org/T306888#9624348 (10MusikAnimal) 05Open→03Resolved All seems to be working great! Thank you @JJMC89 and @TheresNoTime for tending to this, and t... [17:17:44] 05Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9624356 (10bd808) >>! In T319883#9624285, @MBH wrote: > As far as I know, all of my db connections are to wiki db replicas. If replicas and toolsdb are two different things, w... 
[17:21:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [17:26:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [17:31:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [17:35:01] (03CR) 10BryanDavis: [C: 03+1] Remove jobutils binary package [labs/toollabs] - 10https://gerrit.wikimedia.org/r/1010517 (https://phabricator.wikimedia.org/T314664) (owner: 10Majavah) [17:35:09] (03CR) 10BryanDavis: [C: 03+1] misctools: Remove oge-crontab script [labs/toollabs] - 10https://gerrit.wikimedia.org/r/1010518 (https://phabricator.wikimedia.org/T314664) (owner: 10Majavah) [17:37:28] (PuppetAgentFailure) firing: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [17:49:29] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Show when a task has been closed as a duplicate - https://phabricator.wikimedia.org/T128868#9624545 (10bd808) [17:49:55] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Use case-insensitive sort for tags added to the irc log - https://phabricator.wikimedia.org/T90339#9624558 (10bd808) 
https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 [17:50:06] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Print events in closed tasks in grey - https://phabricator.wikimedia.org/T140881#9624547 (10bd808) https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 [17:50:25] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Hide User-XX projects from wikibugs output - https://phabricator.wikimedia.org/T180293#9624567 (10bd808) [17:51:37] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Wikibugs should not accidentally ping SREs by sending text "# page" - https://phabricator.wikimedia.org/T281105#9624565 (10bd808) https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 [17:52:28] (PuppetAgentFailure) resolved: Puppet agent failure detected on instance toolsbeta-puppetdb-02 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [17:57:23] 10Striker: Dev environment Keystone crashes - https://phabricator.wikimedia.org/T359972 (10bd808) 03NEW [18:06:31] 10Striker: Striker dev env gitlab root credentials do not work - https://phabricator.wikimedia.org/T355525#9624614 (10bd808) I wonder if the root issue is that the `gitlab_rails['initial_root_password'] = 'docker-gitlab'` magic at https://gerrit.wikimedia.org/r/plugins/gitiles/labs/striker/+/2193f609db964ad97dd5... [18:15:54] 10Striker: Striker dev env gitlab root credentials do not work - https://phabricator.wikimedia.org/T355525#9624627 (10bd808) >>! In T355525#9624614, @bd808 wrote: > I wonder if the root issue is that the `gitlab_rails['initial_root_password'] = 'docker-gitlab'` magic at https://gerrit.wikimedia.org/r/plugins/git... 
[18:19:46] 10Wikibugs, 15User-bd808: Wikibugs testing task - https://phabricator.wikimedia.org/T90594#9624670 (10bd808) test [18:33:22] (HAProxyBackendUnavailable) firing: HAProxy service nova-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [18:46:34] 10Toolforge: Upgrade Toolforge Kubernetes to version 1.27 - https://phabricator.wikimedia.org/T359641#9624721 (10Pppery) [18:51:05] 10Tools: 'hoiscript' tool uses an unreasonable amount of disk space - https://phabricator.wikimedia.org/T349913#9624742 (10Hoi) Sorry, I was too busy to process them. I am uploading them to Commons this week. [19:00:29] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [19:08:08] 10Tools, 06Tech-Docs-Team, 07Documentation, 03Wikimedia-Hackathon-2024: [Hackathon 2024] Improve technical documentation of tools - https://phabricator.wikimedia.org/T358040#9624776 (10TBurmeister) Status update: I've been working on a draft of some tool-specific documentation guidelines: https://www.media... 
[19:13:19] 10VPS-Projects, 06cloud-services-team, 10Puppet (Puppet 7.0): Migrate Puppet servers in Cloud Services team managed projects to Puppet 7 - https://phabricator.wikimedia.org/T351453#9624796 (10Andrew) [19:18:22] (HAProxyBackendUnavailable) resolved: HAProxy service nova-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [19:55:40] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Hide User-XX projects from wikibugs output - https://phabricator.wikimedia.org/T180293#9624953 (10CodeReviewBot) bd808 merged https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 Enhance Phorge event rendering [19:55:44] 10Wikibugs, 13Patch-For-Review, 15User-bd808: Show when a task has been closed as a duplicate - https://phabricator.wikimedia.org/T128868#9624954 (10CodeReviewBot) bd808 merged https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/17 Enhance Phorge event rendering [20:03:05] 10Wikibugs, 13Patch-For-Review: 14Use case-insensitive sort for tags added to the irc log - 14https://phabricator.wikimedia.org/T90339#9624995 (10bd808) 05In progress→03Resolved [20:04:35] 10Wikibugs, 13Patch-For-Review: Wikibugs should not accidentally ping SREs by sending text "# page" - https://phabricator.wikimedia.org/T281105#9624996 (10bd808) Verifying fix: `# page == # page` [20:04:51] 10Wikibugs, 13Patch-For-Review: 14Wikibugs should not accidentally ping SREs by sending text "# page" - 14https://phabricator.wikimedia.org/T281105#9624997 (10bd808) 05In progress→03Resolved [20:05:11] 10Wikibugs, 13Patch-For-Review: 14Hide User-XX projects from wikibugs output - 14https://phabricator.wikimedia.org/T180293#9624998 (10bd808) 05In progress→03Resolved [20:05:44] 10Wikibugs, 13Patch-For-Review: 14Print events in closed tasks in grey - 14https://phabricator.wikimedia.org/T140881#9624999 (10bd808) 05In 
progress→03Resolved [20:06:14] 10Wikibugs, 13Patch-For-Review: 14Show when a task has been closed as a duplicate - 14https://phabricator.wikimedia.org/T128868#9625000 (10bd808) 05In progress→03Resolved [20:30:51] 05Grid-Engine-to-K8s-Migration: Migrate srwiki from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320057#9625015 (10dungodung) I did the switchover -- commented out the two crontab jobs and created new ones with the toolforge-jobs command. Hopefully this should do it. Tonight... [20:33:46] 10Wikibugs: Update irc task to use AsyncRedisQueue - https://phabricator.wikimedia.org/T359982 (10bd808) 03NEW [20:34:16] 10Wikibugs: Update irc task to use AsyncRedisQueue - https://phabricator.wikimedia.org/T359982#9625028 (10bd808) 05Open→03In progress p:05Triage→03Medium a:03bd808 [20:34:30] 10Wikibugs: Update irc task to use AsyncRedisQueue - https://phabricator.wikimedia.org/T359982#9625032 (10CodeReviewBot) bd808 updated https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/15 Revamp irc bot plugin and queue [20:49:23] 10Wikibugs, 13Patch-For-Review: 14Show when a task has been closed as a duplicate - 14https://phabricator.wikimedia.org/T128868#9625053 (10bd808) 14Example from the wild: `lang=irc [20:41] < wb-test> Abstract Wikipedia team, Internet-Archive, Wikifunctions, WikiLambda: ZObject label is displayed in wron... 
[21:04:08] (03PS1) 10Umherirrender: repositories: Add some "performance" repos [labs/libraryupgrader/config] - 10https://gerrit.wikimedia.org/r/1010647 [21:04:47] (03CR) 10CI reject: [V: 04-1] repositories: Add some "performance" repos [labs/libraryupgrader/config] - 10https://gerrit.wikimedia.org/r/1010647 (owner: 10Umherirrender) [21:06:22] (03PS2) 10Umherirrender: repositories: Add some "performance" repos [labs/libraryupgrader/config] - 10https://gerrit.wikimedia.org/r/1010647 [21:37:45] (03PS1) 10Umherirrender: build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0 [labs/tools/stewardbots] - 10https://gerrit.wikimedia.org/r/1010662 [21:39:31] (03PS1) 10Tim Starling: Fix broken vim modelines [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/1010664 [21:43:00] (03PS2) 10Tim Starling: Fix broken vim modelines [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/1010664 [21:52:41] (03PS1) 10Tim Starling: Add procps to base images [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/1010690 [21:52:56] (SystemdUnitDown) firing: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [21:57:56] (SystemdUnitDown) resolved: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [22:00:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [22:26:38] (03CR) 10DannyS712: "Looks like Umherirrender has mostly handled these but for any that were missed" [labs/libraryupgrader/config] - 10https://gerrit.wikimedia.org/r/993504 (https://phabricator.wikimedia.org/T353909) (owner: 10DannyS712) [22:41:34] (03CR) 10BryanDavis: "https://vimdoc.sourceforge.net/htmldoc/options.html#modeline" [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/1010664 (owner: 10Tim Starling) [22:45:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance paws-puppetmaster-2 on project paws - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources [23:00:21] (03CR) 10Jforrester: [C: 04-1] "Until https://gitlab.wikimedia.org/repos/ci-tools/libup/-/merge_requests/23 lands this will probably fail. We already have the phan upgrad" [labs/libraryupgrader/config] - 10https://gerrit.wikimedia.org/r/993504 (https://phabricator.wikimedia.org/T353909) (owner: 10DannyS712)