[00:03:08] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10664527 (Ladsgroup) >>! In T389540#10663104, @Tkarcher wrote: >>>! In T389540#10659244, @bd808 wrote: >> Problem 0 here is that I can't find anything that looks remotely like a license inside of /d...
[00:13:29] Tool-fault-tolerance: Fault-tolerance tool should have a backend option - https://phabricator.wikimedia.org/T389612#10664550 (Ladsgroup) >>! In T389612#10661875, @MatthewVernon wrote: > My other dumb idea would be: netbox knows the answer, and has http endpoints, could that be the solution? netbox is not op...
[00:17:41] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10664553 (AntiCompositeNumber) That provision was introduced in May 2023, and we know that FNDE logged in to Toolforge in Jan 2024 (T320003#9474901). That should be sufficient for the default licens...
[03:41:08] cloud-services-team, Toolforge: I can't login to Toolforge using WinSCP - https://phabricator.wikimedia.org/T389704 (MBH) NEW
[03:42:00] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[03:47:10] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-68
[03:48:20] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,designate
[03:48:48] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,designate
[03:51:25] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-68
[03:55:21] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-44
[03:57:00] RESOLVED: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[04:00:52] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-44
[04:33:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-68 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[04:35:27] cloud-services-team, Toolforge: I can't login to Toolforge using WinSCP - https://phabricator.wikimedia.org/T389704#10664695 (Peachey88)
[04:35:36] cloud-services-team, Toolforge: I can't login to Toolforge using WinSCP - https://phabricator.wikimedia.org/T389704#10664697 (MBH) I also can't login from my phone, using TotalCommander with SFTP plugin, it worked correctly before.{F58893440}
[04:37:46] cloud-services-team, Toolforge: mbh can't login to Toolforge - https://phabricator.wikimedia.org/T389704#10664698 (MBH)
[06:54:31] cloud-services-team, Toolforge: mbh can't login to Toolforge - https://phabricator.wikimedia.org/T389704#10664723 (Peachey88) When was the last time you were able to successfully signed in? I believe some of the older key signing types were removed a couple of months ago from memory.
[06:55:25] cloud-services-team, Toolforge: mbh can't login to Toolforge - https://phabricator.wikimedia.org/T389704#10664735 (MBH) Yesterday.
[07:24:37] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, Cloud-Services-Origin-Team, Cloud-Services-Worktype-Maintenance, Goal: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789#10664828 (Aklapper) In progress→Open Resetting task status from "In Progress" to "Ope...
[07:25:30] cloud-services-team, Toolforge: WMCS FY22/23 Q3: next steps in grid engine deprecation - https://phabricator.wikimedia.org/T327254#10664866 (Aklapper)
[07:25:34] cloud-services-team, Toolforge, Patch-For-Review: Toolforge: improve local kubernetes development setup - https://phabricator.wikimedia.org/T326789#10664861 (Aklapper) In progress→Open Resetting task status from "In Progress" to "Open" as this task has been "in progress" for more than two years.
[07:26:26] Tool-refill: Toolforge: refill doesn't work on Wikipedia language versions other than English - https://phabricator.wikimedia.org/T295327#10664891 (Aklapper) In progress→Open Resetting task status from "In Progress" to "Open" as this task has been "in progress" for more than two years.
[08:31:46] cloud-services-team, Toolforge: mbh can't login to Toolforge - https://phabricator.wikimedia.org/T389704#10665030 (MBH) I assumed that problem may be in old key. I have generated a new RSA key using PuttyGEN and uploaded a public key to https://toolsadmin.wikimedia.org/profile/settings/ssh-keys/. I tried...
[09:57:31] Cloud-VPS (Quota-requests): Add new flavor for dwl project and increase quota - https://phabricator.wikimedia.org/T389711 (Giftpflanze) NEW
[12:14:46] cloud-services-team, Toolforge: dev.toolforge.org unreachable - https://phabricator.wikimedia.org/T389717 (Multichill) NEW
[12:17:32] cloud-services-team, Toolforge: dev.toolforge.org unreachable - https://phabricator.wikimedia.org/T389717#10665270 (LucasWerkmeister) `lang=shell-session root@tools-bastion-12:~# systemctl --failed UNIT LOAD ACTIVE SUB DESCRIPTION ● sssd-pam.service loaded failed failed SSSD PA...
[14:57:35] cloud-services-team, Toolforge: mbh can't login to Toolforge - https://phabricator.wikimedia.org/T389704#10665309 (MBH) Works now with an old keyfile. I'm not closing this task because maybe someone could explain what was this?
[15:06:50] cloud-services-team, Toolforge: dev.toolforge.org unreachable - https://phabricator.wikimedia.org/T389717#10665314 (taavi) This seems to have fixed itself?
[15:15:02] cloud-services-team, Toolforge: dev.toolforge.org unreachable - https://phabricator.wikimedia.org/T389717#10665331 (LucasWerkmeister) Seems like it 🤷 journal indicates SSSD was started again 13:11:45 UTC for no reason that I can make out.
[16:15:59] cloud-services-team, Toolforge: dev.toolforge.org unreachable - https://phabricator.wikimedia.org/T389717#10665405 (Stephonjeffries19) →Duplicate dup:T389721
[17:02:28] cloud-services-team, Toolforge: dev.toolforge.org unreachable - https://phabricator.wikimedia.org/T389717#10665460 (Pppery) Duplicate→Open