[02:12:44] <wikibugs>	 (PS1) Dzahn: secrets: add fake SSH private key for zuul [labs/private] - https://gerrit.wikimedia.org/r/1161093 (https://phabricator.wikimedia.org/T395938)
[02:15:43] <wikibugs>	 (CR) Dzahn: [V:+2 C:+2] secrets: add fake SSH private key for zuul [labs/private] - https://gerrit.wikimedia.org/r/1161093 (https://phabricator.wikimedia.org/T395938) (owner: Dzahn)
[02:15:51] <wikibugs>	 (PS2) Dzahn: secrets: add fake SSH private key for zuul [labs/private] - https://gerrit.wikimedia.org/r/1161093 (https://phabricator.wikimedia.org/T395938)
[02:15:55] <wikibugs>	 (CR) Dzahn: [V:+2] secrets: add fake SSH private key for zuul [labs/private] - https://gerrit.wikimedia.org/r/1161093 (https://phabricator.wikimedia.org/T395938) (owner: Dzahn)
[03:21:37] <wikibugs>	 (PS1) Andrew Bogott: Comment back in cinder ldap passwords [labs/private] - https://gerrit.wikimedia.org/r/1161116
[03:22:02] <wikibugs>	 (CR) Andrew Bogott: [V:+2 C:+2] Comment back in cinder ldap passwords [labs/private] - https://gerrit.wikimedia.org/r/1161116 (owner: Andrew Bogott)
[04:25:22] <jinxer-wm>	 FIRING: [12x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[04:26:17] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1047 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:26:22] <jinxer-wm>	 FIRING: [2x] HAProxyServiceUnavailable: HAProxy service neutron-api_backend has no available backends on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyServiceUnavailable
[04:26:32] <wikibugs>	 cloud-services-team: HAProxyServiceUnavailable - https://phabricator.wikimedia.org/T397390 (phaultfinder) NEW
[04:27:17] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1047 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:28:35] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1071 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:29:35] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1071 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:30:03] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1073 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:30:17] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1047 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:30:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit nova-fullstack.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:31:03] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1073 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:31:17] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1047 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:33:37] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1069 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:33:37] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1067 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:34:03] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1044 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:34:37] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1069 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:34:37] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1067 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:35:03] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1044 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:36:23] <icinga-wm>	 PROBLEM - nova-compute proc minimum on cloudvirt1056 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:37:23] <icinga-wm>	 RECOVERY - nova-compute proc minimum on cloudvirt1056 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[04:40:17] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services
[04:45:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:48:07] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services
[04:50:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:55:52] <jinxer-wm>	 RESOLVED: [24x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[04:57:52] <jinxer-wm>	 RESOLVED: [9x] HAProxyServiceUnavailable: HAProxy service heat-api_backend has no available backends on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyServiceUnavailable
[06:52:49] <wikibugs>	 VPS-project-Phabricator, collaboration-services: Requesting manual activation of phabricator.wmcloud.org accounts - https://phabricator.wikimedia.org/T397280#10930493 (A_smart_kitten) Thank you @dzahn! All seems to work okay :)
[07:49:14] <wikibugs>	 cloud-services-team: HAProxyServiceUnavailable - https://phabricator.wikimedia.org/T397390#10930715 (dcaro) @Andrew This seems fixed now, though it happened during your working hours I think and I see maybe it's related to https://gerrit.wikimedia.org/r/c/operations/puppet/+/1161148 ?
[07:49:37] <wikibugs>	 cloud-services-team: HAProxyServiceUnavailable - https://phabricator.wikimedia.org/T397390#10930716 (dcaro) p:Triage→Low
[07:53:26] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit nova-fullstack.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T397357#10930735 (dcaro) This is still happening, it seems to be timing out when waiting for the reverse DNS cleanup:  ` Jun 19 0...
[07:58:08] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit backup_cinder_volumes.service on node cloudbackup1001-dev has been failing for more than two hours. - https://phabricator.wikimedia.org/T397105#10930752 (dcaro) Open→Resolved a:dcaro
[07:58:54] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit nova-fullstack.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T397357#10930755 (dcaro) It seems it has been flapping very often lately (https://grafana-rw.wikimedia.org/d/ebJoA6VWz/wmcs-opens...
[08:05:43] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit nova-fullstack.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T397357#10930852 (dcaro) Got specially choppy in the last couple of days: {F62387966}
[08:05:54] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit nova-fullstack.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T397357#10930864 (dcaro) Cleaned up all the existing VMs, trying to get a clean run
[08:16:10] <wikibugs>	 wikitech.wikimedia.org: Wikitech double redirect bot needs new SUL OAuth credentials after Wikitech authn changes - https://phabricator.wikimedia.org/T376224#10930958 (taavi) Open→Resolved a:taavi
[08:16:30] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit nova-fullstack.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T397357#10930961 (dcaro) Hmm.... the fullstack logs did successfully remove the VM, but the DNS records are still there for a dif...
[08:28:12] <wikibugs>	 (open) dcaro: README: add dev notes about authentication [repos/cloud/cloud-vps/horizon/deploy] (support_podman) - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy/-/merge_requests/6
[08:28:27] <wikibugs>	 (update) dcaro: README: add dev notes about authentication [repos/cloud/cloud-vps/horizon/deploy] (support_podman) - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy/-/merge_requests/6
[08:29:18] <wikibugs>	 (update) dcaro: makefile: support podman [repos/cloud/cloud-vps/horizon/deploy] (use_markdown) - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy/-/merge_requests/5
[08:29:22] <wikibugs>	 (update) dcaro: makefile: support podman [repos/cloud/cloud-vps/horizon/deploy] (use_markdown) - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy/-/merge_requests/5
[08:29:38] <wikibugs>	 (update) dcaro: README: add dev notes about authentication [repos/cloud/cloud-vps/horizon/deploy] (support_podman) - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy/-/merge_requests/6
[08:29:41] <wikibugs>	 (update) dcaro: README: use makrdown for nice presentation in gitlab [repos/cloud/cloud-vps/horizon/deploy] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy/-/merge_requests/4
[08:31:28] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit nova-fullstack.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T397357#10931044 (dcaro) Open→Resolved a:dcaro I cleaned up all the VMs, and ran the `wmcs-dnsleaks --delpoyment...
[08:33:12] <wikibugs>	 cloud-services-team, Horizon, Patch-For-Review: Horizon proxy tab Edit buttons not working - https://phabricator.wikimedia.org/T397272#10931063 (dcaro) Open→In progress p:Triage→Medium a:dcaro
[08:33:28] <wikibugs>	 cloud-services-team (FY2024/2025-Q3-Q4), Horizon, Patch-For-Review: Horizon proxy tab Edit buttons not working - https://phabricator.wikimedia.org/T397272#10931068 (dcaro)
[08:33:42] <wikibugs>	 cloud-services-team: SystemdUnitDown The systemd unit backup_cinder_volumes.service on node cloudbackup1002-dev has been failing for more than two hours. - https://phabricator.wikimedia.org/T397100#10931082 (dcaro) Open→Resolved a:dcaro This is fixed now
[08:35:44] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1024 logged kernel errors - https://phabricator.wikimedia.org/T396937#10931091 (dcaro) Open→Resolved a:dcaro Expected: ` root@cloudcephosd1024:~# journalctl -k -p err -- Journal begins at Sat 2025-06-14 21:29:07 UTC, ends at Thu 2025-06-19...
[08:36:01] <wikibugs>	 cloud-services-team: NovafullstackSustainedFailures Novafullstack tests have been failing for more than 5hours in eqiad - https://phabricator.wikimedia.org/T396934#10931097 (dcaro) Open→Resolved a:dcaro Cleaned up and restarted, and now it's working.
[08:37:56] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1015 logged kernel errors - https://phabricator.wikimedia.org/T396796#10931107 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:05] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1016 logged kernel errors - https://phabricator.wikimedia.org/T396801#10931111 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:10] <wikibugs>	 cloud-services-team: KernelErrors - https://phabricator.wikimedia.org/T396810#10931115 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:16] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1017 logged kernel errors - https://phabricator.wikimedia.org/T396832#10931120 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:25] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1018 logged kernel errors - https://phabricator.wikimedia.org/T396859#10931124 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:33] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1019 logged kernel errors - https://phabricator.wikimedia.org/T396909#10931128 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:39] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1020 logged kernel errors - https://phabricator.wikimedia.org/T396917#10931132 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:45] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1022 logged kernel errors - https://phabricator.wikimedia.org/T396921#10931136 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:38:53] <wikibugs>	 cloud-services-team: KernelErrors Server cloudcephosd1023 logged kernel errors - https://phabricator.wikimedia.org/T396929#10931151 (dcaro) Open→Resolved a:dcaro Expected: ` Jun 17 19:03:08 cloudcephosd1014 kernel: x86/cpu: VMX (outside TXT) disabled by BIOS...
[08:41:06] <wikibugs>	 cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol2010-dev:9100 - https://phabricator.wikimedia.org/T396769#10931156 (dcaro) Open→Resolved a:dcaro Not failing anymore.
[08:43:20] <wikibugs>	 Tool-query-chest, Wikidata, Wikidata Query UI: Use query-chest for short URLs when the w.wiki shortener fails for long queries - https://phabricator.wikimedia.org/T334893#10931160 (jhsoby)
[08:52:56] <wikibugs>	 cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, collaboration-services, GitLab (Infrastructure): Volume is stuck to deleted instance in devtools project - https://phabricator.wikimedia.org/T396739#10931174 (dcaro) p:Triage→High
[08:53:02] <wikibugs>	 cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, collaboration-services, GitLab (Infrastructure): Volume is stuck to deleted instance in devtools project - https://phabricator.wikimedia.org/T396739#10931176 (dcaro)
[08:53:13] <wikibugs>	 cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, collaboration-services, GitLab (Infrastructure): Volume is stuck to deleted instance in devtools project - https://phabricator.wikimedia.org/T396739#10931177 (dcaro) a:Andrew
[08:57:25] <wm-bot2>	 !log dcaro@acme toolsbeta-logging START - Cookbook wmcs.vps.create_project for project toolsbeta-logging in eqiad1 (T397339)
[08:57:26] <stashbot>	 wmbot~dcaro@acme: Unknown project "toolsbeta-logging"
[08:57:27] <stashbot>	 T397339: Request creation of toolsbeta-logging VPS project - https://phabricator.wikimedia.org/T397339
[08:57:30] <wm-bot2>	 !log dcaro@acme toolsbeta-logging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project toolsbeta-logging in eqiad1 (T397339)
[08:57:31] <stashbot>	 wmbot~dcaro@acme: Unknown project "toolsbeta-logging"
[08:57:33] <wm-bot2>	 !log dcaro@acme toolsbeta-logging START - Cookbook wmcs.vps.create_project for project toolsbeta-logging in eqiad1 (T397339)
[08:57:34] <stashbot>	 wmbot~dcaro@acme: Unknown project "toolsbeta-logging"
[08:57:44] <wm-bot2>	 !log dcaro@acme toolsbeta-logging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project toolsbeta-logging in eqiad1 (T397339)
[08:57:44] <stashbot>	 wmbot~dcaro@acme: Unknown project "toolsbeta-logging"
[08:59:24] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta-logging START - Cookbook wmcs.vps.create_project for project toolsbeta-logging in eqiad1 (T397339)
[08:59:24] <stashbot>	 dcaro@cloudcumin1001: Unknown project "toolsbeta-logging"
[09:00:03] <wikibugs>	 (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project toolsbeta-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/249 (https://phabricator.wikimedia.org/T397339)
[09:03:28] <wikibugs>	 (approved) taavi: projects: added project toolsbeta-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/249 (https://phabricator.wikimedia.org/T397339) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[09:03:34] <logmsgbot_cloud>	 dcaro@cloudcumin1001 create_project (PID 3616594) is awaiting input
[09:03:48] <wikibugs>	 (merge) dcaro: projects: added project toolsbeta-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/249 (https://phabricator.wikimedia.org/T397339) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[09:06:19] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta-logging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project toolsbeta-logging in eqiad1 (T397339)
[09:06:23] <stashbot>	 T397339: Request creation of toolsbeta-logging VPS project - https://phabricator.wikimedia.org/T397339
[09:08:28] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta-logging START - Cookbook wmcs.vps.create_project for project toolsbeta-logging in eqiad1 (T397339)
[09:09:29] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta-logging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project toolsbeta-logging in eqiad1 (T397339)
[09:09:48] <wikibugs>	 cloud-services-team (FY2024/2025-Q3-Q4), Data-Services, Data-Persistence: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10931248 (Marostegui) Thank you!
[09:34:16] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta-logging START - Cookbook wmcs.vps.create_project for project toolsbeta-logging in eqiad1 (T397339)
[09:34:19] <stashbot>	 T397339: Request creation of toolsbeta-logging VPS project - https://phabricator.wikimedia.org/T397339
[09:36:14] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta-logging END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project toolsbeta-logging in eqiad1 (T397339)
[11:18:30] <wikibugs>	 cloud-services-team, Bitu, Infrastructure-Foundations, LDAP: Allocate more available UNIX UIDs for human users - https://phabricator.wikimedia.org/T355663#10931620 (MoritzMuehlenhoff) >>! In T355663#10852835, @jhathaway wrote: > I'm not sure it is much of an issue, but that range overlaps with `s...
[11:22:34] <wikibugs>	 cloud-services-team, Bitu, Infrastructure-Foundations, LDAP: Allocate more available UNIX UIDs for human users - https://phabricator.wikimedia.org/T355663#10931634 (SLyngshede-WMF) Sounds good to me, 400.000 should last a pretty long time.   I still think that we should stop allocating uidNumbers...
[11:47:57] <wikibugs>	 (open) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[11:48:00] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[11:49:51] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[11:50:09] <wikibugs>	 (PS8) Slyngshede: Build: Update build system [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806
[11:50:42] <wikibugs>	 (CR) CI reject: [V:-1] Build: Update build system [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806 (owner: Slyngshede)
[11:52:47] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[11:55:26] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[11:57:19] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[11:58:35] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[12:01:07] <wikibugs>	 (update) taavi: Use separate project for log storage buckets [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/50 (https://phabricator.wikimedia.org/T396574)
[12:02:14] <wikibugs>	 cloud-services-team, Bitu, Infrastructure-Foundations, LDAP: Allocate more available UNIX UIDs for human users - https://phabricator.wikimedia.org/T355663#10931823 (MoritzMuehlenhoff) >>! In T355663#10931634, @SLyngshede-WMF wrote: > Sounds good to me, 400.000 should last a pretty long time. >  >...
[12:08:51] <wikibugs>	 Cloud-VPS (Project-requests): Request creation of toolsbeta-logging VPS project - https://phabricator.wikimedia.org/T397339#10931851 (taavi) Open→Resolved a:dcaro
[12:10:30] <wikibugs>	 Cloud-VPS (Project-requests): Request creation of <PROJECT-NAME> VPS project - https://phabricator.wikimedia.org/T397446 (taavi) NEW
[12:10:32] <wikibugs>	 Cloud-VPS (Project-requests): Request creation of <PROJECT-NAME> VPS project - https://phabricator.wikimedia.org/T397446#10931872 (taavi)
[12:10:40] <wikibugs>	 cloud-services-team, Toolforge, Patch-For-Review: Provision object storage volumes for Loki - https://phabricator.wikimedia.org/T396574#10931873 (taavi)
[12:10:52] <wikibugs>	 Cloud-VPS (Project-requests): Request creation of tools-logging VPS project - https://phabricator.wikimedia.org/T397446#10931874 (taavi)
[12:25:17] <wikibugs>	 (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/map-of-monuments] - https://gerrit.wikimedia.org/r/1161499 (owner: L10n-bot)
[12:32:46] <wikibugs>	 Cloud-VPS (Project-requests): Request creation of tools-logging VPS project - https://phabricator.wikimedia.org/T397446#10931921 (dcaro) +1
[12:33:06] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools-logging START - Cookbook wmcs.vps.create_project for project tools-logging in eqiad1 (T397446)
[12:33:07] <stashbot>	 dcaro@cloudcumin1001: Unknown project "tools-logging"
[12:33:08] <stashbot>	 T397446: Request creation of tools-logging VPS project - https://phabricator.wikimedia.org/T397446
[12:33:47] <wikibugs>	 (update) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project tools-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/250 (https://phabricator.wikimedia.org/T397446)
[12:33:51] <wikibugs>	 (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project tools-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/250 (https://phabricator.wikimedia.org/T397446)
[12:35:09] <wikibugs>	 (approved) dcaro: projects: added project tools-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/250 (https://phabricator.wikimedia.org/T397446) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[12:35:12] <wikibugs>	 (merge) dcaro: projects: added project tools-logging [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/250 (https://phabricator.wikimedia.org/T397446) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[12:35:16] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools-logging END (ERROR) - Cookbook wmcs.vps.create_project (exit_code=97) for project tools-logging in eqiad1 (T397446)
[12:35:17] <stashbot>	 dcaro@cloudcumin1001: Unknown project "tools-logging"
[12:35:21] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools-logging START - Cookbook wmcs.vps.create_project for project tools-logging in eqiad1 (T397446)
[12:35:21] <stashbot>	 dcaro@cloudcumin1001: Unknown project "tools-logging"
[12:38:13] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools-logging END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project tools-logging in eqiad1 (T397446)
[12:38:16] <stashbot>	 T397446: Request creation of tools-logging VPS project - https://phabricator.wikimedia.org/T397446
[12:43:09] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools-logging START - Cookbook wmcs.vps.create_project for project tools-logging in eqiad1 (T397446)
[12:46:22] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools-logging END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project tools-logging in eqiad1 (T397446)
[12:46:26] <stashbot>	 T397446: Request creation of tools-logging VPS project - https://phabricator.wikimedia.org/T397446
[12:47:31] <wikibugs>	 Cloud-VPS (Project-requests), Patch-For-Review: Request creation of tools-logging VPS project - https://phabricator.wikimedia.org/T397446#10931996 (dcaro) Open→Resolved p:Triage→High
[12:48:50] <wikibugs>	 (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/map-of-monuments] - https://gerrit.wikimedia.org/r/1161499 (owner: L10n-bot)
[12:58:57] <wmcs-alerts>	 FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:00:12] <icinga-wm>	 PROBLEM - toolschecker: NFS read/writeable on labs instances on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 504 Gateway Time-out - string OK not found on http://checker.tools.wmflabs.org:80/nfs/home - 324 bytes in 60.012 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[13:03:28] <wmcs-alerts>	 FIRING: InstanceDown: Project tools instance tools-nfs-2 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:03:31] <wmcs-alerts>	 FIRING: ToolsNFSDown: No tools nfs services running found - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsNFSDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsNFSDown
[13:03:57] <wmcs-alerts>	 FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4)  - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:04:25] <wikibugs>	 cloud-services-team, Toolforge: Cannot log into Toolforge - https://phabricator.wikimedia.org/T397451 (MBH) NEW
[13:13:28] <icinga-wm>	 RECOVERY - toolschecker: NFS read/writeable on labs instances on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 50.585 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[13:13:28] <wmcs-alerts>	 RESOLVED: InstanceDown: Project tools instance tools-nfs-2 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:14:58] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for all NFS workers
[13:18:22] <wmcs-alerts>	 FIRING: MaintainKubeusersDown: maintain-kubeusers is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersDown
[13:18:31] <wmcs-alerts>	 RESOLVED: ToolsNFSDown: No tools nfs services running found - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsNFSDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsNFSDown
[13:18:57] <wmcs-alerts>	 FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4)  - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:21:31] <wmcs-alerts>	 FIRING: ToolsToolsDBWritableState: There should be exactly one writable MariaDB instance instead of 0 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsToolsDBWritableState  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBWritableState
[13:23:57] <wmcs-alerts>	 RESOLVED: ProbeDown: Service tools-k8s-haproxy-6:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-6:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:26:31] <wmcs-alerts>	 RESOLVED: ToolsToolsDBWritableState: There should be exactly one writable MariaDB instance instead of 0 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsToolsDBWritableState  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBWritableState
[13:42:38] <wikibugs>	 06cloud-services-team, 10Toolforge: Cannot log into Toolforge - https://phabricator.wikimedia.org/T397451#10932253 (10taavi) 05Open→03Resolved a:03taavi This was caused by an outage of the Toolforge NFS server that we believe is now fixed.
[13:47:04] <wmcs-alerts>	 FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[13:47:07] <wikibugs>	 06cloud-services-team, 10Toolforge: [toolsbeta,tofu,infra] There's some discrepancy between the volumes in toolsbeta and tofu - https://phabricator.wikimedia.org/T396276#10932270 (10taavi) 05Open→03Resolved a:03dcaro Resolved with https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-...
[13:50:04] <wikibugs>	 06cloud-services-team, 10Toolforge: `become` command not working properly on login-buster.toolforge.org - https://phabricator.wikimedia.org/T391538#10932282 (10dcaro) 05Open→03Resolved @Ykhwong this was caused by a wider outage in toolforge, should be working again, please reopen if you still face issues.
[13:52:24] <wikibugs>	 06cloud-services-team, 10Toolforge: `become` command not working properly on login-buster.toolforge.org - https://phabricator.wikimedia.org/T391538#10932291 (10Ykhwong) 05Resolved→03Open Thanks for the update. However, I'm still experiencing the issue. When I run the become command on login-buster.toolforg...
[13:54:42] <wikibugs>	 06cloud-services-team, 10Toolforge: `become` command not working properly on login-buster.toolforge.org - https://phabricator.wikimedia.org/T391538#10932299 (10dcaro) Yep, it seems it's still hanging (note that it does not happen with all tools, `wm-lol` did work, but `tedbot` does not), I'll reboot 👍
[13:55:52] <wmcs-alerts>	 RESOLVED: MaintainKubeusersDown: maintain-kubeusers is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersDown
[13:56:28] <wmcs-alerts>	 FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-74 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:57:27] <wikibugs>	 06cloud-services-team, 10Toolforge: `become` command not working properly on login-buster.toolforge.org - https://phabricator.wikimedia.org/T391538#10932327 (10Ykhwong) 05Open→03Resolved Thanks for the reboot — the issue seems to be resolved now. become is working properly again on login-buster.toolfor...
[13:58:45] <wikibugs>	 06cloud-services-team, 10Toolforge: `become` command not working properly on login-buster.toolforge.org - https://phabricator.wikimedia.org/T391538#10932340 (10dcaro) @Ykhwong awesome :), may I ask why you are using the old buster bastion and not the newer one? (so we can provide whatever is missing for yo...
[14:00:25] <wikibugs>	 10cloud-services-team (FY2024/2025-Q3-Q4), 10Cloud-VPS (Debian Buster Deprecation), 10Toolforge (Toolforge iteration 21), 07Epic, 05Goal: [infra] Toolforge: migrate to Debian Bullseye or later - https://phabricator.wikimedia.org/T311897#10932342 (10dcaro)
[14:01:28] <wmcs-alerts>	 RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-74 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:02:34] <wikibugs>	 06cloud-services-team, 10Toolforge: `become` command not working properly on login-buster.toolforge.org - https://phabricator.wikimedia.org/T391538#10932345 (10Ykhwong) Oh, I didn't realize I was still using the Buster bastion. Thanks for letting me know. I'll check out the migration guide and start transi...
[14:06:21] <wikibugs>	 06cloud-services-team, 10Toolforge: Lock down tools-sgebastion-10 (login-buster.toolforge.org) to only members of tools with known dependencies on it - https://phabricator.wikimedia.org/T397459 (10taavi) 03NEW
[14:07:08] <wikibugs>	 06cloud-services-team, 10Toolforge: Lock down tools-sgebastion-10 (login-buster.toolforge.org) to only members of tools with known dependencies on it - https://phabricator.wikimedia.org/T397459#10932360 (10taavi)
[14:09:57] <wmcs-alerts>	 FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:10:20] <wikibugs>	 06cloud-services-team, 10Toolforge: [toolsdb] Revisit WritableState alert - https://phabricator.wikimedia.org/T397460 (10fnegri) 03NEW
[14:10:59] <wikibugs>	 06cloud-services-team, 10Toolforge: [toolsdb] Revisit WritableState alert - https://phabricator.wikimedia.org/T397460#10932377 (10fnegri) p:05Triage→03Medium
[14:19:57] <wmcs-alerts>	 RESOLVED: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:55:32] <wikibugs>	 (03update) 10dcaro: deploy_task: store error when build fails [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/92
[15:25:28] <wmcs-alerts>	 FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-5 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:29:21] <wikibugs>	 (03update) 10dcaro: deploy_task: store error when build fails [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/92
[15:30:28] <wmcs-alerts>	 RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-5 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:50:03] <wikibugs>	 (03update) 10dcaro: deploy_task: store error when build fails [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/92
[15:58:15] <wikibugs>	 (03update) 10dcaro: deploy_task: store error when build fails [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/92
[16:02:57] <wmcs-alerts>	 FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:04:23] <wikibugs>	 (03update) 10chuckonwumelu: show: Display latest deployment if no deploy_id included [repos/cloud/toolforge/components-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/36 (https://phabricator.wikimedia.org/T394994)
[16:07:57] <wmcs-alerts>	 FIRING: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:10:28] <wmcs-alerts>	 FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-39 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:12:56] <wikibugs>	 (03update) 10dcaro: deploy_task: store error when build fails [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/92
[16:15:28] <wmcs-alerts>	 RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-39 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:17:57] <wmcs-alerts>	 RESOLVED: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:18:47] <wikibugs>	 (03update) 10dcaro: deploy_task: store error when build fails [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/92
[16:18:58] <wmcs-alerts>	 FIRING: [2x] InstanceDown: Project tools instance tools-k8s-worker-nfs-38 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:21:11] <wikibugs>	 (03approved) 10fnegri: show: Display latest deployment if no deploy_id included [repos/cloud/toolforge/components-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/36 (https://phabricator.wikimedia.org/T394994) (owner: 10chuckonwumelu)
[16:23:58] <wmcs-alerts>	 RESOLVED: [2x] InstanceDown: Project tools instance tools-k8s-worker-nfs-38 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:26:47] <wikibugs>	 10Toolforge (Toolforge iteration 21): [components-api] Add all missing options for scheduled components - https://phabricator.wikimedia.org/T395071#10932707 (10dcaro) a:03dcaro
[16:26:57] <wikibugs>	 10Toolforge (Toolforge iteration 21): [components-api] add all the missing options for continuous components - https://phabricator.wikimedia.org/T395070#10932710 (10dcaro) a:05Raymond_Ndibe→03dcaro
[16:39:44] <wikibugs>	 (03update) 10chuckonwumelu: show: Display latest deployment if no deploy_id included [repos/cloud/toolforge/components-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/36 (https://phabricator.wikimedia.org/T394994)
[16:58:31] <wikibugs>	 (03open) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93
[16:58:47] <wikibugs>	 10Toolforge (Toolforge iteration 21): [components-api] add all the missing options for continuous components - https://phabricator.wikimedia.org/T395070#10932775 (10dcaro) 05Open→03In progress
[16:59:01] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[17:00:57] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[17:05:34] <jinxer-wm>	 FIRING: DiskSpace: Disk space cloudbackup1004:9100:/srv 5.995% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:15:17] <wikibugs>	 (03approved) 10dcaro: health_check: default to 'type' but support 'health_check_type' [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/107 (https://phabricator.wikimedia.org/T396210)
[17:16:30] <wikibugs>	 (03update) 10dcaro: health_check: default to 'type' but support 'health_check_type' [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/107 (https://phabricator.wikimedia.org/T396210)
[17:16:34] <wikibugs>	 (03merge) 10dcaro: health_check: default to 'type' but support 'health_check_type' [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/107 (https://phabricator.wikimedia.org/T396210)
[17:17:13] <wikibugs>	 (03open) 10dcaro: d/changelog: bump to 16.1.14 [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/108 (https://phabricator.wikimedia.org/T396210)
[17:17:38] <wikibugs>	 (03update) 10dcaro: d/changelog: bump to 16.1.14 [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/108 (https://phabricator.wikimedia.org/T396210)
[17:18:12] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-cli
[17:19:11] <wikibugs>	 06cloud-services-team, 10Striker: Striker should use ID instead of username to identify SUL accounts - https://phabricator.wikimedia.org/T359428#10932856 (10Arendpieter) @taavi What’s still left to do on this issue? All the pull requests are merged. I’m looking for an interesting Python issue to work on 😉
[17:23:15] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[17:23:30] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-cli
[17:23:37] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-cli
[17:28:13] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-cli
[17:32:57] <wmcs-alerts>	 FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[17:33:48] <wikibugs>	 (03approved) 10dcaro: d/changelog: bump to 16.1.14 [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/108 (https://phabricator.wikimedia.org/T396210)
[17:33:55] <wikibugs>	 (03merge) 10dcaro: d/changelog: bump to 16.1.14 [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/108 (https://phabricator.wikimedia.org/T396210)
[17:36:21] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[17:37:06] <wikibugs>	 (03approved) 10dcaro: health-check: return `type` by default [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/174 (https://phabricator.wikimedia.org/T396210)
[17:37:09] <wikibugs>	 (03merge) 10dcaro: health-check: return `type` by default [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/174 (https://phabricator.wikimedia.org/T396210)
[17:37:57] <wmcs-alerts>	 RESOLVED: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[17:40:55] <wikibugs>	 (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-api: bump to 0.0.381-20250619173722-eab6c9fe [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/819 (https://phabricator.wikimedia.org/T396210)
[17:41:03] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[17:45:55] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[17:48:15] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[17:49:06] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[17:55:28] <wmcs-alerts>	 FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-11 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[17:57:14] <logmsgbot_cloud>	 !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[17:57:34] <wmcs-alerts>	 RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-27 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[17:58:09] <wikibugs>	 (03approved) 10dcaro: jobs-api: bump to 0.0.381-20250619173722-eab6c9fe [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/819 (https://phabricator.wikimedia.org/T396210) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[17:58:12] <wikibugs>	 (03merge) 10dcaro: jobs-api: bump to 0.0.381-20250619173722-eab6c9fe [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/819 (https://phabricator.wikimedia.org/T396210) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[17:59:32] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[18:00:28] <wmcs-alerts>	 RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-11 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[18:02:03] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[18:04:32] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for all NFS workers
[18:15:34] <jinxer-wm>	 RESOLVED: DiskSpace: Disk space cloudbackup1004:9100:/srv 5.991% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[18:23:28] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395071)
[18:25:53] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395071)
[18:27:56] <wikibugs>	 (03update) 10chuckonwumelu: show: Display latest deployment if no deploy_id included [repos/cloud/toolforge/components-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/36 (https://phabricator.wikimedia.org/T394994)
[18:28:56] <wikibugs>	 (03merge) 10chuckonwumelu: GET the latest deployment for a particular tool [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/87 (https://phabricator.wikimedia.org/T394990)
[18:29:23] <wikibugs>	 (03merge) 10chuckonwumelu: show: Display latest deployment if no deploy_id included [repos/cloud/toolforge/components-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/36 (https://phabricator.wikimedia.org/T394994)
[18:31:20] <wikibugs>	 (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: components-api: bump to 0.0.120-20250619182909-09ea62ae [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/820 (https://phabricator.wikimedia.org/T394990)
[18:35:54] <wikibugs>	 (03update) 10dcaro: deploy: add all the missing options for continuous job [repos/cloud/toolforge/components-api] (generate_config) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T395070)
[18:36:40] <wikibugs>	 (03open) 10dcaro: scheduled: add scheduled component support [repos/cloud/toolforge/components-api] (add_all_continuous_options) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/94 (https://phabricator.wikimedia.org/T395071)
[18:36:57] <wikibugs>	 (03update) 10dcaro: scheduled: add scheduled component support [repos/cloud/toolforge/components-api] (add_all_continuous_options) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/94 (https://phabricator.wikimedia.org/T395071)
[18:43:09] <logmsgbot_cloud>	 !log chuckonwumelu@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component components-api
[18:46:41] <logmsgbot_cloud>	 !log chuckonwumelu@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component components-api
[20:25:28] <wmcs-alerts>	 FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance cvn-app10 in project cvn   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun