[00:59:46] Tool-extloc, Release-Engineering-Team (Yak Shaving 🐃🪒): A tool for quickly answering what groups an extension is deployed to - https://phabricator.wikimedia.org/T296050#9824087 (brennen)
[01:01:58] Tool-extloc, Release-Engineering-Team (Yakisfaction): extloc: Save historical data - https://phabricator.wikimedia.org/T365664 (brennen) NEW
[01:03:35] Tool-extloc, Release-Engineering-Team (Yak Shaving 🐃🪒): extloc: Move to Toolforge Build Service - https://phabricator.wikimedia.org/T365665 (brennen) NEW
[01:23:26] (update) raymond-ndibe: [jobs-api] add messages to all responses [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/85 (https://phabricator.wikimedia.org/T356974)
[01:23:56] (open) raymond-ndibe: [jobs-cli] add messages to all responses [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/32 (https://phabricator.wikimedia.org/T356974)
[02:46:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[04:18:23] FIRING: OOM: OOM killer active on cloudcontrol2006-dev:9100 - TODO - https://grafana.wikimedia.org/d/-OcleDKIz/oom-kill - https://alerts.wikimedia.org/?q=alertname%3DOOM
[04:23:23] RESOLVED: OOM: OOM killer active on cloudcontrol2006-dev:9100 - TODO - https://grafana.wikimedia.org/d/-OcleDKIz/oom-kill - https://alerts.wikimedia.org/?q=alertname%3DOOM
[06:39:11] Striker: Add Bitu container to Striker development environment - https://phabricator.wikimedia.org/T362318#9824272 (SLyngshede-WMF) Open→In progress Docker image is now available: https://docker-registry.wikimedia.org/wikimedia/operations-software-bitu/tags/ Example for configuration: https://gerri...
[06:46:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:29:48] (update) sstefanova: Draft: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93
[07:30:06] (update) sstefanova: Draft: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93
[09:00:04] cloud-services-team, Toolforge: toolforge: kubernetes can't revoke certificates - https://phabricator.wikimedia.org/T365681 (aborrero) NEW
[09:16:52] cloud-services-team, Toolforge: toolforge: kubernetes can't revoke certificates - https://phabricator.wikimedia.org/T365681#9824736 (aborrero)
[09:17:38] cloud-services-team, Toolforge: toolforge: kubernetes can't revoke certificates - https://phabricator.wikimedia.org/T365681#9824738 (aborrero)
[09:17:39] Toolforge (Toolforge iteration 10): [toolforge] Investigate authentication - https://phabricator.wikimedia.org/T363983#9824739 (aborrero)
[10:03:14] (update) sstefanova: Draft: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93
[10:27:45] cloud-services-team, Toolforge: Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#9825009 (fnegri) > [2024-05-22 13:39:42,965: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection... Where can I find this log line?...
[10:29:42] (update) sstefanova: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93
[10:29:56] (update) sstefanova: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93
[10:30:31] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 10): Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#9825010 (fnegri) Open→In progress p:Triage→Medium a:fnegri
[10:31:57] (update) dcaro: cli: webservice logs -f: Don't spam user w/ stack trace on control-C [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/39 (https://phabricator.wikimedia.org/T361437) (owner: dancy)
[10:31:59] (update) dcaro: cli: webservice logs -f: Don't spam user w/ stack trace on control-C [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/39 (https://phabricator.wikimedia.org/T361437) (owner: dancy)
[10:35:14] (update) sstefanova: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93
[10:35:18] Toolforge (Toolforge iteration 10): [webservice-cli] `webservice logs -f` should expect KeyboardInterrupt - https://phabricator.wikimedia.org/T361437#9825023 (dcaro) Open→In progress p:Triage→Medium a:dancy
[10:36:51] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 10): Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#9825043 (Pintoch) @fnegri I access the logs like this: ` become editgroups kubectl get pods # find the pod id that starts with `edit...
[10:41:12] (update) sstefanova: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T363808)
[10:41:46] (update) sstefanova: prefix endpoints with /tool/{toolname}/ [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/93 (https://phabricator.wikimedia.org/T363808)
[10:43:14] Toolforge (Toolforge iteration 10): [builds-api] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363808#9825056 (Slst2020)
[10:43:59] (approved) dcaro: cli: webservice logs -f: Don't spam user w/ stack trace on control-C [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/39 (https://phabricator.wikimedia.org/T361437) (owner: dancy)
[10:44:26] (merge) dcaro: cli: webservice logs -f: Don't spam user w/ stack trace on control-C [repos/cloud/toolforge/tools-webservice] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/39 (https://phabricator.wikimedia.org/T361437) (owner: dancy)
[10:46:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:10:20] (update) aborrero: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T364312)
[11:47:36] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 10): Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#9825176 (fnegri) > the others such as "Worker exited prematurely" I think these errors are caused by Kubernetes killing the process...
[12:03:48] (update) aborrero: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T364312)
[12:11:24] Toolforge (Toolforge iteration 10): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9825237 (MBH) @dcaro I tried to use another connection method, provided by @Iluvatar , looking simpler. It uses MS PowerShell: `PS C:\Windows\system32> Get...
[12:11:26] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Investigate how to run OpenTofu to manage Cloud VPS admin-only resources - https://phabricator.wikimedia.org/T365696 (taavi) NEW
[12:21:49] Toolforge (Toolforge iteration 10): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9825289 (MBH) >>! In T360839#9666070, @dcaro wrote: > You are using the default port there, that's the issue, so you have two options: > * Configure a diff...
[12:24:57] (approved) dcaro: [maintain-kubeusers] increment default services quota [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/25 (https://phabricator.wikimedia.org/T362520) (owner: raymond-ndibe)
[12:24:59] (update) dcaro: [maintain-kubeusers] increment default services quota [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/25 (https://phabricator.wikimedia.org/T362520) (owner: raymond-ndibe)
[12:27:09] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/7
[12:28:15] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1035384 (owner: L10n-bot)
[12:37:01] (update) dcaro: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758) (owner: raymond-ndibe)
[13:02:39] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:04:31] Toolforge (Toolforge iteration 10): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9825517 (MBH) After re-converting .ppk key, I successfully logged into Toolforge in Powershell window: `PS C:\Windows\system32> ssh -L 4711:ruwiki.web.db.s...
[13:07:39] RESOLVED: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:12:17] (update) dcaro: [envvars-api] add messages to all responses [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/25 (https://phabricator.wikimedia.org/T356974) (owner: raymond-ndibe)
[13:13:14] !log sstefanova@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api
[13:13:26] !log sstefanova@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api
[13:17:08] cloud-services-team, Cloud-VPS, DC-Ops, ops-eqiad, and 2 others: Degraded RAID on cloudcephosd1031 - https://phabricator.wikimedia.org/T364060#9825583 (dcaro) ` root@cloudcephosd1031:~# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md0 :...
[13:21:15] (update) aborrero: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T364312)
[13:21:54] !log sstefanova@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api
[13:22:07] !log sstefanova@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api
[13:22:52] cloud-services-team, Cloud-VPS, DC-Ops, ops-eqiad, and 2 others: Degraded RAID on cloudcephosd1031 - https://phabricator.wikimedia.org/T364060#9825627 (dcaro) The support assist logs are on google drive https://drive.google.com/file/d/1tS2cy8EF5AgsTLpdK2r8dTR0YQ06ntIZ/view?usp=drive_link (phabri...
[13:25:50] (update) sstefanova: builds-api: bump to 0.0.144-20240521144209-4947025a [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/283 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[13:25:57] (merge) sstefanova: builds-api: bump to 0.0.144-20240521144209-4947025a [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/283 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[13:33:02] Toolforge (Toolforge iteration 10): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9825701 (MBH) Stalled→Resolved After explicitly indicating ports (4711 and 4712) in connection strings (before that, port wasn't defined in CS)...
[13:37:36] Cloud-VPS, Quarry: [bug] Lot of queries stuck in queued state for hours and days - https://phabricator.wikimedia.org/T365136#9825723 (Oudedutchman) @SD0001 I see the pull-request https://github.com/toolforge/quarry/pull/42 which addresses this, but is it really enough to fix the problem?
[13:48:24] vivian-rook opened https://github.com/toolforge/quarry/pull/43
[13:51:20] vivian-rook closed https://github.com/toolforge/quarry/pull/43
[13:52:00] (open) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[13:53:04] (update) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[13:53:35] (update) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[13:56:58] (update) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[13:57:08] (update) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[13:58:09] (approved) sstefanova: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39 (owner: dcaro)
[13:58:14] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Investigate how to run OpenTofu to manage Cloud VPS admin-only resources - https://phabricator.wikimedia.org/T365696#9825767 (fnegri) I think we could do both: being able to manually run Terraform from a shared server is nice, and I would want to have i...
[14:10:32] (open) dcaro: create: add nicer error when quota is reached [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/29
[14:13:47] (update) dcaro: [envvars-api] add messages to all responses [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/25 (https://phabricator.wikimedia.org/T356974) (owner: raymond-ndibe)
[14:13:48] (approved) dcaro: [envvars-api] add messages to all responses [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/25 (https://phabricator.wikimedia.org/T356974) (owner: raymond-ndibe)
[14:14:17] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Replace deployment-cumin with Bullseye or Bookworm host - https://phabricator.wikimedia.org/T361380#9825821 (elukey) Upgrade the VM with `dist-upgrade.sh` and rebooted. Do we need to do anything to update the OS...
[14:15:36] (approved) dcaro: [envvars-cli] add messages to all responses [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/38 (https://phabricator.wikimedia.org/T356974) (owner: raymond-ndibe)
[14:15:38] (update) dcaro: [envvars-cli] add messages to all responses [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/38 (https://phabricator.wikimedia.org/T356974) (owner: raymond-ndibe)
[14:16:12] (update) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[14:16:13] (approved) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[14:16:16] (merge) dcaro: fix typo [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/39
[14:19:29] (update) dcaro: [jobs-cli] support services in jobs [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/18 (https://phabricator.wikimedia.org/T348758) (owner: raymond-ndibe)
[14:23:37] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services: [wikireplicas] Update Admin docs - https://phabricator.wikimedia.org/T365717 (fnegri) NEW
[14:23:41] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services: [wikireplicas] Update Admin docs - https://phabricator.wikimedia.org/T365717#9825874 (fnegri) p:Triage→Medium
[14:24:42] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services: [wikireplicas] Update Admin docs - https://phabricator.wikimedia.org/T365717#9825876 (fnegri) Open→In progress
[14:27:13] (update) dcaro: create: add nicer error when quota is reached [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/29
[14:34:53] PAWS: jupyterlab to 4.2.1 - https://phabricator.wikimedia.org/T365719 (rook) NEW
[14:36:08] PAWS: jupyterlab to 4.2.1 - https://phabricator.wikimedia.org/T365719#9825982 (github-toolforge-bot) vivian-rook opened https://github.com/toolforge/paws/pull/415
[14:36:10] cloud-services-team, DC-Ops, ops-eqiad, SRE: cloudvirt1041: can't boot after reimage - https://phabricator.wikimedia.org/T364984#9825953 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host cloudvirt1041.eqiad.wmnet with OS bookworm
[14:36:15] vivian-rook opened https://github.com/toolforge/paws/pull/415
[14:46:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:50:20] cloud-services-team, Patch-For-Review: PuppetFailure - https://phabricator.wikimedia.org/T365640#9826035 (dcaro) Open→In progress p:Triage→Medium a:dcaro
[14:52:24] (update) dancy: README.md: Clarify what command this repo implements [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/70
[14:52:58] (update) dancy: README.md: Clarify what command this repo implements [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/70
[14:58:50] (CR) BryanDavis: [C:-2] "I have been thinking about my reaction here and have decided that waiting for a clear future Redis replacement is being too conservative." [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/1012797 (https://phabricator.wikimedia.org/T360378) (owner: BryanDavis)
[14:58:57] (CR) BryanDavis: Add redis image [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/1012797 (https://phabricator.wikimedia.org/T360378) (owner: BryanDavis)
[15:02:48] Toolforge (Software install/update), Patch-For-Review: Provide a Redis container for use within a tool's namespace - https://phabricator.wikimedia.org/T360378#9826065 (bd808) Stalled→In progress >>! In T360378#9649924, @bd808 wrote: > I have put a -2 lock on my https://gerrit.wikimedia.org/r/c/op...
[15:13:18] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826104 (Slst2020)
[15:16:11] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826109 (aborrero) I would like to participate on the upgrades. I don't have any strong opinion on the different options at the moment.
[15:19:53] Toolforge: [infra] Allow users to self-rotate the all the credentials - https://phabricator.wikimedia.org/T365724 (dcaro) NEW
[15:20:54] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826156 (Slst2020)
[15:21:27] Toolforge: [infra] Allow users to self-rotate the all the credentials - https://phabricator.wikimedia.org/T365724#9826152 (dcaro)
[15:22:18] PAWS: Update chart version on PR? - https://phabricator.wikimedia.org/T365725 (rook) NEW
[15:23:30] Toolforge: [infra] Allow users to self-rotate the all the credentials - https://phabricator.wikimedia.org/T365724#9826165 (aborrero)
[15:23:53] Toolforge: [infra] Allow users to self-rotate the all the credentials - https://phabricator.wikimedia.org/T365724#9826178 (aborrero)
[15:24:17] Toolforge: Toolforge Aptfile not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T365633#9826199 (dcaro) This feels more like a version mismatch than a library not found :/, will take a look, the dependency resolution of the apt buildpack is seldom limited.
[15:26:02] Toolforge: [infra] Allow users to self-rotate the all the credentials - https://phabricator.wikimedia.org/T365724#9826227 (dcaro)
[15:37:05] (PS2) BryanDavis: wikitech: Add dummy GitLab API token [labs/private] - https://gerrit.wikimedia.org/r/1034533 (https://phabricator.wikimedia.org/T316418)
[15:37:48] (CR) EoghanGaffney: [C:+1] wikitech: Add dummy GitLab API token [labs/private] - https://gerrit.wikimedia.org/r/1034533 (https://phabricator.wikimedia.org/T316418) (owner: BryanDavis)
[15:38:12] (CR) EoghanGaffney: [C:+2] wikitech: Add dummy GitLab API token [labs/private] - https://gerrit.wikimedia.org/r/1034533 (https://phabricator.wikimedia.org/T316418) (owner: BryanDavis)
[15:38:17] (PS3) David Caro: wikitech: Add dummy GitLab API token [labs/private] - https://gerrit.wikimedia.org/r/1034533 (https://phabricator.wikimedia.org/T316418) (owner: BryanDavis)
[15:39:22] (CR) EoghanGaffney: [C:+2] wikitech: Add dummy GitLab API token [labs/private] - https://gerrit.wikimedia.org/r/1034533 (https://phabricator.wikimedia.org/T316418) (owner: BryanDavis)
[15:39:23] (CR) EoghanGaffney: [V:+2 C:+2] wikitech: Add dummy GitLab API token [labs/private] - https://gerrit.wikimedia.org/r/1034533 (https://phabricator.wikimedia.org/T316418) (owner: BryanDavis)
[15:43:12] PAWS: jupyterlab to 4.2.1 - https://phabricator.wikimedia.org/T365719#9826351 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/415
[15:43:15] PAWS: jupyterlab to 4.2.1 - https://phabricator.wikimedia.org/T365719#9826352 (rook) Open→Resolved
[15:43:18] vivian-rook closed https://github.com/toolforge/paws/pull/415
[15:47:24] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Data-Persistence: [wikireplicas] Update Admin docs - https://phabricator.wikimedia.org/T365717#9826369 (fnegri)
[15:55:21] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826413 (dcaro)
[15:55:52] cloud-services-team, DC-Ops, ops-eqiad, SRE: cloudvirt1041: can't boot after reimage - https://phabricator.wikimedia.org/T364984#9826417 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host cloudvirt1041.eqiad.wmnet with OS bookworm executed with errors:...
[15:56:33] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826430 (fnegri) Option 2 (including the long-term workgroup) looks fine to me. Maybe I would add that "monthly" is the target, but some difficult upgrades might need more time. We could pub...
[16:00:10] Cloud-VPS (Quota-requests), Content-Transform-Team-WIP: Increase storage for parsoid visualdiff testing - https://phabricator.wikimedia.org/T365733 (Jgiannelos) NEW
[16:00:49] cloud-services-team, DC-Ops, ops-eqiad, SRE: cloudvirt1041: can't boot after reimage - https://phabricator.wikimedia.org/T364984#9826440 (Jclark-ctr) @aborrero I am stuck right now i did attempt to reimage with no luck. Unsure what version of grub we have installed but looks like the same as thi...
[16:03:56] (update) aborrero: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T364312)
[16:12:37] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826506 (dcaro) >>! In T363683#9826430, @fnegri wrote: > Option 2 (including the long-term workgroup) looks fine to me. > > Maybe I would add that "monthly" is the target, but some difficult...
[16:13:28] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826512 (dcaro)
[16:13:53] cloud-services-team, Cloud-VPS, DC-Ops, ops-eqiad, and 2 others: Degraded RAID on cloudcephosd1031 - https://phabricator.wikimedia.org/T364060#9826509 (Jclark-ctr) a:Jclark-ctr You have successfully submitted request SR191070960. Ordered replacement drive. will update when arrives
[16:22:47] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9826561 (fnegri) > I think though that we can keep track of that on the tasks The task is good to discuss the details, but I see a value in having a high-level wiki page with the list of up...
[18:46:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:17:12] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/7 (owner: l10n-bot)
[19:17:16] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/7 (owner: l10n-bot)
[20:24:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudcumin2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[20:44:48] FIRING: [2x] PuppetZeroResources: Puppet has failed generate resources on cloudcumin1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[22:46:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:11:09] Cloud-VPS (Quota-requests), Wikispore: Floating IP for Wikispore - https://phabricator.wikimedia.org/T365641#9828112 (Pharos) The new site will be a mapping portal for free knowledge content related to the city from Wikipedia/Wikidata/other sister projects, as well as OpenStreetMap and Wikispore. Some b...