[00:35:14] (update) raymond-ndibe: [jobs-cli] refactor job payload [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/98 (https://phabricator.wikimedia.org/T389118 https://phabricator.wikimedia.org/T390136)
[01:15:45] (update) raymond-ndibe: [jobs-cli] refactor job payload [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/98 (https://phabricator.wikimedia.org/T389118 https://phabricator.wikimedia.org/T390136)
[03:02:59] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[04:32:53] (update) raymond-ndibe: values.yaml: add image variant name to aliases [repos/cloud/toolforge/image-config] (replace_job_with_webservice_image_variants) - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/21 (https://phabricator.wikimedia.org/T415322)
[05:25:34] (update) raymond-ndibe: jobs-api: test for proper handling of the diff variations of the --image argument [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1113 (https://phabricator.wikimedia.org/T414978 https://phabricator.wikimedia.org/T415322)
[05:50:36] (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/watch-translations] - https://gerrit.wikimedia.org/r/1270431 (owner: L10n-bot)
[05:50:45] (CR) Abijeet Patro: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/commons-mass-description] - https://gerrit.wikimedia.org/r/1270427 (owner: L10n-bot)
[06:00:43] (update) raymond-ndibe: replace job images with web images [repos/cloud/toolforge/jobs-api] (add_tests_for_image_variant_matching) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/263 (https://phabricator.wikimedia.org/T415322)
[06:21:03] Cloud-VPS (Project-requests), Test Platform: Request creation of testplatform VPS project - https://phabricator.wikimedia.org/T423226 (Peter) NEW
[06:25:17] (update) raymond-ndibe: jobs-api: use webservice image variants in one-off job tests [repos/cloud/toolforge/toolforge-deploy] (test_for_image_argument_handling_in_jobs) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1115 (https://phabricator.wikimedia.org/T415322)
[06:41:25] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[06:42:25] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[06:43:49] Toolforge, tools-platform-team: [Toolforge]: User Activity (based on successful SSH login) - https://phabricator.wikimedia.org/T423228 (komla) NEW
[06:44:11] Toolforge, tools-platform-team: [Toolforge]: User Activity (based on successful SSH logins) - https://phabricator.wikimedia.org/T423228#11817792 (komla)
[07:02:59] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[07:04:16] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move trove DB instances to rabbitmq transient quorum queues - https://phabricator.wikimedia.org/T421857#11817806 (fgiunchedi) Done in codfw too. I left the security groups in place also in light of T422801
[07:04:18] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move trove DB instances to rabbitmq transient quorum queues - https://phabricator.wikimedia.org/T421857#11817808 (fgiunchedi) Open→Resolved a:fgiunchedi
[07:09:30] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move all openstack rabbitmq queues to quorum - https://phabricator.wikimedia.org/T421054#11817824 (fgiunchedi) Open→Resolved a:fgiunchedi This is done in eqiad and codfw ` root@cloudrabbit1002:~# rabbitmqctl list_queues name type durabl...
[07:13:06] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Increased openstack latency and rabbitmq rolling restarts on certificate update - https://phabricator.wikimedia.org/T418444#11817838 (fgiunchedi) Open→Resolved a:fgiunchedi This is done, rabbit/openstack are now able to survive a rabbit...
[07:41:10] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[08:10:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[08:38:26] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Designate API timing out - https://phabricator.wikimedia.org/T422646#11818211 (fgiunchedi) The two failures I can identify are: oslo.messaging not failing over (T422820) and tooz lamenting memcached unavailable. I'm going to rename this task to address...
[08:40:27] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: memcache is a SPOF for designate/tooz coordination - https://phabricator.wikimedia.org/T422646#11818230 (fgiunchedi)
[08:40:55] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[08:41:10] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[09:36:10] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[09:37:25] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[09:45:49] Tool-schedule-deployment: ScheduleDeploymentBot should escape wikitext in commit message ({{deploy}} |title= parameter) - https://phabricator.wikimedia.org/T423124#11818467 (Lucas_Werkmeister_WMDE) >>! In T423124#11817359, @bd808 wrote: > Thinking more about deliberate template use for Deployment planning, I...
[09:46:23] cloud-services-team, Toolforge: Audit tools memory requests vs actual usage - https://phabricator.wikimedia.org/T420565#11818469 (fgiunchedi) Thinking about this problem a little more: we would be lowering the default memory request, while leaving limit untouched, therefore I think it should be safe to d...
[10:07:49] Toolforge, Product Safety and Integrity, video2commons, 2026-user-javascript-incident, ContentSecurityPolicy: Request for CSP allowlist to wss://video2commons-socketio.toolforge.org from video2commons.toolforge.org - https://phabricator.wikimedia.org/T423037#11818544 (A_smart_kitten) (ret...
[11:02:59] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[11:05:38] Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: [wikireplicas] Update grants for "maintainviews" user - https://phabricator.wikimedia.org/T422806#11818710 (fnegri) These grants are present on all clouddb hosts but are not listed in [wiki-replicas.sql](https://phabricator.w...
[11:09:19] Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: [wikireplicas] Update grants for "maintainviews" user - https://phabricator.wikimedia.org/T422806#11818724 (fnegri) Two of the grants are actually needed to create views. This seems to be enough: `lang=mysql GRANT SET USER,...
[11:17:17] Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: [wikireplicas] Update grants for "maintainviews" user - https://phabricator.wikimedia.org/T422806#11818738 (fnegri) Added in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1270891
[11:22:41] (merge) vriaa: feat: add descriptions to viewport menu breakpoint items [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/53
[11:29:00] (merge) vriaa: fix: use :deep() to style CdxMenu elements after Codex update [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/52
[11:29:44] (merge) vriaa: fix: improve image URL tooltip with clearer instructions [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/51
[11:30:58] (merge) vriaa: feat: add fixed CSS support to generated banner code [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/49 (https://phabricator.wikimedia.org/T420950)
[11:31:16] (update) fnegri: Catch SQL errors [repos/cloud/wikireplicas-utils] - https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/12 (https://phabricator.wikimedia.org/T351637)
[11:31:21] (update) fnegri: Catch SQL errors [repos/cloud/wikireplicas-utils] - https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/12 (https://phabricator.wikimedia.org/T351637)
[11:32:04] Tool-centralnotice-banner-editor: Implement fixed CSS feature for default uneditable styles in banners - https://phabricator.wikimedia.org/T420950#11818787 (Oyelola_Victoria) In progress→Resolved
[11:49:15] cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, tools-platform-team, Patch-For-Review: [wikireplicas] add proper dry-run/diff mode to maintain-views - https://phabricator.wikimedia.org/T351637#11818864 (fnegri) I tested one full run on all databases on clouddb1017. It's slower than the ol...
[12:45:28] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (T419658)
[12:45:34] T419658: Controlled cloudsw down tests for F4 - https://phabricator.wikimedia.org/T419658
[12:45:37] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0) (T419658)
[12:58:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudcephosd1033:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[13:03:48] FIRING: [2x] PuppetZeroResources: Puppet has failed generate resources on cloudcephosd1033:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[13:13:48] RESOLVED: [2x] PuppetZeroResources: Puppet has failed generate resources on cloudcephosd1033:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[13:30:34] cloud-services-team (FY2025/2026-Q3-Q4), Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.32 - https://phabricator.wikimedia.org/T379047#11819613 (fnegri) a:fnegri→None
[13:31:00] cloud-services-team (FY2025/2026-Q3-Q4), Data-Services: [wikireplicas] Gather usage stats - https://phabricator.wikimedia.org/T381587#11819617 (fnegri) a:fnegri→None
[13:35:19] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Uprade cloudservices1005 and cloudservices1006 to MariaDB 10.11 - https://phabricator.wikimedia.org/T409395#11819630 (fnegri) a:fnegri→None Unassigning myself while we decide who's going to do it.
[13:37:25] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS (Debian Bullseye Deprecation): Upgrade cloudinfra database hosts off of Bullseye - https://phabricator.wikimedia.org/T402005#11819642 (fnegri) a:fnegri→None I won't be able to work on this in the near future, unassigning myself in case somebody e...
[13:58:28] cloud-services-team, Cloud-VPS: Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739#11819694 (Andrew) Open→Resolved #1 is fixed with attached patches #2 was already handled; the reason I thought it wasn't was because of interactions with tofu-in...
[14:31:38] Cloud-VPS (Project-requests), Test Platform: Request creation of testplatform VPS project - https://phabricator.wikimedia.org/T423226#11819880 (Andrew) +1
[14:49:32] Cloud-VPS (Project-requests), Test Platform: Request creation of testplatform VPS project - https://phabricator.wikimedia.org/T423226#11820111 (Andrew) @Peter would you consider a more specific project name like maybe 'ciperformance'?
[14:58:37] Cloud-VPS (Project-requests), Test Platform: Request creation of testplatform VPS project - https://phabricator.wikimedia.org/T423226#11820159 (Peter) @Andrew yep works fine, thanks.
[14:58:54] Cloud-VPS (Project-requests), Test Platform: Request creation of testplatform VPS project - https://phabricator.wikimedia.org/T423226#11820161 (Raymond_Ndibe) a:Raymond_Ndibe
[14:59:28] cloud-services-team, Cloud-VPS, Patch-For-Review: Cloud init and unattended upgrades while bootstrapping Trixie VMs - https://phabricator.wikimedia.org/T422509#11820173 (Andrew) So here is what should be happening, and is configured to happen: 1) Base image is fully puppetized as a Trixie host. By t...
[15:02:59] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[15:11:20] cloud-services-team, Cloud-VPS, Patch-For-Review: Cloud init and unattended upgrades while bootstrapping Trixie VMs - https://phabricator.wikimedia.org/T422509#11820229 (Andrew) Oh, one other point that might not be obvious: I can't just write straight to preferences.d from cloud-init because that's...
[15:21:30] (update) raymond-ndibe: images.py: add tests for image variant matching [repos/cloud/toolforge/jobs-api] (refactor_image_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/286 (https://phabricator.wikimedia.org/T415322)
[15:24:28] (PS1) CDanis: hiddenparma: add atsuko [labs/private] - https://gerrit.wikimedia.org/r/1270977
[15:24:46] (CR) CDanis: [C:+2] hiddenparma: add atsuko [labs/private] - https://gerrit.wikimedia.org/r/1270977 (owner: CDanis)
[15:25:05] (CR) CDanis: [V:+2 C:+2] hiddenparma: add atsuko [labs/private] - https://gerrit.wikimedia.org/r/1270977 (owner: CDanis)
[15:34:18] (update) fnegri: [jobs-cli] refactor job payload [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/98 (https://phabricator.wikimedia.org/T389118 https://phabricator.wikimedia.org/T390136) (owner: raymond-ndibe)
[15:48:52] (update) fnegri: refactor image parsing and handling [repos/cloud/toolforge/jobs-api] (improve_image_parsing_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/273 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:50:44] Toolforge, tools-platform-team, Patch-For-Review: [jobs-api] make job status an enum, with clearly defined states - https://phabricator.wikimedia.org/T401172#11820455 (fnegri)
[15:50:47] Toolforge, tools-platform-team: [jobs-api] Create storage layer, and save business models in persistent storage - https://phabricator.wikimedia.org/T359650#11820456 (fnegri)
[15:51:26] (update) fnegri: replace job images with web images [repos/cloud/toolforge/jobs-api] (add_tests_for_image_variant_matching) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/263 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:51:38] (update) fnegri: values.yaml: hoist web image variants to top of config [repos/cloud/toolforge/image-config] - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/18 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:51:46] (update) fnegri: jobs-api: use webservice image variants in one-off job tests [repos/cloud/toolforge/toolforge-deploy] (test_for_image_argument_handling_in_jobs) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1115 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:51:53] (update) fnegri: jobs-api: test for proper handling of the diff variations of the --image argument [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1113 (https://phabricator.wikimedia.org/T414978 https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:51:56] (update) fnegri: refactor image parsing and handling [repos/cloud/toolforge/jobs-api] (improve_image_parsing_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/273 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:52:01] (update) fnegri: values.yaml: add image variant name to aliases [repos/cloud/toolforge/image-config] (replace_job_with_webservice_image_variants) - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/21 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[15:52:03] (update) fnegri: images.py: add tests for image variant matching [repos/cloud/toolforge/jobs-api] (refactor_image_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/286 (https://phabricator.wikimedia.org/T415322) (owner: raymond-ndibe)
[16:21:44] Tool-wsindex: MIgrate Ws-index Codeberg repo to Gitlab - https://phabricator.wikimedia.org/T423313 (Bodhisattwa) NEW
[16:22:40] Tool-wsindex: MIgrate Ws-index Codeberg repo to Gitlab - https://phabricator.wikimedia.org/T423313#11820698 (Bodhisattwa)
[16:28:46] Toolforge, tools-platform-team: [Toolforge]: User Activity (based on successful SSH logins) - https://phabricator.wikimedia.org/T423228#11820748 (bd808) Linked google sheet seems to be private access only.
[16:29:37] Toolforge, tools-platform-team: [Toolforge]: User Stats Baseline (Outreach Channels, Accounts, Activity) - https://phabricator.wikimedia.org/T422783#11820755 (bd808) Linked "Channels" google doc seems to be private access only.
[17:26:25] cloud-services-team, Cloud-VPS, Patch-For-Review: Cloud init and unattended upgrades while bootstrapping Trixie VMs - https://phabricator.wikimedia.org/T422509#11820998 (Andrew) Assumption #1 is incorrect: puppet does not actually downgrade puppet to version 7. So, our base image had puppet 8 already...
[17:34:39] Tool-curator: Curator: more recent image sequences cannot be retrieved - https://phabricator.wikimedia.org/T423321 (PantheraLeo1359531) NEW
[17:35:30] Tool-curator: Curator: more recent image sequences cannot be retrieved - https://phabricator.wikimedia.org/T423321#11821079 (PantheraLeo1359531) * https://www.mapillary.com/app/user/DariusV_Pietu?lat=54.45149470000001&lng=24.042975699999943&z=17&dateFrom=2026-04-13&focus=map&pKey=2449016855659491 * https://w...
[17:49:26] Tool-humaniki-2: [Birth-years view] Add a form to filter chart by date range - https://phabricator.wikimedia.org/T422339#11821124 (Danya) Open→In progress
[17:56:53] Tool-humaniki-2: [Birth-years view] Add a form to filter chart by date range - https://phabricator.wikimedia.org/T422339#11821144 (Danya) In progress→Open
[18:03:25] Tool-paulina: Taxonomía de áreas creativas - https://phabricator.wikimedia.org/T393157#11821177 (Pepe_piton) In the field of music, we can also add Q822146 (lyricist).
[18:24:40] cloud-services-team, Cloud-VPS, Patch-For-Review: Cloud init and unattended upgrades while bootstrapping Trixie VMs - https://phabricator.wikimedia.org/T422509#11821281 (Andrew) I think we're good -- I don't see new guest VMs trying to install puppet on startup. I'll roll out new base images shortly.
[19:02:55] Toolforge, tools-platform-team: [Toolforge]: User Activity (based on successful SSH logins) - https://phabricator.wikimedia.org/T423228#11821401 (komla) @bd808 oops, fixed now. Also updated the link to point directly to the specific sheet.
[19:03:03] Toolforge, tools-platform-team: [Toolforge]: User Activity (based on successful SSH logins) - https://phabricator.wikimedia.org/T423228#11821402 (komla)
[19:07:35] Cloud-VPS (Project-requests), Test Platform: Request creation of ciperformance VPS project - https://phabricator.wikimedia.org/T423226#11821417 (Raymond_Ndibe)
[19:16:15] !log raymond-ndibe@cloudcumin1001 ciperformance START - Cookbook wmcs.vps.create_project for project ciperformance in eqiad1 (T423226)
[19:16:17] raymond-ndibe@cloudcumin1001: Unknown project "ciperformance"
[19:16:18] T423226: Request creation of ciperformance VPS project - https://phabricator.wikimedia.org/T423226
[19:16:58] (open) group_199_bot_f98be072172e323ae6d1441939d3e461: projects: added project ciperformance [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/307 (https://phabricator.wikimedia.org/T423226)
[19:18:49] (approved) raymond-ndibe: projects: added project ciperformance [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/307 (https://phabricator.wikimedia.org/T423226) (owner: group_199_bot_f98be072172e323ae6d1441939d3e461)
[19:19:21] (merge) raymond-ndibe: projects: added project ciperformance [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/307 (https://phabricator.wikimedia.org/T423226) (owner: group_199_bot_f98be072172e323ae6d1441939d3e461)
[19:20:12] raymond-ndibe@cloudcumin1001 create_project (PID 2629836) is awaiting input
[19:21:41] !log raymond-ndibe@cloudcumin1001 ciperformance END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project ciperformance in eqiad1 (T423226)
[19:21:43] raymond-ndibe@cloudcumin1001: Unknown project "ciperformance"
[19:21:43] T423226: Request creation of ciperformance VPS project - https://phabricator.wikimedia.org/T423226
[19:24:57] Cloud-VPS (Project-requests), Test Platform, Patch-For-Review: Request creation of ciperformance VPS project - https://phabricator.wikimedia.org/T423226#11821453 (Raymond_Ndibe) ` sudo wmcs-openstack project show ciperformance +-------------+-----------------------------------------------------------...
[19:25:44] (open) danyya: Draft: Refactor charts [toolforge-repos/humaniki] - https://gitlab.wikimedia.org/toolforge-repos/humaniki/-/merge_requests/9
[19:26:44] Cloud-VPS (Project-requests), Test Platform, Patch-For-Review: Request creation of ciperformance VPS project - https://phabricator.wikimedia.org/T423226#11821456 (Raymond_Ndibe) Open→Resolved
[19:56:11] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova
[19:57:22] FIRING: [2x] HAProxyBackendUnavailable: HAProxy service nova-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[20:01:48] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,nova
[20:02:22] RESOLVED: [2x] HAProxyBackendUnavailable: HAProxy service nova-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[21:03:49] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24422702845 (https://github.com/cluebotng/component-configs/commits/6d3be4a356e5c50103fa3381dc7c9ce81bf1788f)
[21:03:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL
[21:05:45] !log tools.cluebotng Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24422836823 (https://github.com/cluebotng/component-configs/commits/10f4f0f81e169fac55d056176a273966c8160078)
[21:05:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng/SAL
[21:08:10] !log tools.cluebotng-review Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24422836848 (https://github.com/cluebotng/component-configs/commits/10f4f0f81e169fac55d056176a273966c8160078)
[21:08:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL
[21:29:17] FIRING: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[21:32:50] FIRING: [48x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[21:33:51] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services
[21:34:17] RESOLVED: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[21:36:22] FIRING: [4x] HAProxyBackendUnavailable: HAProxy service glance-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[21:37:50] RESOLVED: [48x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[21:41:22] RESOLVED: [4x] HAProxyBackendUnavailable: HAProxy service glance-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[21:48:01] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services
[21:57:59] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24424850045 (https://github.com/cluebotng/component-configs/commits/9124b4b266ce71985cca6d82fd4261c1850b6d4d)
[21:58:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL
[23:08:40] Cloud-VPS (Quota-requests), Tool-etherpad-backup: Increase object storage object count for etherpads3 project - https://phabricator.wikimedia.org/T423354 (bd808) NEW
[23:24:22] FIRING: HAProxyBackendUnavailable: HAProxy service nova-api_backend backend cloudcontrol1011.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[23:27:56] FIRING: SystemdUnitDown: The service unit nova-api.service is in failed status on host cloudcontrol1011. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1011 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[23:29:22] FIRING: [2x] HAProxyBackendUnavailable: HAProxy service nova-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[23:52:00] Cloud-VPS (Quota-requests), Tool-etherpad-backup: Increase object storage object count for etherpads3 project - https://phabricator.wikimedia.org/T423354#11822391 (Andrew) +1