[00:00:26] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[00:08:38] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[00:22:00] (update) bd808: flavors: Expose g4.cores8.ram24.disk20.ephemeral90.4xiops to zuul3 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/202 (https://phabricator.wikimedia.org/T392294)
[00:23:03] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[00:23:19] (update) bd808: flavors: Expose g4.cores8.ram24.disk20.ephemeral90.4xiops to zuul3 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/202 (https://phabricator.wikimedia.org/T392294)
[01:08:41] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[01:57:01] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[02:07:56] FIRING: [2x] SystemdUnitDown: The service unit remove_dangling_cinder_snapshots.service is in failed status on host cloudbackup1001-dev.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[02:15:13] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[02:25:03] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[03:35:48] (update) chuckonwumelu: Draft: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[03:38:50] (update) chuckonwumelu: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[04:02:56] FIRING: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:03:03] cloud-services-team: SystemdUnitDown - https://phabricator.wikimedia.org/T392547 (phaultfinder) NEW
[05:22:56] FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:42:56] RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:52:56] FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:02:56] FIRING: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:12:56] RESOLVED: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:22:56] FIRING: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:23:00] cloud-services-team, Data-Services, DBA, Wikifunctions, and 2 others: Make wikifunctionsclient_usage table available on cloud wiki replicas - https://phabricator.wikimedia.org/T392475#10763699 (Marostegui) The table already exists on the wikireplicas hosts, they just need a view on them. Removing...
[06:23:16] cloud-services-team, Data-Services, Wikifunctions, Abstract Wikipedia team (25Q4 (Apr–Jun)), Essential-Work: Make wikifunctionsclient_usage table available on cloud wiki replicas - https://phabricator.wikimedia.org/T392475#10763700 (Marostegui)
[06:42:56] RESOLVED: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:52:56] FIRING: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:58:49] (approved) dcaro: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] (main-I7c2d294db1fe3046105c5f1e0865e59601f9a232) - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562) (owner: taavi)
[06:59:04] (approved) dcaro: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30 (owner: taavi)
[07:01:28] (update) dcaro: [jobs-cli] only send timeout if it's set by the user [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/96 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[07:01:41] (approved) dcaro: [jobs-cli] only send timeout if it's set by the user [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/96 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[07:01:53] (update) dcaro: [jobs-cli] only send timeout if it's set by
the user [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/96 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[07:12:56] RESOLVED: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[07:22:56] FIRING: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[07:24:34] (merge) taavi: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30
[07:24:37] (update) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[07:24:48] (merge) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[07:42:50] (approved) aborrero: legacy_redirector: Add IPv6 records for toolserver.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/6 (https://phabricator.wikimedia.org/T392506) (owner: taavi)
[07:42:56] RESOLVED: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[07:43:35] (merge) taavi: legacy_redirector: Add IPv6 records for toolserver.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/6 (https://phabricator.wikimedia.org/T392506)
[07:52:56] FIRING: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[07:55:32] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,designate
[07:56:34] !log taavi@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment eqiad1 for service: project,designate
[08:02:56] FIRING: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[08:12:56] RESOLVED: [2x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[08:19:56] (open) taavi: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506)
[08:21:34] (update) taavi: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506)
[08:22:54] (update) taavi: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506)
[08:25:46] (CR) FNegri: [C:+1] "Thanks for updating this!"
[cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137739 (https://phabricator.wikimedia.org/T262562) (owner: Majavah)
[08:26:00] (CR) Majavah: [C:+2] wmcs_libs: k8s: Update example API server URL [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137739 (https://phabricator.wikimedia.org/T262562) (owner: Majavah)
[08:28:30] (update) taavi: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506)
[08:29:47] (Merged) jenkins-bot: wmcs_libs: k8s: Update example API server URL [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137739 (https://phabricator.wikimedia.org/T262562) (owner: Majavah)
[08:39:57] (update) taavi: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506)
[08:42:06] (approved) aborrero: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506) (owner: taavi)
[08:42:31] (merge) taavi: legacy_redirector: Manage floating IP and toolserver.org A records [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/7 (https://phabricator.wikimedia.org/T392506)
[08:45:52] (open) taavi: legacy_redirector: Add AAAA record for tools.wmflabs.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/8 (https://phabricator.wikimedia.org/T392506)
[08:48:17] (update) taavi: legacy_redirector: Add
AAAA record for tools.wmflabs.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/8 (https://phabricator.wikimedia.org/T392506)
[08:52:07] (approved) aborrero: legacy_redirector: Add AAAA record for tools.wmflabs.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/8 (https://phabricator.wikimedia.org/T392506) (owner: taavi)
[08:52:14] (update) taavi: legacy_redirector: Add AAAA record for tools.wmflabs.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/8 (https://phabricator.wikimedia.org/T392506)
[08:52:34] (update) taavi: legacy_redirector: Add AAAA record for tools.wmflabs.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/8 (https://phabricator.wikimedia.org/T392506)
[08:53:38] (merge) taavi: legacy_redirector: Add AAAA record for tools.wmflabs.org [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/8 (https://phabricator.wikimedia.org/T392506)
[08:54:00] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: CloudVPS: IPv6 in eqiad1 - https://phabricator.wikimedia.org/T380174#10763919 (aborrero) Open→Resolved a:aborrero IPv6 is up and running.
[08:55:12] cloud-services-team, Cloud-VPS, Epic, IPv6: Enable IPv6 on CloudVPS - https://phabricator.wikimedia.org/T37947#10763923 (aborrero) Open→Resolved a:aborrero It "only" took us 13 years, but it has been finally enabled.
[08:58:12] cloud-services-team, Toolforge (Toolforge iteration 19), IPv6, Patch-For-Review: Enable IPv6 for tools.wmflabs.org / *.toolserver.org legacy redirector service - https://phabricator.wikimedia.org/T392506#10763956 (taavi) Open→Resolved
[08:58:27] cloud-services-team, Toolforge, IPv6: Enable IPv6 on toolforge.org - https://phabricator.wikimedia.org/T211575#10763958 (taavi) Stalled→Open
[09:05:26] (open) dcaro: pipeline: add unresolved source reference parameter [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/71 (https://phabricator.wikimedia.org/T389043)
[09:06:13] cloud-services-team, Cloud-VPS, Data-Engineering, IPv6, Patch-For-Review: Add new WMCS IP ranges to analytics - https://phabricator.wikimedia.org/T392468#10763981 (taavi) >>! In T392468#10762365, @Ottomata wrote: > @taavi thanks! What's the timeline for this? We've just made it possible to...
[09:18:39] cloud-services-team, Toolforge, IPv6: Enable IPv6 for Toolforge services - https://phabricator.wikimedia.org/T392509#10763995 (taavi)
[09:18:42] cloud-services-team, Toolforge, IPv6, Kubernetes: Support IPv6 in Toolforge Kubernetes - https://phabricator.wikimedia.org/T380060#10763996 (taavi)
[09:28:18] cloud-services-team, Cloud-VPS: metricsinfra: maintain-projects is broken - https://phabricator.wikimedia.org/T392559 (taavi) NEW
[09:29:54] cloud-services-team, Cloud-VPS: metricsinfra: maintain-projects is broken - https://phabricator.wikimedia.org/T392559#10764053 (taavi) ` MariaDB [prometheusconfig]> select * from alerts where id in (13, 14); +----+------------+---------------+--------------------------------------------------------------...
[09:31:09] cloud-services-team, Cloud-VPS: metricsinfra: maintain-projects is broken - https://phabricator.wikimedia.org/T392559#10764055 (taavi) Ok, the `tf-infra-test` project is gone and replaced by `tofuinfratest`. Fixed the alerts: ` MariaDB [prometheusconfig]> select * from projects where openstack_id = 'tofu...
[09:33:25] cloud-services-team, Cloud-VPS: metricsinfra: maintain-projects should not crash when a project with alerts is deleted - https://phabricator.wikimedia.org/T392560 (taavi) NEW
[09:33:32] cloud-services-team, Cloud-VPS: metricsinfra: maintain-projects is broken - https://phabricator.wikimedia.org/T392559#10764073 (taavi) Open→Resolved The `maintain-projects` script succeeds now. Awesome. Filed {T392560} to fix the script. Also fixed the project references in the alerts. Clos...
[09:39:28] FIRING: InstanceDown: Project tools instance tools-legacy-redirector-3 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[10:06:29] Toolforge (Toolforge iteration 19), Epic: Fix toolforge tests and deployment cicd pipelines - https://phabricator.wikimedia.org/T392524#10764177 (Raymond_Ndibe)
[10:31:49] cloud-services-team, Cloud-VPS: metricsinfra: Alert on SD failures - https://phabricator.wikimedia.org/T392568 (taavi) NEW
[10:31:57] cloud-services-team, Cloud-VPS: Get rid of cloud-cumin VMs in cloudinfra project - https://phabricator.wikimedia.org/T367725#10764302 (taavi) Open→Resolved
[10:43:00] cloud-services-team, Cloud-VPS, IPv6: metricsinfra: Support scraping v6-enabled instances - https://phabricator.wikimedia.org/T392570 (taavi) NEW
[10:43:47] cloud-services-team, Cloud-VPS, IPv6: metricsinfra: Support scraping v6-enabled instances - https://phabricator.wikimedia.org/T392570#10764319 (taavi)
[11:11:46] cloud-services-team, Toolforge (Toolforge iteration 19), IPv6: Enable IPv6 for tools.wmflabs.org / *.toolserver.org legacy redirector service -
https://phabricator.wikimedia.org/T392506#10764388 (Paladox) I can't seem to ping the ipv6 address: ` ping6 tools.wmflabs.org PING6(56=40+8+8 bytes) 2a0...
[11:33:11] Tool-schedule-deployment: 500 Internal Server Error - https://phabricator.wikimedia.org/T392575 (Lucas_Werkmeister_WMDE) NEW
[11:33:37] Tool-schedule-deployment: 500 Internal Server Error - https://phabricator.wikimedia.org/T392575#10764435 (Lucas_Werkmeister_WMDE)
[11:33:53] Tool-schedule-deployment: 500 Internal Server Error - https://phabricator.wikimedia.org/T392575#10764436 (Lucas_Werkmeister_WMDE) I’m afraid I don’t have the time to investigate this right now.
[11:34:44] Tool-schedule-deployment: 500 Internal Server Error - https://phabricator.wikimedia.org/T392575#10764443 (Lucas_Werkmeister_WMDE) Ok, rebasing the change fixed it, so it was probably somehow related to it being ca. two weeks old (Gerrit showed it as a “merge conflict”, though the rebase succeeded without con...
[11:53:41] cloud-services-team, Toolforge (Toolforge iteration 19), IPv6: Enable IPv6 for tools.wmflabs.org / *.toolserver.org legacy redirector service - https://phabricator.wikimedia.org/T392506#10764452 (taavi) >>! In T392506#10764388, @Paladox wrote: > I can't seem to ping the ipv6 address: Yeah, this...
[11:55:08] cloud-services-team, Toolforge (Toolforge iteration 19), IPv6: Enable IPv6 for tools.wmflabs.org / *.toolserver.org legacy redirector service - https://phabricator.wikimedia.org/T392506#10764453 (Paladox) thanks!
[11:58:36] (open) dcaro: start: resolve the commit hash to build on start [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/129
[12:02:56] FIRING: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[12:05:41] (update) aborrero: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3 (owner: chuckonwumelu)
[12:13:11] (open) aborrero: README: clarify a few things [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/9
[12:13:46] (open) aborrero: README: clarify a few things [repos/cloud/cloud-vps/networktests-tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/17
[12:13:48] (approved) taavi: README: clarify a few things [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/9 (owner: aborrero)
[12:14:04] (merge) aborrero: README: clarify a few things [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/9
[12:21:02] (merge) aborrero: README: clarify a few things [repos/cloud/cloud-vps/networktests-tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/17
[12:35:45] Tool-fa-speed, Future-Audiences: [2025 Apr 01-Apr11] Committed sprint work - https://phabricator.wikimedia.org/T390687#10764562 (DLin-WMF) Open→Resolved a:DLin-WMF
[12:37:46] cloud-services-team, Cloud-VPS, Patch-Needs-Improvement: Change routing to ensure that traffic originating from Cloud VPS is seen as non-private IPs by Wikimedia wikis - https://phabricator.wikimedia.org/T209011#10764578 (taavi) Stalled→Declined We have decided to not pursue this, as [[ h...
[12:39:01] cloud-services-team, Cloud-VPS: consider storing information on cloud NAT mappings - https://phabricator.wikimedia.org/T273734#10764617 (taavi)
[12:39:31] cloud-services-team, Cloud-VPS, Traffic-Icebox: Get traffic team green light for Cloud NAT to wikis change - https://phabricator.wikimedia.org/T273737#10764620 (taavi) Stalled→Declined (see T209011#10764578.)
[12:39:32] cloud-services-team, MediaWiki-Engineering: Get platform engineering team green light for Cloud NAT to wikis change - https://phabricator.wikimedia.org/T273738#10764626 (taavi) Open→Declined (see T209011#10764578.)
[12:39:37] cloud-services-team, Cloud-VPS, serviceops: Get Service Operations team green light for Cloud NAT to wikis change - https://phabricator.wikimedia.org/T273740#10764631 (taavi) Open→Declined (see T209011#10764578.)
[12:41:30] cloud-services-team, Cloud-VPS: figure out what/how to handle bots/tools with restricted IP Oauth/password stored in the mediawiki databases - https://phabricator.wikimedia.org/T273724#10764639 (taavi) Open→Declined Declining as the parent has been declined (T209011#10764578). For the recor...
[12:42:50] cloud-services-team, Data-Services, DBA, Wikidata, User-notice: Set up x3 replication to wikireplicas - https://phabricator.wikimedia.org/T390954#10764646 (Ladsgroup) [[https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/message/NLHRDDZCMP6OTAMMOEYCKOUXUEW3OXJ2/|Announc...
[12:45:50] FIRING: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:50:50] RESOLVED: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:56:01] (update) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/35
[12:57:44] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/19
[13:01:57] (update) dcaro: start: resolve the commit hash to build on start [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/129
[13:03:45] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/massmailer] - https://gerrit.wikimedia.org/r/1138745 (owner: L10n-bot)
[13:03:47] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net.
[labs/tools/watch-translations] - https://gerrit.wikimedia.org/r/1138747 (owner: L10n-bot)
[13:06:09] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175#10764744 (taavi)
[13:09:22] (update) dcaro: start: resolve the commit hash to build on start [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/129
[13:20:38] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: horizon: enable the UI to select networks on VM creation panel - https://phabricator.wikimedia.org/T380081#10764789 (taavi) Open→Resolved
[13:20:57] cloud-services-team, Cloud-VPS, Documentation, IPv6: Cloud VPS: prepare documentation on VXLAN/IPV6 migration - https://phabricator.wikimedia.org/T380054#10764791 (taavi) Stalled→Open
[13:28:19] (approved) taavi: flavors: Expose g4.cores8.ram24.disk20.ephemeral90.4xiops to zuul3 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/202 (https://phabricator.wikimedia.org/T392294) (owner: bd808)
[13:28:53] (merge) fnegri: flavors: Expose g4.cores8.ram24.disk20.ephemeral90.4xiops to zuul3 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/202 (https://phabricator.wikimedia.org/T392294) (owner: bd808)
[13:29:13] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[13:29:50] !log fnegri@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan+apply for main branch
[13:40:26] cloud-services-team, Cloud-VPS (Quota-requests), Continuous-Integration-Infrastructure (Zuul upgrade): Quota increase for zuul3 project - https://phabricator.wikimedia.org/T392294#10764848 (fnegri) Patch merged and applied, the g4 flavor should now be
available. > Here's my mortal user adding a...
[13:44:50] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175#10764854 (taavi)
[13:59:55] (update) chuckonwumelu: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[14:02:11] (update) chuckonwumelu: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[14:03:18] Toolforge (Toolforge iteration 19), Patch-For-Review: [builds-api] Store the commit hash that was used for the build - https://phabricator.wikimedia.org/T389043#10764907 (dcaro) a:dcaro
[14:03:24] Toolforge (Toolforge iteration 19), Patch-For-Review: [builds-api] Store the commit hash that was used for the build - https://phabricator.wikimedia.org/T389043#10764909 (dcaro) Open→In progress
[14:04:53] (update) chuckonwumelu: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[14:06:53] (approved) aborrero: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3 (owner: chuckonwumelu)
[14:07:48] (merge) chuckonwumelu: Started DNS imports [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/3
[14:10:24] (update) dcaro: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[14:10:47] (update) dcaro: [jobs-api] move custom validations out of api models
[repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/150 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[14:15:00] (open) chuckonwumelu: Removed record managed by Puppet [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/10
[14:18:29] (approved) aborrero: Removed record managed by Puppet [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/10 (owner: chuckonwumelu)
[14:18:56] (merge) chuckonwumelu: Removed record managed by Puppet [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/10
[14:27:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:32:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:34:08] cloud-services-team, Toolforge: [toolsdb] Upgrade from 10.6.20 to 10.6.21 - https://phabricator.wikimedia.org/T392596 (fnegri) NEW
[14:37:19] Tool-schedule-deployment: 500 Internal Server Error - https://phabricator.wikimedia.org/T392575#10765081 (Lucas_Werkmeister_WMDE) Strangely, the only error related to `/backport/1134692` that I see in `~tools.schedule-deployment/logs/web.log` is connection errors (timeout) when talking to Gerrit… o_O
[14:40:49] (open) aborrero: gitlab-ci: send email notifications if tofu diff is found [repos/cloud/cloud-vps/networktests-tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/18
[14:45:08] (update) aborrero: gitlab-ci: send email notifications if tofu diff is found [repos/cloud/cloud-vps/networktests-tofu-provisioning] -
10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/18 [14:47:33] (03approved) 10taavi: gitlab-ci: send email notifications if tofu diff is found [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/18 (owner: 10aborrero) [14:47:58] (03merge) 10aborrero: gitlab-ci: send email notifications if tofu diff is found [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/18 [14:50:35] (03open) 10aborrero: gitlab-ci: quote variable expansion in email sending function [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/19 [14:51:04] (03merge) 10aborrero: gitlab-ci: quote variable expansion in email sending function [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/19 [14:58:48] (03open) 10aborrero: gitlab-ci: verify the content of the plan we got [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/20 [14:59:46] 10cloud-services-team (FY2024/2025-Q3-Q4), 10Data-Services, 13Patch-For-Review: Remove the compatibility layer of block schema in wikireplicas - https://phabricator.wikimedia.org/T390767#10765224 (10fnegri) Posted to cloud-announce: https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.o... [15:06:09] 10cloud-services-team (FY2024/2025-Q3-Q4), 10Data-Services, 13Patch-For-Review: Remove the compatibility layer of block schema in wikireplicas - https://phabricator.wikimedia.org/T390767#10765281 (10Ladsgroup) Thank you! 
[15:07:41] (03update) 10aborrero: gitlab-ci: verify the content of the plan we got [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/20 [15:11:37] (03update) 10aborrero: gitlab-ci: verify the content of the plan we got [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/20 [15:13:31] (03update) 10aborrero: gitlab-ci: verify the content of the plan we got [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/20 [15:15:41] (03merge) 10aborrero: gitlab-ci: verify the content of the plan we got [repos/cloud/cloud-vps/networktests-tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/20 [15:17:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol1011:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [15:17:57] 06cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol1011:9100 - https://phabricator.wikimedia.org/T392603 (10phaultfinder) 03NEW [16:02:56] FIRING: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [16:04:35] 10cloud-services-team (FY2024/2025-Q3-Q4), 10Cloud-VPS, 10Ceph, 06DC-Ops, and 2 others: [cloudceph] test the new DELL hard drives throughput - https://phabricator.wikimedia.org/T390134#10765593 (10dcaro) Did some tests, and we are on the clear, the new hard drives are performant enough (at low level) to ha... [16:07:32] 06cloud-services-team, 10Cloud-VPS, 07IPv6: VMs with ferm host-level firewall do not permit DHCPv6 responses - https://phabricator.wikimedia.org/T392611 (10taavi) 03NEW [16:12:14] 06cloud-services-team, 10Cloud-VPS, 07IPv6: VMs with ferm host-level firewall do not permit DHCPv6 responses - https://phabricator.wikimedia.org/T392611#10765631 (10taavi) a:03taavi [16:22:01] 06cloud-services-team, 10Cloud-VPS, 07IPv6, 13Patch-For-Review: VMs with ferm host-level firewall do not permit DHCPv6 responses - https://phabricator.wikimedia.org/T392611#10765658 (10taavi) 05Open→03Resolved [16:23:43] 10cloud-services-team (FY2024/2025-Q3-Q4), 10Cloud-VPS, 10Ceph, 06DC-Ops, and 2 others: [cloudceph] test the new DELL hard drives throughput - https://phabricator.wikimedia.org/T390134#10765661 (10dcaro) [17:25:40] !log dcaro@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T390134) [17:25:47] T390134: [cloudceph] test the new DELL hard drives throughput - https://phabricator.wikimedia.org/T390134 [17:26:48] 10cloud-services-team (FY2024/2025-Q3-Q4), 10Cloud-VPS, 10Ceph, 06DC-Ops, and 2 others: [cloudceph] test the new DELL hard drives throughput - https://phabricator.wikimedia.org/T390134#10765911 (10dcaro) [17:29:10] PROBLEM - Host cloudcephosd1029 is DOWN: PING CRITICAL - Packet loss = 100% [17:30:05] RECOVERY - Host cloudcephosd1029 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms [17:32:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - 
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [17:34:34] (03update) 10dcaro: [jobs-api] use pydantic for all models [repos/cloud/toolforge/jobs-api] (move_most_custom_validations_out_of_api_models) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/139 (https://phabricator.wikimedia.org/T389118) (owner: 10raymond-ndibe) [17:37:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [17:42:06] 06cloud-services-team, 10Cloud-VPS, 07IPv6: IPv6 for cloud-realm services - https://phabricator.wikimedia.org/T379282#10765972 (10cmooney) @Majavah we are almost there with this one. Right now the BGP is up to the cloudsw and it's getting the route your sending: ` cmooney@cloudsw1-b1-codfw> show route rece... [17:45:26] 10Tool-inteGraality: Support qlever endpoint for integraality - https://phabricator.wikimedia.org/T385749#10765979 (10Sj) Qlever is approaching being able to support real-time updates, and is being more widely used by various wikidata-related initiatives as well as those involving larger datasets. This would be... 
[18:09:08] (03open) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/754 (https://phabricator.wikimedia.org/T363544) [18:09:14] (03update) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/754 (https://phabricator.wikimedia.org/T363544) [18:09:44] (03update) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/754 (https://phabricator.wikimedia.org/T363544) [18:11:30] (03close) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/753 (https://phabricator.wikimedia.org/T363544) [18:13:16] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component envvars-cli [18:24:47] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component envvars-cli [18:25:40] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component envvars-cli [18:27:28] 06cloud-services-team, 10Cloud-VPS (Quota-requests), 10Continuous-Integration-Infrastructure (Zuul upgrade): Quota increase for zuul3 project - https://phabricator.wikimedia.org/T392294#10766083 (10Andrew) ` root@cloudcontrol1011:~# openstack role list | grep admin | 1102f4ff63c3435793d0e4340bf4b04e | gl... 
[18:32:25] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component envvars-cli [18:33:20] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component envvars-cli [18:38:32] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component envvars-cli [18:39:19] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component envvars-cli [18:50:19] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component envvars-cli [18:57:48] RESOLVED: PuppetFailure: Puppet has failed on cloudcontrol1011:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [19:04:49] (03approved) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/754 (https://phabricator.wikimedia.org/T363544) [19:04:49] (03update) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/754 (https://phabricator.wikimedia.org/T363544) [19:04:54] (03merge) 10raymond-ndibe: [envvars-cli] test hide envvars value by default [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/754 (https://phabricator.wikimedia.org/T363544) [19:10:58] (03approved) 10raymond-ndibe: d/changelog: bump to 0.0.13 [repos/cloud/toolforge/envvars-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/79 (https://phabricator.wikimedia.org/T363544) [19:11:08] (03merge) 
10raymond-ndibe: d/changelog: bump to 0.0.13 [repos/cloud/toolforge/envvars-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/79 (https://phabricator.wikimedia.org/T363544) [19:19:59] (03approved) 10lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - 10https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/35 (owner: 10l10n-bot) [19:20:02] (03merge) 10lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - 10https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/35 (owner: 10l10n-bot) [19:20:46] (03approved) 10lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - 10https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/19 (owner: 10l10n-bot) [19:20:50] (03merge) 10lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - 10https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/19 (owner: 10l10n-bot) [19:21:19] 06cloud-services-team, 10Toolforge (Toolforge iteration 19), 13Patch-For-Review: [envvars-cli] Add option to not show envvar values when listing - https://phabricator.wikimedia.org/T363544#10766223 (10Raymond_Ndibe) 05In progress→03Resolved [19:34:45] 06cloud-services-team, 10Toolforge: toolforge jobs load errors with 404 repetatively - https://phabricator.wikimedia.org/T381273#10766313 (10Raymond_Ndibe) 05Open→03Resolved a:03Raymond_Ndibe marking as resolved [20:02:56] FIRING: [2x] SystemdUnitDown: The systemd unit remove_dangling_cinder_snapshots.service on node cloudbackup1001-dev has been failing for more than two hours. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [20:04:36] (03update) 10raymond-ndibe: [jobs-cli] refactor job payload [repos/cloud/toolforge/jobs-cli] (health_check_and_quota_refactor) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/98 (https://phabricator.wikimedia.org/T389118 https://phabricator.wikimedia.org/T390136) [20:21:24] (03update) 10raymond-ndibe: [jobs-api] move custom validations out of api models [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/150 (https://phabricator.wikimedia.org/T389118) [20:25:34] (03update) 10raymond-ndibe: [jobs-api] use pydantic for all models [repos/cloud/toolforge/jobs-api] (move_most_custom_validations_out_of_api_models) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/139 (https://phabricator.wikimedia.org/T389118) [20:29:25] (03update) 10raymond-ndibe: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) [20:38:46] (03update) 10raymond-ndibe: [jobs-api] use pydantic for all models [repos/cloud/toolforge/jobs-api] (move_most_custom_validations_out_of_api_models) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/139 (https://phabricator.wikimedia.org/T389118) [21:02:56] FIRING: [4x] SystemdUnitDown: The systemd unit backup_cinder_volumes.service on node cloudbackup1001-dev has been failing for more than two hours. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [21:03:01] 06cloud-services-team: SystemdUnitDown - https://phabricator.wikimedia.org/T392547#10766573 (10phaultfinder) [21:07:13] 10Tool-inteGraality: Missing a space in the query - https://phabricator.wikimedia.org/T391523#10766582 (10JeanFred) Thanks for reporting this. Do you have an example dashboard where this happens? [21:39:11] (03update) 10raymond-ndibe: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) [21:39:20] (03update) 10raymond-ndibe: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) [21:57:48] FIRING: PuppetConstantChange: Puppet performing a change on every puppet run on cloudcontrol2005-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [22:02:49] (03update) 10raymond-ndibe: [jobs-api] split job models to oneoff, scheduled and continuous [repos/cloud/toolforge/jobs-api] (use_pydantic_for_core_job_model) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/154 (https://phabricator.wikimedia.org/T389118 https://phabricator.wikimedia.org/T390136) [22:15:18] (03update) 10raymond-ndibe: [jobs-api] move custom validations out of api models [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/150 (https://phabricator.wikimedia.org/T389118) [22:32:43] (03update) 10raymond-ndibe: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] - 
10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) [22:39:57] 06cloud-services-team, 10Toolforge: re-enable wmtran tool at toolforge - https://phabricator.wikimedia.org/T392408#10766947 (10bd808) >>! In T392408#10759109, @Gryllida wrote: > This worked. Thanks. Will it not require me to do this again, now, like after a reboot and such? The Kubernetes system works like a b... [22:52:24] (03update) 10raymond-ndibe: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) [22:52:32] (03update) 10raymond-ndibe: [jobs-cli] health_check and quota refactor [repos/cloud/toolforge/jobs-cli] (schedule_timeout_default_None) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/97 (https://phabricator.wikimedia.org/T389118) [23:16:30] 10VPS-project-Wikistats: Add nupwiki to wikistats - https://phabricator.wikimedia.org/T391856#10767086 (10Dzahn) a:03Dzahn [23:18:05] 10VPS-project-Wikistats: Add nupwiki to wikistats - https://phabricator.wikimedia.org/T391856#10767088 (10Dzahn) 05Open→03Resolved ` MariaDB [wikistats]> insert into wikipedias (prefix, lang, loclang, method) values ("nup","Nupe","Nupe",8); ... dzahn@wikistats-bookworm:~$ /usr/bin/php /usr/lib/wikistats/... [23:31:42] 10Tool-schedule-deployment: 500 Internal Server Error - https://phabricator.wikimedia.org/T392575#10767101 (10bd808) 05Open→03Declined I'm relatively certain this was just network issues between toolforge and gerrit; probably gerrit getting overwhelmed by bot traffic. `lang=python,lines=10 2025-04-24T11:... 
[23:59:57] (03update) 10raymond-ndibe: [jobs-cli] refactor job payload [repos/cloud/toolforge/jobs-cli] (health_check_and_quota_refactor) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/98 (https://phabricator.wikimedia.org/T389118 https://phabricator.wikimedia.org/T390136)