[01:42:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:59:00] Tools, MediaViewer, Thumbor, Patch-For-Review: Explore moving the Panoviewer gadget/Toolforge tool into production - https://phabricator.wikimedia.org/T138933#9866171 (tstarling) >>! In T138933#9847347, @Sdkb wrote: > @tstarling I don't understand the gerritbot comment above. Would you be able to...
[02:29:43] Tool-schedule-deployment: Leave a comment on the Gerrit change when it is scheduled for a backport - https://phabricator.wikimedia.org/T366763#9866184 (thcipriani) That message seems reasonable. Random thoughts: - might be nice to have deploy time (in SF time in this task) link to https://zonestamp.toolforg...
[03:45:59] Tool-bridgebot: Bridge IRC to Matrix - https://phabricator.wikimedia.org/T366767 (Legoktm) NEW
[05:42:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:43:58] Tool-schedule-deployment: Leave a comment on the Gerrit change when it is scheduled for a backport - https://phabricator.wikimedia.org/T366763#9866370 (kostajh) There was concern about adding too many comments in Gerrit history in {T323750}. As there is already a JS plugin for this tool, is it possible to us...
[08:14:48] (open) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[08:30:41] FIRING: PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=eqiad%20prometheus%2Fcloud - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[08:40:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[08:51:29] (update) sstefanova: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674)
[08:55:20] (update) sstefanova: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674)
[08:55:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:00:11] Tool-schedule-deployment: Leave a comment on the Gerrit change when it is scheduled for a backport - https://phabricator.wikimedia.org/T366763#9866607 (kostajh) >>! In T366763#9866370, @kostajh wrote: > There was concern about adding too many comments in Gerrit history in {T323750}. As there is already a JS...
[09:05:41] FIRING: [4x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:09:48] FIRING: PuppetDisabled: Puppet disabled on cloudidm2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=misc&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[09:09:53] cloud-services-team: PuppetDisabled Puppet disabled on cloudidm2001-dev:9100 - https://phabricator.wikimedia.org/T366779 (phaultfinder) NEW
[09:16:37] cloud-services-team: PuppetDisabled Puppet disabled on cloudidm2001-dev:9100 - https://phabricator.wikimedia.org/T366779#9866718 (SLyngshede-WMF) p:Triage→Low a:SLyngshede-WMF
[09:26:06] cloud-services-team: PuppetDisabled Puppet disabled on cloudidm2001-dev:9100 - https://phabricator.wikimedia.org/T366779#9866737 (SLyngshede-WMF) Open→Resolved
[09:29:45] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[09:29:48] RESOLVED: PuppetDisabled: Puppet disabled on cloudidm2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=misc&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[09:30:41] RESOLVED: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:34:34] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[09:37:18] (update) aborrero: cli: initialize maintain_kubeusers_run_finished prometheus metrics [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/35 (https://phabricator.wikimedia.org/T366598)
[09:37:57] (close) aborrero: cli: initialize maintain_kubeusers_run_finished prometheus metrics [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/35 (https://phabricator.wikimedia.org/T366598)
[09:40:35] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[09:42:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:43:28] (open) aborrero: homedir: reduce filesystem checks and secure skel copy logic [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/37 (https://phabricator.wikimedia.org/T366564)
[09:43:43] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[09:48:34] (approved) dcaro: homedir: reduce filesystem checks and secure skel copy logic [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/37 (https://phabricator.wikimedia.org/T366564) (owner: aborrero)
[09:48:38] (update) dcaro: homedir: reduce filesystem checks and secure skel copy logic [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/37 (https://phabricator.wikimedia.org/T366564) (owner: aborrero)
[09:50:10] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[09:55:25] (CR) David Caro: [C:+2] ceph.drain_osd_node: improve logs [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/990976 (owner: David Caro)
[09:57:05] (CR) David Caro: "It will complain that the node is not in the cluster later, but I can do a 'pre' check so no nodes are drained if you pass a wrong node." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/990977 (owner: David Caro)
[09:58:23] (Merged) jenkins-bot: ceph.drain_osd_node: improve logs [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/990976 (owner: David Caro)
[09:59:18] (merge) aborrero: homedir: reduce filesystem checks and secure skel copy logic [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/37 (https://phabricator.wikimedia.org/T366564)
[10:01:12] (update) project_1317_bot_df3177307bed93c3f34e421e26c86e38: maintain-kubeusers: bump to 0.0.144-20240606095929-cf148997 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/313 (https://phabricator.wikimedia.org/T366564)
[10:01:16] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: maintain-kubeusers: bump to 0.0.144-20240606095929-cf148997 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/313 (https://phabricator.wikimedia.org/T366564)
[10:01:55] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[10:02:05] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[10:02:45] Wikibugs: Make wikibugs Gerrit and GitLab streams be more consistent - https://phabricator.wikimedia.org/T366785 (taavi) NEW
[10:03:02] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:03:56] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:05:54] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[10:06:07] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[10:07:21] (merge) aborrero: maintain-kubeusers: bump to 0.0.144-20240606095929-cf148997 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/313 (https://phabricator.wikimedia.org/T366564) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[10:08:21] FIRING: MaintainKubeusersHang: maintain-kubeusers last finished run is 28.63M minutes old - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersHang
[10:09:01] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:09:56] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:13:21] RESOLVED: MaintainKubeusersHang: maintain-kubeusers last finished run is 28.63M minutes old - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersHang
[10:18:19] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:30:08] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:32:49] (open) aborrero: homedir: remove needs_create() filesystem check after all accounts have state [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/38 (https://phabricator.wikimedia.org/T366564)
[10:36:51] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:37:06] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[10:40:03] Tool-openstack-browser: openstack-browser: Support projects in non-default domains - https://phabricator.wikimedia.org/T366787 (taavi) NEW
[10:40:13] Tool-openstack-browser: openstack-browser support for heat/magnum - https://phabricator.wikimedia.org/T325465#9866946 (taavi)
[10:40:14] Tool-openstack-browser: openstack-browser: Support projects in non-default domains - https://phabricator.wikimedia.org/T366787#9866947 (taavi)
[12:03:16] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[12:09:10] PAWS: paws not connecting to any reconciliation services - https://phabricator.wikimedia.org/T363917#9867127 (rook) https://github.com/OpenRefine/CommonsExtension/issues/101 is closed and seems to fix the issue. When a new release is made we can try this with an updated plugin.
[12:12:08] (update) sstefanova: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674)
[12:13:21] (update) sstefanova: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674)
[12:13:32] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[12:15:11] (update) sstefanova: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674)
[12:15:39] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[12:17:09] (update) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[12:27:13] (approved) dcaro: homedir: remove needs_create() filesystem check after all accounts have state [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/38 (https://phabricator.wikimedia.org/T366564) (owner: aborrero)
[12:27:13] (update) dcaro: homedir: remove needs_create() filesystem check after all accounts have state [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/38 (https://phabricator.wikimedia.org/T366564) (owner: aborrero)
[12:31:38] (merge) aborrero: homedir: remove needs_create() filesystem check after all accounts have state [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/38 (https://phabricator.wikimedia.org/T366564)
[12:33:30] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: maintain-kubeusers: bump to 0.0.145-20240606123146-6710dc2f [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/314 (https://phabricator.wikimedia.org/T366564)
[12:39:51] Tool-schedule-deployment: Did not update deployment calendar - https://phabricator.wikimedia.org/T366794 (kostajh) NEW
[12:40:39] (open) dcaro: funcitonal_tests.builds: add file cleanup [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/315
[12:41:46] (approved) aborrero: funcitonal_tests.builds: add file cleanup [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/315 (owner: dcaro)
[12:45:17] (update) aborrero: components: add kyverno [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/238 (https://phabricator.wikimedia.org/T279110)
[12:45:41] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[12:45:52] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[12:46:29] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[12:46:41] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[12:48:10] (merge) aborrero: maintain-kubeusers: bump to 0.0.145-20240606123146-6710dc2f [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/314 (https://phabricator.wikimedia.org/T366564) (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:55:21] (update) dcaro: funcitonal_tests.builds: add file cleanup [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/315
[12:55:43] (merge) dcaro: funcitonal_tests.builds: add file cleanup [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/315
[12:56:00] (update) aborrero: components: add kyverno [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/238 (https://phabricator.wikimedia.org/T279110)
[12:56:32] (update) sstefanova: Draft: openapi: refactor yaml [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/96 (https://phabricator.wikimedia.org/T366668)
[12:57:42] (update) sstefanova: Draft: openapi: refactor yaml [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/96 (https://phabricator.wikimedia.org/T366668)
[12:57:55] (update) sstefanova: openapi: refactor yaml [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/96 (https://phabricator.wikimedia.org/T366668)
[13:01:56] (open) aborrero: cli: fix interval datatype [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/39
[13:02:59] (open) aborrero: maintain-kubeusers: raise interval values for local and toolsbeta [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/316
[13:15:11] (approved) dcaro: maintain-kubeusers: raise interval values for local and toolsbeta [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/316 (owner: aborrero)
[13:15:11] (update) dcaro: maintain-kubeusers: raise interval values for local and toolsbeta [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/316 (owner: aborrero)
[13:42:57] FIRING: [3x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:43:01] (open) sstefanova: openapi: refactor yaml [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/32
[14:02:32] (update) dcaro: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674) (owner: sstefanova)
[14:02:33] (approved) dcaro: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674) (owner: sstefanova)
[14:06:42] (update) aborrero: cli: fix interval datatype [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/39
[14:06:53] (merge) sstefanova: utils: update deb build and bump setup [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/73 (https://phabricator.wikimedia.org/T366674)
[14:10:47] (merge) aborrero: cli: fix interval datatype [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/39
[14:12:46] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: maintain-kubeusers: bump to 0.0.146-20240606141058-98931387 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/317
[14:13:09] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[14:13:20] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[14:13:42] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[14:13:53] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[14:14:28] (merge) aborrero: maintain-kubeusers: bump to 0.0.146-20240606141058-98931387 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/317 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[14:16:13] (update) aborrero: maintain-kubeusers: raise interval values for local and toolsbeta [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/316
[14:16:21] FIRING: MaintainKubeusersHang: maintain-kubeusers last finished run is 28.63M minutes old - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersHang
[14:16:25] !log aborrero@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[14:16:35] !log aborrero@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[14:20:15] (approved) dcaro: openapi: refactor yaml [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/32 (owner: sstefanova)
[14:20:16] (update) dcaro: openapi: refactor yaml [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/32 (owner: sstefanova)
[14:21:09] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers
[14:21:18] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers
[14:22:06] (merge) aborrero: maintain-kubeusers: raise interval values for local and toolsbeta [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/316
[14:23:42] (merge) aborrero: maintain-kubeusers: MaintainKubeusersHang: adjust alert 'for' value [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/14 (https://phabricator.wikimedia.org/T366598)
[14:26:17] (merge) sstefanova: openapi: refactor yaml [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/32
[14:26:51] (update) dcaro: openapi: refactor yaml [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/96 (https://phabricator.wikimedia.org/T366668) (owner: sstefanova)
[14:26:52] (approved) dcaro: openapi: refactor yaml [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/96 (https://phabricator.wikimedia.org/T366668) (owner: sstefanova)
[14:27:34] cloud-services-team, Toolforge (Toolforge iteration 11), Patch-For-Review: toolforge: new maintain-kubeusers takes long time to loop over all the accounts to reconcile them - https://phabricator.wikimedia.org/T366564#9867578 (aborrero)
[14:28:15] Cloud-VPS (Quota-requests), Content-Transform-Team-WIP: Increase storage for parsoid visualdiff testing - https://phabricator.wikimedia.org/T365733#9867579 (Jgiannelos) @dcaro not that I know of.
[14:29:57] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: envvars-api: bump to 0.0.47-20240606142628-c69cfbb6 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/318
[14:31:22] cloud-services-team, Toolforge (Toolforge iteration 11), Patch-For-Review: toolforge: new maintain-kubeusers takes long time to loop over all the accounts to reconcile them - https://phabricator.wikimedia.org/T366564#9867589 (aborrero) In progress→Resolved last noop loop took 1.51 mins (a...
[14:34:49] cloud-services-team, Toolforge: toolforge: track PSP migration plan - https://phabricator.wikimedia.org/T364297#9867603 (aborrero)
[14:47:42] FIRING: [3x] CloudVPSDesignateLeaks: Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:52:42] RESOLVED: [3x] CloudVPSDesignateLeaks: Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:55:17] Cloud Services Proposals: Decision request - kubernetes upgrade workgroup - https://phabricator.wikimedia.org/T363683#9867695 (Raymond_Ndibe)
[15:07:14] (update) aborrero: components: add kyverno [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/238 (https://phabricator.wikimedia.org/T279110)
[15:14:32] (update) raymond-ndibe: [jobs-api] move simple job validations to pydantic [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/89 (https://phabricator.wikimedia.org/T366209)
[15:29:55] (update) raymond-ndibe: [jobs-api] move simple job validations to pydantic [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/89 (https://phabricator.wikimedia.org/T366209)
[15:30:57] (update) raymond-ndibe: [jobs-api] move simple job validations to pydantic [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/89 (https://phabricator.wikimedia.org/T366209)
[15:38:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:45:24] (update) raymond-ndibe: [jobs-api] move simple job validations to pydantic [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/89 (https://phabricator.wikimedia.org/T366209)
[15:48:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:59:06] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:00:37] (update) raymond-ndibe: [jobs-api] move simple job validations to pydantic [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/89 (https://phabricator.wikimedia.org/T366209)
[16:04:06] RESOLVED: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_redirects_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:45:32] (CR) Andrew Bogott: [C:+1] "sure, why not?" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1039238 (owner: Majavah)
[16:54:42] (CR) Majavah: [C:+2] vps: create_project: Allow using dashes again [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1039238 (owner: Majavah)
[16:57:33] (Merged) jenkins-bot: vps: create_project: Allow using dashes again [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1039238 (owner: Majavah)
[17:08:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:10:00] Tool-bridgebot: Bridge IRC to Matrix - https://phabricator.wikimedia.org/T366767#9868576 (bd808) Per https://github.com/42wim/matterbridge/wiki/Section-Matrix-%28basic%29#example-with-pantalaimon it looks like this will also need to add https://github.com/matrix-org/pantalaimon to the Bridgebot stack to impl...
[18:10:29] Tool-bridgebot: Bridge #wikimedia-rust on libera.chat and #wikimedia-rust:matrix.org - https://phabricator.wikimedia.org/T366767#9868578 (bd808)
[18:15:00] Tool-bridgebot: Bridge #wikimedia-rust on libera.chat and #wikimedia-rust:matrix.org - https://phabricator.wikimedia.org/T366767#9868594 (bd808) Bridgebot has an existing Matrix account from the {T337136} experiment.
[18:49:11] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9868739 (wiki_willy) Ok, got it. Thanks for the info @dcaro. And just to confirm, cloudcephosd1001-1020 have the... [18:51:26] Tool-schedule-deployment: Leave a comment on the Gerrit change when it is scheduled for a backport - https://phabricator.wikimedia.org/T366763#9868743 (thcipriani) >>! In T366763#9866607, @kostajh wrote: > Instead of "06:00 SF" could we use "UTC morning backport window", "UTC afternoon backport window" and "... [18:59:55] Toolforge (Toolforge iteration 11): Toolforge Aptfile not producing working copy of `ffmpeg` - https://phabricator.wikimedia.org/T365633#9868776 (derenrich) So the urgency of this ticket is low as we probably won't be needing this for a couple months as this work is not being prioritized. Though if it's a de... [19:00:22] FIRING: HAProxyBackendUnavailable: HAProxy service keystone-public-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [19:05:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service keystone-public-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [19:47:58] wikitech.wikimedia.org, Parsoid, Parsoid-Read-Views: Parsoid rendering error for the incidents template on Wikitech - https://phabricator.wikimedia.org/T366842 (Quiddity) NEW [19:49:10] wikitech.wikimedia.org, Parsoid, Parsoid-Read-Views: Parsoid rendering error for the incidents template on Wikitech - https://phabricator.wikimedia.org/T366842#9868957 (Quiddity) [20:06:12] Tool-schedule-deployment: Did not update deployment calendar 
- https://phabricator.wikimedia.org/T366794#9869005 (bd808) I have just recreated this same failure when trying to add a patch to a deployment that was about to start, but then was successful in adding the same patch to the last backport window on... [20:30:46] Tool-schedule-deployment: Did not update deployment calendar - https://phabricator.wikimedia.org/T366794#9869134 (bd808) >>! In T366794#9869005, @bd808 wrote: > I also noticed that the tool is showing me deployment windows from today that have already passed their start time. This is not expected behavior ei... [20:53:56] Tool-schedule-deployment: Did not update deployment calendar - https://phabricator.wikimedia.org/T366794#9869231 (bd808) I found the cause of @kostajh's update failure in the logs: ` 2024-06-06T12:36:24Z deployments.mediawiki DEBUG: Before: {{Deployment calendar event card |when=2024-06-06 06:00 SF |... [21:25:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-10 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [22:17:21] (open) bd808: Guard against various edit failures [toolforge-repos/schedule-deployment] - https://gitlab.wikimedia.org/toolforge-repos/schedule-deployment/-/merge_requests/6 (https://phabricator.wikimedia.org/T366794) [22:19:51] (merge) bd808: Guard against various edit failures [toolforge-repos/schedule-deployment] - https://gitlab.wikimedia.org/toolforge-repos/schedule-deployment/-/merge_requests/6 (https://phabricator.wikimedia.org/T366794) [22:32:10] Tool-schedule-deployment: Did not update deployment calendar - https://phabricator.wikimedia.org/T366794#9869446 (bd808) In progress→Resolved The fix seems to work. 
https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=2189140&oldid=2189138 [22:35:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-10 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [23:19:22] wikitech.wikimedia.org, Patch-For-Review, Technical-Debt: Update Phabricator BlockIpComplete hook to use "user.edit" Conduit API - https://phabricator.wikimedia.org/T366587#9869583 (bd808) In progress→Resolved