[00:03:55] FIRING: MaxConntrack: Max conntrack at 85.51% on cloudvirt1042:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:48:56] RESOLVED: MaxConntrack: Max conntrack at 85.88% on cloudvirt1042:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:54:46] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[01:16:27] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[01:31:03] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[01:42:17] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[01:51:41] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[02:02:56] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[02:19:43] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[02:58:05] FIRING: NeutronAgentDown: Neutron neutron-linuxbridge-agent on cloudvirt1041 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[03:12:41] FIRING: [2x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:22:41] RESOLVED: [2x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:51:15] FIRING: [3x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[04:42:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[04:52:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:12:41] FIRING: [2x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:17:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:22:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:27:41] RESOLVED: [3x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:58:05] FIRING: NeutronAgentDown: Neutron neutron-linuxbridge-agent on cloudvirt1041 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[07:37:52] cloud-services-team, Cloud-VPS (Project-requests), Fiwiki-Wikidata-Commons: Adding new members to Cloud VPS project fails - https://phabricator.wikimedia.org/T365096 (Zache) NEW
[07:42:41] FIRING: [2x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:47:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:51:15] FIRING: [3x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[07:52:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:56:00] FIRING: [3x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[07:57:41] RESOLVED: [3x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[08:13:17] Grid-Engine-to-K8s-Migration, Tools, All-and-every-Wikisource: Migrate phetools from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319965#9803265 (Cunegonde1) Thank you very much for your help. I just did a test and everything works perfectly. Thanks again for your...
[08:14:50] cloud-services-team, Fiwiki-Wikidata-Commons: Adding new members to Cloud VPS project fails - https://phabricator.wikimedia.org/T365096#9803272 (Peachey88)
[08:18:55] cloud-services-team, Horizon, Fiwiki-Wikidata-Commons: Adding new members to Cloud VPS project fails - https://phabricator.wikimedia.org/T365096#9803276 (taavi)
[08:23:29] cloud-services-team, Horizon, Fiwiki-Wikidata-Commons: Adding new members to Cloud VPS project fails - https://phabricator.wikimedia.org/T365096#9803282 (taavi) @Andrew, this seems related to the recent Horizon Django upgrade: ` [Thu May 16 07:06:34.034090 2024] [wsgi:error] [pid 8:tid 13970661000576...
[08:50:46] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031936 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[08:50:47] Cloud-VPS: Delete 'monitoring' project - https://phabricator.wikimedia.org/T365105 (fgiunchedi) NEW
[08:51:34] (CR) Majavah: [C:+2] neutron: Migrate to openstack router list [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031936 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[08:52:15] Cloud-VPS: Delete 'monitoring' project - https://phabricator.wikimedia.org/T365105#9803452 (taavi) a:taavi
[08:54:26] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031939 (owner: Majavah)
[08:54:40] (Merged) jenkins-bot: neutron: Migrate to openstack router list [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031936 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[08:55:14] Cloud-VPS: Delete 'monitoring' project - https://phabricator.wikimedia.org/T365105#9803470 (taavi) Open→Resolved Done, thank you!
[08:55:38] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031964 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[08:56:00] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031968 (owner: Majavah)
[08:56:12] (CR) Majavah: [C:+2] neutron: Filter agent hosts server side [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031939 (owner: Majavah)
[08:56:16] (CR) Majavah: [C:+2] neutron: Use openstack commands to get agent HA state [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031964 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[08:56:21] (CR) Majavah: [C:+2] openstack: cloudnet: Don't run with proxy [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031968 (owner: Majavah)
[08:56:34] (CR) Majavah: [C:+2] neutron: Filter for agent types server-side [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031969 (owner: Majavah)
[08:56:39] (CR) Majavah: [C:+2] openstack: cloudnet: Remove check for router count [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031970 (owner: Majavah)
[08:59:00] (Merged) jenkins-bot: neutron: Filter agent hosts server side [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031939 (owner: Majavah)
[08:59:03] (Merged) jenkins-bot: neutron: Use openstack commands to get agent HA state [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031964 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[08:59:39] (Merged) jenkins-bot: openstack: cloudnet: Don't run with proxy [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031968 (owner: Majavah)
[08:59:39] (Merged) jenkins-bot: neutron: Filter for agent types server-side [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031969 (owner: Majavah)
[08:59:53] (Merged) jenkins-bot: openstack: cloudnet: Remove check for router count [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031970 (owner: Majavah)
[09:00:34] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031969 (owner: Majavah)
[09:01:15] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031971 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[09:01:31] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031972 (owner: Majavah)
[09:02:25] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031973 (owner: Majavah)
[09:03:49] (CR) Majavah: [C:+2] neutron: Use openstack agent set to enable/disable agents [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031971 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[09:03:53] (CR) Majavah: [C:+2] neutron: Remove unused error classes [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031972 (owner: Majavah)
[09:03:56] (CR) Majavah: [C:+2] neutron: Remove now-unused command running functionality [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031973 (owner: Majavah)
[09:06:48] (Merged) jenkins-bot: neutron: Use openstack agent set to enable/disable agents [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031971 (https://phabricator.wikimedia.org/T365000) (owner: Majavah)
[09:06:55] (Merged) jenkins-bot: neutron: Remove unused error classes [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031972 (owner: Majavah)
[09:06:55] (Merged) jenkins-bot: neutron: Remove now-unused command running functionality [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1031973 (owner: Majavah)
[09:12:41] FIRING: [2x] CloudVPSDesignateLeaks: Detected 28 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:17:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 26 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:33:36] cloud-services-team, Cloud-VPS: replace use of 'neutron' cli in wmcs-cookbooks - https://phabricator.wikimedia.org/T365000#9803645 (taavi) Open→Resolved
[10:04:50] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[10:29:49] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[10:38:10] cloud-services-team (FY2023/2024-Q3-Q4), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team (Radar), MW-1.43-notes (1.43.0-wmf.5; 2024-05-14), Patch-For-Review: Drop gu_salt from globaluser - https://phabricator.wikimedia.org/T364435#9803888 (ops-monitoring-bot) Cookbook cookbooks.sre....
[10:42:15] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[10:45:41] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[10:48:15] cloud-services-team (FY2023/2024-Q3-Q4), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team (Radar), MW-1.43-notes (1.43.0-wmf.5; 2024-05-14), Patch-For-Review: Drop gu_salt from globaluser - https://phabricator.wikimedia.org/T364435#9803960 (ops-monitoring-bot) Cookbook cookbooks.sre....
[10:50:08] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[10:52:49] cloud-services-team (FY2023/2024-Q3-Q4), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team (Radar), MW-1.43-notes (1.43.0-wmf.5; 2024-05-14), Patch-For-Review: Drop gu_salt from globaluser - https://phabricator.wikimedia.org/T364435#9803968 (fnegri) a:fnegri→None > Claiming th...
[10:56:24] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[10:57:22] FIRING: HAProxyBackendUnavailable: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[10:58:05] FIRING: NeutronAgentDown: Neutron neutron-linuxbridge-agent on cloudvirt1041 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[11:00:43] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:02:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:02:32] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T364312)
[11:02:55] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:03:15] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T364312)
[11:03:46] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:10:37] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:25:30] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:39:01] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:54:13] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[11:56:01] FIRING: [3x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[12:16:22] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:23:05] Quarry, Internet-Archive: [bug] Lot of queries stuck in queued state for hours and days (with stop actions leading to HTTP 500) - https://phabricator.wikimedia.org/T365136 (Teslaton) NEW
[12:24:51] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/5
[12:25:21] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:27:47] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:32:22] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:39:04] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:41:14] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:47:17] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:56:35] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[12:58:12] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[13:02:45] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[13:06:46] (CR) Ebrahim: "There was even one file on main page that I had protected in Commons, then it was unprotected by a Commons admin thinking that's what anot" [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1028824 (owner: Ebrahim)
[13:17:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 68 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:23:59] (PS1) Eevans: cassandra: add faux creds for data_gateway role [labs/private] - https://gerrit.wikimedia.org/r/1032485 (https://phabricator.wikimedia.org/T364921)
[13:25:43] (CR) Eevans: [V:+2 C:+2] cassandra: add faux creds for data_gateway role [labs/private] - https://gerrit.wikimedia.org/r/1032485 (https://phabricator.wikimedia.org/T364921) (owner: Eevans)
[14:14:18] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:15:54] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:22:54] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:26:02] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Create new g4 flavors to support hypervisor migration from Linuxbridge to OVS Neutron agents - https://phabricator.wikimedia.org/T364458#9804933 (taavi) Seems like Nova (or Placement) do not support a flavor targeting the lack of an aggregate property v...
[14:26:08] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:33:58] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Create new g4 flavors to support hypervisor migration from Linuxbridge to OVS Neutron agents - https://phabricator.wikimedia.org/T364458#9805001 (taavi) Created new aggregates for testing in codfw1dev: `lang=shell-session taavi@cloudcontrol2004-dev ~ $...
[14:34:07] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:36:57] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:40:18] Data-Services: [wikireplicas] clouddb hosts free memory decreases over time - https://phabricator.wikimedia.org/T365164 (fnegri) NEW
[14:41:01] Data-Services: [wikireplicas] clouddb* free memory decreases over time - https://phabricator.wikimedia.org/T365164#9805077 (fnegri)
[14:41:10] Data-Services: [wikireplicas] clouddb* free memory decreases over time - https://phabricator.wikimedia.org/T365164#9805073 (fnegri) p:Triage→Low
[14:43:25] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:45:25] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:47:52] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[14:57:05] Toolforge (Toolforge iteration 09): [maintain-kubeusers] Increment default services quota - https://phabricator.wikimedia.org/T362520#9805186 (bd808) >>! In T362520#9802568, @Raymond_Ndibe wrote: > I thought the plan is to only allow services for continuous jobs (which web-service will probably become entang...
[14:58:05] FIRING: NeutronAgentDown: Neutron neutron-linuxbridge-agent on cloudvirt1041 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[15:01:53] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[15:04:26] (update) raymond-ndibe: [jobs-api] support services in jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/71 (https://phabricator.wikimedia.org/T348758)
[15:19:25] Quarry: Error 500 when clicking "stop query" - https://phabricator.wikimedia.org/T362213#9805326 (Oudedutchman) The bug could not be reproduced locally on Quarry when running with `docker-compose up`.
[15:34:24] cloud-services-team, Toolforge: Find a modern hostname for tools-static.wmflabs.org - https://phabricator.wikimedia.org/T361435#9805416 (aborrero) Maybe we can use [[ https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS#*.svc.toolforge.org | svc.toolforge.org ]]
[15:55:34] (update) aborrero: Draft: maintain_kubeusers: introduce resource abstraction [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/23 (https://phabricator.wikimedia.org/T279110)
[15:56:16] FIRING: [3x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[16:04:49] Data-Services: [wikireplicas] clouddb* free memory decreases over time - https://phabricator.wikimedia.org/T365164#9805587 (fnegri) Restarting the mariadb@s4.service freed up about 300G of RAM: {F53500549} I've added a note on this procedure to the [alert runbook](https://wikitech.wikimedia.org/wiki/MariaD...
[16:36:01] FIRING: [4x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[16:52:45] wikitech.wikimedia.org: Update operations/wikitech-static git repo from host - https://phabricator.wikimedia.org/T292347#9806071 (Aklapper) a:Reedy→None @Reedy: Removing task assignee as this open task has been assigned for more than two years - see the email sent to all task assignees on 2024-04-15....
[17:03:35] cloud-services-team: NFS-on-ceph: monitoring - https://phabricator.wikimedia.org/T301279#9806211 (Aklapper) a:Andrew→None @Andrew: Removing task assignee as this open task has been assigned for more than two years - see the email sent to all task assignees on 2024-04-15. Please assign this task to yo...
[17:17:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 68 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:21:01] FIRING: [5x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[17:40:58] Wikibugs: Wikibugs' gitlab connector stops working without a strong sign of why - https://phabricator.wikimedia.org/T364490#9806430 (bd808) In progress→Resolved `lang=shell-session $ kubectl get po | grep -E 'NAME|gitlab' NAME READY STATUS RESTARTS AGE gitlab-5c57...
[18:16:40] PAWS: jupyterlab to 4.2.0 - https://phabricator.wikimedia.org/T364327#9806537 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/410
[18:16:44] vivian-rook closed https://github.com/toolforge/paws/pull/410
[18:16:48] PAWS: jupyterlab to 4.2.0 - https://phabricator.wikimedia.org/T364327#9806540 (rook) Stalled→Resolved
[18:56:57] Toolforge (Toolforge iteration 09): [maintain-kubeusers] Increment default services quota - https://phabricator.wikimedia.org/T362520#9806658 (Raymond_Ndibe) Very valid point Bryan. I can't think of a reason too. The most important quotas to put in place imo are hardware resource quotas like ram and cpu. May...
[19:42:57] Toolforge: [components-api] add one-off, scheduled and continuous jobs support to the yaml + api - https://phabricator.wikimedia.org/T362075#9806795 (Raymond_Ndibe) I have a thing against `reuse-from`. It is not immediately clear what it means by just looking at it. `depends-on` is a more descriptive name if...
[19:54:22] Cloud Services Proposals, Infrastructure-Foundations, netops, SRE: Separate WMCS control and management plane traffic - https://phabricator.wikimedia.org/T314847#9806824 (cmooney) Open→Resolved This has been implemented and the new vlan setup is recorded [[ https://wikitech.wikimedia....
[20:08:21] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/5 (owner: l10n-bot)
[20:08:24] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/5 (owner: l10n-bot)
[21:17:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 68 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:21:16] FIRING: [5x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse