[00:02:39] Tools, Gerrit, Wikimedia-Hackathon-2024: Gerrit reviewer bot should add reviewers as CC instead of actual reviewers - https://phabricator.wikimedia.org/T363290#9743154 (matmarex) If you really wanted to know for sure, I suppose you'd have to just ask everyone. There are 155 people listed (hmm, more t... [00:28:46] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [01:33:29] Toolforge, Tools, Data-Engineering, EventStreams, and 2 others: Frequent `429 Client Error: Too Many Requests for url: https://stream.wikimedia.org/v2/stream/recentchange` errors in SULWatcher - https://phabricator.wikimedia.org/T329327#9743279 (Ottomata) Or, could we just avoid rate limiting Clo... [01:36:35] Toolforge, Tools, Data-Engineering, EventStreams, and 2 others: Frequent `429 Client Error: Too Many Requests for url: https://stream.wikimedia.org/v2/stream/recentchange` errors in SULWatcher - https://phabricator.wikimedia.org/T329327#9743280 (Ottomata) Oh, another piece of info: WMF traffic fr... [02:14:54] cloud-services-team, wikitech.wikimedia.org, Epic, Security: sustainability of wikitech.wikimedia.org - https://phabricator.wikimedia.org/T363125#9743282 (Bugreporter) > MediaWiki users get less bogged down with Wikimedia-specific detail. Alternatively they can be moved to a namespace in Meta-W... 
[02:21:41] (CloudVPSDesignateLeaks) firing: (3) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [02:26:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [03:17:00] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [03:22:42] Tool-bridgebot: Replace custom deployment with build service and job service - https://phabricator.wikimedia.org/T363028#9743337 (bd808) The code and config are ready to try switching everything over. I don't want to do this in my evening however due to the possibility of exciting new failure modes cropping... [03:37:16] Tool-bridgebot, Patch-For-Review, Upstream: Bridgebot freaks out and sends double messages from IRC to Telegram - https://phabricator.wikimedia.org/T305487#9743353 (CodeReviewBot) bd808 opened https://gitlab.wikimedia.org/toolforge-repos/bridgebot-matterbridge/-/merge_requests/1 [hack] Patch irc han... [04:33:46] (OpenstackAPIResponse) firing: Openstack API average response time is too high. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [04:41:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [04:42:22] (HAProxyBackendUnavailable) firing: HAProxy service nova-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [04:56:41] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [05:39:03] cloud-services-team, Data-Services, Data-Engineering, Privacy Engineering: Increased visibility in wiki-replicas for volunteers fighting vandals - https://phabricator.wikimedia.org/T284944#9743588 (odimitrijevic) a:odimitrijevic→lbowmaker [07:10:58] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T359049) [07:10:59] wm-bot2196: Unknown project "dcaro@urcuchillay" [07:11:00] T359049: hw troubleshooting: /dev/sdg disk not working properly in cloudcephosd1017.eqiad.wmnet - https://phabricator.wikimedia.org/T359049 [07:14:43] cloud-services-team, Cloud-VPS, DC-Ops, ops-eqiad, SRE: hw troubleshooting: /dev/sdg disk not working properly in cloudcephosd1017.eqiad.wmnet - https://phabricator.wikimedia.org/T359049#9743683 (dcaro) Open→Resolved The drive is back online and in the cluster 👍 
[07:17:00] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [08:13:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [08:23:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [08:32:26] Cloud Services Proposals, cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233#9743851 (aborrero) scheduled discussion meeting for 2024-04-30. [08:34:28] Cloud Services Proposals, cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233#9743852 (aborrero) Open→In progress p:Triage→Medium [08:38:46] (OpenstackAPIResponse) firing: Openstack API average response time is too high. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [08:42:22] (HAProxyBackendUnavailable) firing: HAProxy service nova-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [08:55:13] cloud-services-team, Toolforge: toolforge: explore options to introduce egress network quotas - https://phabricator.wikimedia.org/T363296#9743894 (aborrero) in your opinion, should we decline this task and focus on the other angle you mention? [09:08:04] VPS-project-Codesearch, Phabricator: Consider adding a way to query Codesearch from Phabricator - https://phabricator.wikimedia.org/T183608#9743932 (Aklapper) Stalled→Declined > maybe we should include results from there into Phabricator's quick search menu? I'd prefer not to maintain a down... [09:43:55] cloud-services-team: Striker/Horizon are running with a non-existing user - https://phabricator.wikimedia.org/T363452 (MoritzMuehlenhoff) NEW [09:46:27] cloud-services-team, Toolforge, Patch-For-Review: toolforge lima-kilo: PodSecurityPolicy admission is disabled - https://phabricator.wikimedia.org/T363347#9744060 (CodeReviewBot) aborrero opened https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/125 kind: cleanup kubeadm c... [09:48:18] Toolforge: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T363132#9744087 (taavi) Open→Resolved [09:50:35] cloud-services-team, Toolforge, Patch-For-Review: toolforge lima-kilo: PodSecurityPolicy admission is disabled - https://phabricator.wikimedia.org/T363347#9744109 (aborrero) The problem was we were using a deprecated apiVersion field in the embedded kubeadm configuration. Discovered by staring at th... 
[10:02:27] cloud-services-team, Toolforge, Patch-For-Review: toolforge lima-kilo: PodSecurityPolicy admission is disabled - https://phabricator.wikimedia.org/T363347#9744156 (CodeReviewBot) aborrero merged https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/125 kind: cleanup kubeadm c... [10:27:26] cloud-services-team, Toolforge: lima-kilo: replicate sssd setup from Toolforge - https://phabricator.wikimedia.org/T362966#9744208 (CodeReviewBot) aborrero merged https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/119 basic_system: install sssd [10:42:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [10:47:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [10:52:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [10:57:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [11:02:06] cloud-services-team, Horizon, Striker: Striker/Horizon are running with a non-existing user 
- https://phabricator.wikimedia.org/T363452#9744332 (taavi) [11:03:11] cloud-services-team, Horizon, Striker: Striker/Horizon are running with a non-existing user - https://phabricator.wikimedia.org/T363452#9744348 (taavi) Those are running in containers, and the user does exist in the container: `lang=shell-session taavi@cloudweb1004 ~ $ sudo docker exec -it striker.se... [11:17:01] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [12:02:22] (HAProxyBackendUnavailable) resolved: HAProxy service nova-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [12:12:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [12:22:41] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [12:31:09] cloud-services-team, Toolforge: toolforge lima-kilo: PodSecurityPolicy admission is disabled - https://phabricator.wikimedia.org/T363347#9744515 (aborrero) In progress→Resolved [12:43:46] (OpenstackAPIResponse) firing: Openstack API average response time is too high. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [12:55:23] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api [12:55:34] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api [12:56:07] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Patch-For-Review: Deploy OVS test setup in codfw1dev - https://phabricator.wikimedia.org/T358761#9744645 (taavi) I've been looking at this error recently: ` Apr 25 12:51:08 cloudvirt2001-dev nova-compute[2572868]: 2024-04-25 12:51:08.870 2572868 ERR... [12:57:02] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api [12:57:14] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api [13:07:09] Toolforge: [components-api] Get a skeleton of API webservice and implement `/tool//deploy` with build-only features - https://phabricator.wikimedia.org/T362069#9744675 (dcaro) [13:07:11] Toolforge: [jobs-api,builds-api,envvars-api,api-gateway] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363346#9744676 (dcaro) [13:07:30] (PS1) AntiCompositeNumber: StewardBot: fix typo in heartbeat() [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1024397 [13:08:32] (CR) AntiCompositeNumber: [C:+2] StewardBot: fix typo in heartbeat() [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1024397 (owner: AntiCompositeNumber) [13:09:27] (Merged) jenkins-bot: StewardBot: fix typo in heartbeat() [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1024397 (owner: AntiCompositeNumber) [14:26:53] VPS-project-Wikistats: Add mywikisource to wikistats - 
https://phabricator.wikimedia.org/T363274#9744898 (Dzahn) Open→Resolved ` MariaDB [wikistats]> insert into wikisources (prefix, lang, loclang, loclanglink, method) select prefix, lang, loclang, loclanglink, method from wikipedias where prefix... [14:27:15] VPS-project-Wikistats: Add mswikisource to wikistats - https://phabricator.wikimedia.org/T363253#9744901 (Dzahn) Open→Resolved ` MariaDB [wikistats]> insert into wikisources (prefix, lang, loclang, loclanglink, method) select prefix, lang, loclang, loclanglink, method from wikipedias where prefix... [14:27:50] VPS-project-Wikistats: Add kawikisource to wikistats - https://phabricator.wikimedia.org/T363247#9744916 (Dzahn) Open→Resolved ` MariaDB [wikistats]> insert into wikisources (prefix, lang, loclang, loclanglink, method) select prefix, lang, loclang, loclanglink, method from wikipedias where prefix... [14:38:51] cloud-services-team, Cloud-VPS, Toolforge: Taavi knowledge transfer: Toolforge misc services (e.g. mail server) - https://phabricator.wikimedia.org/T362447#9744935 (Andrew) The toolforge exim server is using an experimental feature to support forwarding to gmail. That build is here: https://gitlab.wi... 
[14:43:04] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:44:57] cloud-services-team, Cloud-VPS, Toolforge: Taavi knowledge transfer: python-flask-keystone, novaproxy, enc api - https://phabricator.wikimedia.org/T362449#9744952 (Andrew) [14:45:25] cloud-services-team, Cloud-VPS, Toolforge: Taavi knowledge transfer: python-flask-keystone, novaproxy, enc api - https://phabricator.wikimedia.org/T362449#9744954 (Andrew) From a meeting about these services today: novaproxy api + enc api + Added keystone auth + Less horizon integration, mostly mana... [14:46:08] cloud-services-team, Cloud-VPS, Toolforge: Taavi knowledge transfer: python-flask-keystone, novaproxy, enc api - https://phabricator.wikimedia.org/T362449#9744968 (Andrew) I (Andrew) am accepting this task to investigate deprecation warnings in these services and (probably) take over maintenance of p... 
[14:47:41] (CloudVPSDesignateLeaks) firing: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:52:41] (CloudVPSDesignateLeaks) firing: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:57:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:58:41] cloud-services-team, Toolforge: lima-kilo: replicate sssd setup from Toolforge - https://phabricator.wikimedia.org/T362966#9744998 (aborrero) In progress→Resolved [15:01:38] cloud-services-team, Toolforge: toolforge lima-kilo: refresh maintain-kubeusers test data - https://phabricator.wikimedia.org/T363482 (aborrero) NEW [15:17:01] (OpenstackAPIResponse) firing: Openstack API average response time is too high. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [15:17:10] Toolforge (Quota-requests): Request increased quota for uploadmap Toolforge tool - https://phabricator.wikimedia.org/T362975#9745079 (aborrero) a:aborrero [15:18:37] Toolforge (Quota-requests): Request increased quota for uploadmap Toolforge tool - https://phabricator.wikimedia.org/T362975#9745081 (aborrero) [15:18:38] cloud-services-team, Toolforge: Toolforge: consider introducing a command line for creating reverse proxies - https://phabricator.wikimedia.org/T337191#9745082 (aborrero) [15:20:00] Toolforge (Quota-requests): Request increased quota for uploadmap Toolforge tool - https://phabricator.wikimedia.org/T362975#9745086 (dcaro) +1 Note that that method is currently not supported: ` Any objects manually created in Kubernetes (as opposed to using toolforge clients) are not officially supported... [15:21:55] Toolforge (Quota-requests): Request increased quota for uploadmap Toolforge tool - https://phabricator.wikimedia.org/T362975#9745107 (dcaro) An example of doing so with the php buildpack can be found here: T337191#8879427 [15:22:21] (PS1) Muehlenhoff: Remove obsolete dummy certs [labs/private] - https://gerrit.wikimedia.org/r/1024421 (https://phabricator.wikimedia.org/T360439) [15:44:33] (PS1) Andrew Bogott: Regenerate strings, again [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1024422 [15:44:38] Toolforge (Quota-requests), Patch-For-Review: Request increased quota for uploadmap Toolforge tool - https://phabricator.wikimedia.org/T362975#9745234 (CodeReviewBot) aborrero opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/267 maintain-kubeusers: bump service... 
[15:45:29] (CR) Andrew Bogott: [V:+2 C:+2] Regenerate strings, again [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1024422 (owner: Andrew Bogott) [16:00:12] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudweb.set_maintenance (T356287) [16:00:18] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [16:00:52] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudweb.set_maintenance (exit_code=99) (T356287) [16:02:24] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1005.eqiad.wmnet' (T356287) [16:14:56] (SystemdUnitDown) firing: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [16:24:56] (SystemdUnitDown) firing: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [16:33:10] (GaleraClusterSizeMismatch) firing: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch [16:33:22] (HAProxyBackendUnavailable) firing: (14) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [16:34:43] (CR) Jforrester: "I think we should revert this to 0.0.12, as for now the new breaking change about globals is unskippable." [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/1023097 (https://phabricator.wikimedia.org/T187672) (owner: Majavah) [16:34:52] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1005.eqiad.wmnet' (T356287) [16:34:56] (SystemdUnitDown) resolved: (2) The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1005. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [16:35:23] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [16:37:03] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1006.eqiad.wmnet' (T356287) [16:38:10] (GaleraClusterSizeMismatch) resolved: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch [16:38:22] (HAProxyBackendUnavailable) firing: (14) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [16:43:22] (HAProxyBackendUnavailable) resolved: (14) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [16:48:09] Cloud-VPS (Debian Buster Deprecation), collaboration-services, Patch-For-Review: replace buster machines in devtools project - https://phabricator.wikimedia.org/T360964#9745491 (Dzahn) new bullseye deployment server runs into known issue T257317 with scap init again [16:48:46] (OpenstackAPIResponse) firing: Openstack API average response time is too high. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [16:49:56] (SystemdUnitDown) firing: (2) The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1005. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [16:50:40] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch [16:52:07] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1006.eqiad.wmnet' (T356287) [16:52:13] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [16:52:56] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1007.eqiad.wmnet' (T356287) [16:56:01] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudservices1005.eqiad.wmnet' (T356287) [16:59:56] (SystemdUnitDown) resolved: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1006. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [17:03:21] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudservices1005.eqiad.wmnet' (T356287) [17:03:27] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [17:03:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudservices1006.eqiad.wmnet' (T356287) [17:07:04] Openstack-Magnum: Deploy k8s greater than 1.23 - https://phabricator.wikimedia.org/T363504 (rook) NEW [17:07:40] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch [17:09:52] (HAProxyBackendUnavailable) firing: (28) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [17:10:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudservices1006.eqiad.wmnet' (T356287) [17:11:00] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [17:11:26] (SystemdUnitDown) firing: (2) The service unit designate-producer.service is in failed status on host cloudservices1005. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudservices1005 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [17:12:33] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1007.eqiad.wmnet' (T356287) [17:12:40] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch [17:13:15] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudnet1005.eqiad.wmnet' (T356287) [17:14:52] (HAProxyBackendUnavailable) firing: (19) HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [17:35:29] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1033.eqiad.wmnet' (T356287) [17:35:35] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [17:36:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1035.eqiad.wmnet' (T356287) [17:38:44] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) [17:41:30] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1035.eqiad.wmnet' (T356287) [17:41:31] !log andrew@cloudcumin1001 admin START - Cookbook 
wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1037.eqiad.wmnet' (T356287) [17:41:35] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [17:46:25] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1037.eqiad.wmnet' (T356287) [17:46:26] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1039.eqiad.wmnet' (T356287) [17:51:31] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1039.eqiad.wmnet' (T356287) [17:51:32] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1034.eqiad.wmnet' (T356287) [17:51:36] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [17:52:49] (NeutronAgentDown) resolved: (4) Neutron neutron-linuxbridge-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown [17:56:18] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1034.eqiad.wmnet' (T356287) [17:56:19] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1036.eqiad.wmnet' (T356287) [18:00:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1036.eqiad.wmnet' (T356287) [18:00:56] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1040.eqiad.wmnet' (T356287) [18:01:00] T356287: Upgrade cloud-vps 
openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [18:05:46] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1040.eqiad.wmnet' (T356287) [18:05:48] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1045.eqiad.wmnet' (T356287) [18:10:44] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1045.eqiad.wmnet' (T356287) [18:10:45] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1043.eqiad.wmnet' (T356287) [18:10:49] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [18:11:26] (SystemdUnitDown) resolved: (2) The service unit designate-producer.service is in failed status on host cloudservices1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudservices1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [18:13:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [18:15:22] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1043.eqiad.wmnet' (T356287) [18:15:23] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1046.eqiad.wmnet' (T356287) [18:20:40] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on 
host 'cloudvirt1046.eqiad.wmnet' (T356287) [18:20:41] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1041.eqiad.wmnet' (T356287) [18:20:46] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [18:23:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [18:25:17] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1041.eqiad.wmnet' (T356287) [18:25:18] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1044.eqiad.wmnet' (T356287) [18:30:21] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1044.eqiad.wmnet' (T356287) [18:30:22] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1042.eqiad.wmnet' (T356287) [18:30:26] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [18:35:04] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1042.eqiad.wmnet' (T356287) [18:35:06] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1038.eqiad.wmnet' (T356287) [18:39:49] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1038.eqiad.wmnet' (T356287) [18:39:50] !log andrew@cloudcumin1001 admin START - Cookbook 
wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1047.eqiad.wmnet' (T356287) [18:39:54] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [18:44:47] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1047.eqiad.wmnet' (T356287) [18:44:48] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt-wdqs1001.eqiad.wmnet' (T356287) [18:44:58] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [18:49:46] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt-wdqs1001.eqiad.wmnet' (T356287) [18:49:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt-wdqs1002.eqiad.wmnet' (T356287) [18:50:57] cloud-services-team, Toolforge: toolforge: explore options to introduce egress network quotas - https://phabricator.wikimedia.org/T363296#9746015 (bd808) >>! In T363296#9743894, @aborrero wrote: > in your opinion, should we decline this task and focus on the other angle you mention? I don't want to veto... 
[18:55:07] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt-wdqs1002.eqiad.wmnet' (T356287) [18:55:08] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt-wdqs1003.eqiad.wmnet' (T356287) [18:55:13] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:00:16] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt-wdqs1003.eqiad.wmnet' (T356287) [19:00:17] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1048.eqiad.wmnet' (T356287) [19:00:22] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:05:06] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1048.eqiad.wmnet' (T356287) [19:05:07] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1052.eqiad.wmnet' (T356287) [19:06:08] Openstack-Magnum: Deploy k8s greater than 1.23 - https://phabricator.wikimedia.org/T363504#9746088 (rook) Setting the labels as follows ` labels = { #kube_tag = "v1.26.15-rancher1-linux-amd64" kube_tag = "v1.26.8-rancher1" flannel_tag = "v0.21.5" cinder_csi_enabled = "true"... 
[19:10:18] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1052.eqiad.wmnet' (T356287) [19:10:19] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1053.eqiad.wmnet' (T356287) [19:10:24] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:15:27] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1053.eqiad.wmnet' (T356287) [19:15:28] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1049.eqiad.wmnet' (T356287) [19:15:34] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:20:41] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1049.eqiad.wmnet' (T356287) [19:20:42] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1050.eqiad.wmnet' (T356287) [19:20:47] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:25:58] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1050.eqiad.wmnet' (T356287) [19:25:59] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1051.eqiad.wmnet' (T356287) [19:26:04] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:30:54] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1051.eqiad.wmnet' (T356287) [19:30:56] !log andrew@cloudcumin1001 admin START - Cookbook 
wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1056.eqiad.wmnet' (T356287) [19:36:23] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1056.eqiad.wmnet' (T356287) [19:36:24] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1057.eqiad.wmnet' (T356287) [19:36:30] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:41:37] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1057.eqiad.wmnet' (T356287) [19:41:38] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1061.eqiad.wmnet' (T356287) [19:41:43] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:44:17] (PS1) VolkerE: Revert "releases: Bump jsdoc-wmf-theme" [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/1024487 [19:46:32] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1061.eqiad.wmnet' (T356287) [19:46:33] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1059.eqiad.wmnet' (T356287) [19:51:03] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1059.eqiad.wmnet' (T356287) [19:51:05] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1058.eqiad.wmnet' (T356287) [19:51:09] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [19:55:45] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook 
wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1058.eqiad.wmnet' (T356287) [19:55:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1060.eqiad.wmnet' (T356287) [19:56:21] Data-Services: Some PetScan queries do not return any results anymore for some days now - https://phabricator.wikimedia.org/T363073#9746251 (M2k_dewiki) Hello, also see * https://github.com/magnusmanske/petscan_rs/issues/164 Thanks a lot! [20:00:26] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1060.eqiad.wmnet' (T356287) [20:00:27] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1055.eqiad.wmnet' (T356287) [20:00:33] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:02:48] (PuppetZeroResources) firing: Puppet has failed generate resources on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources [20:05:36] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1055.eqiad.wmnet' (T356287) [20:05:37] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1054.eqiad.wmnet' (T356287) [20:05:44] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:10:12] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1054.eqiad.wmnet' (T356287) [20:10:13] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 
'cloudvirtlocal1001.eqiad.wmnet' (T356287) [20:12:48] (PuppetZeroResources) firing: (2) Puppet has failed generate resources on cloudcontrol2001-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources [20:15:31] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirtlocal1001.eqiad.wmnet' (T356287) [20:15:32] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirtlocal1002.eqiad.wmnet' (T356287) [20:15:41] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:16:45] (WidespreadPuppetFailure) firing: Puppet has failed on wmcs cluster - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?orgId=1&viewPanel=3&var-cluster=wmcs - https://alerts.wikimedia.org/?q=alertname%3DWidespreadPuppetFailure [20:17:48] (PuppetZeroResources) firing: (3) Puppet has failed generate resources on cloudcontrol2001-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources [20:20:46] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirtlocal1002.eqiad.wmnet' (T356287) [20:20:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirtlocal1003.eqiad.wmnet' (T356287) [20:20:51] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:25:33] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirtlocal1003.eqiad.wmnet' (T356287) [20:25:34] 
!log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1062.eqiad.wmnet' (T356287) [20:27:30] (OpenstackAPIResponse) resolved: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [20:30:53] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1062.eqiad.wmnet' (T356287) [20:30:54] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1065.eqiad.wmnet' (T356287) [20:30:59] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:35:28] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1065.eqiad.wmnet' (T356287) [20:35:29] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1067.eqiad.wmnet' (T356287) [20:40:19] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1067.eqiad.wmnet' (T356287) [20:40:20] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1064.eqiad.wmnet' (T356287) [20:40:25] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:45:27] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1064.eqiad.wmnet' (T356287) [20:45:28] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1066.eqiad.wmnet' (T356287) [20:45:37] T356287: 
Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:50:00] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1066.eqiad.wmnet' (T356287) [20:50:01] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1063.eqiad.wmnet' (T356287) [20:52:14] cloud-services-team, Horizon, Striker: Striker/Horizon are running in Blubber built containers with a runtime UID that does not exist on the host machine - https://phabricator.wikimedia.org/T363452#9746388 (bd808) [20:53:46] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse [20:54:46] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1063.eqiad.wmnet' (T356287) [20:54:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1031.eqiad.wmnet' (T356287) [20:54:51] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [20:59:57] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1031.eqiad.wmnet' (T356287) [20:59:58] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1032.eqiad.wmnet' (T356287) [21:00:04] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [21:05:06] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1032.eqiad.wmnet' 
(T356287) [21:05:11] T356287: Upgrade cloud-vps openstack to version 'Bobcat' - https://phabricator.wikimedia.org/T356287 [21:14:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [21:24:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [21:45:07] Tool-bridgebot: Replace custom deployment with build service and job service - https://phabricator.wikimedia.org/T363028#9746507 (bd808) The main problem in the deploy was that there was [[https://gitlab.wikimedia.org/toolforge-repos/bridgebot/-/commit/f4022bd91be1d2bd09aecb0e8fe3f7b958e94f57|a cut-and-paste... [21:46:49] Tool-bridgebot: Add toml linter for config files - https://phabricator.wikimedia.org/T363529 (bd808) NEW [21:50:46] Tool-bridgebot: Replace custom deployment with build service and job service - https://phabricator.wikimedia.org/T363028#9746558 (bd808) [21:50:48] Tool-bridgebot: Figure out how to deploy ZNC using buildpacks - https://phabricator.wikimedia.org/T353559#9746559 (bd808) [21:51:08] Tool-bridgebot: Figure out how to deploy ZNC using buildpacks - https://phabricator.wikimedia.org/T353559#9746552 (bd808) This work actually got done as part of {T357729}. https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc is now being used by Bridgebot as well. 
[21:51:39] Tool-bridgebot: Figure out how to deploy ZNC using buildpacks - https://phabricator.wikimedia.org/T353559#9746554 (bd808) Open→Resolved a:bd808 [22:05:05] (PS1) Andrew Bogott: Remove duplicate translations of "TOTP authentification" [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1024517 [22:05:44] (CR) Andrew Bogott: [V:+2 C:+2] Remove duplicate translations of "TOTP authentification" [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1024517 (owner: Andrew Bogott) [22:50:16] (PS1) Andrew Bogott: Update .gitreview [openstack/horizon/designate-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1024519 [22:50:16] (PS1) Andrew Bogott: dashboard panels: don't try to include empty .css file [openstack/horizon/designate-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1024520 [22:50:32] (CR) Andrew Bogott: [C:+2] Update .gitreview [openstack/horizon/designate-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1024519 (owner: Andrew Bogott) [22:50:46] (CR) Andrew Bogott: [V:+2 C:+2] dashboard panels: don't try to include empty .css file [openstack/horizon/designate-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1024520 (owner: Andrew Bogott) [22:50:55] (CR) Andrew Bogott: [V:+2 C:+2] Update .gitreview [openstack/horizon/designate-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1024519 (owner: Andrew Bogott) [23:25:12] Tool-bridgebot: Replace custom deployment with build service and job service - https://phabricator.wikimedia.org/T363028#9746792 (bd808) Such doc updates! Much wow! https://wikitech.wikimedia.org/w/index.php?title=Tool%3ABridgebot&diff=2172216&oldid=2169061 [23:29:14] Tool-bridgebot: Replace custom deployment with build service and job service - https://phabricator.wikimedia.org/T363028#9746797 (bd808) In progress→Resolved >>! 
In T363028#9746507, @bd808 wrote: > I will also make a new task about trying to find a linter to validate the toml files to add to CI....