[02:17:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:17:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[07:13:02] Tool-yearinreview, New-Engagement-Experiments, Wikimedia-Hackathon-2024: Improve the year in review tool for next year - https://phabricator.wikimedia.org/T362898#9782550 (Jdlrobson) Thanks all for making these great changes to the UI! The blockers for putting this somewhere more discoverable are al...
[07:52:35] !log aborrero@cloudcumin1001 huma START - Cookbook wmcs.vps.create_project for project huma in eqiad1 (T364509)
[07:52:37] aborrero@cloudcumin1001: Unknown project "huma"
[07:52:37] T364509: Request creation of Huma VPS project - https://phabricator.wikimedia.org/T364509
[07:53:11] !log aborrero@cloudcumin1001 huma END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project huma in eqiad1 (T364509)
[07:53:11] aborrero@cloudcumin1001: Unknown project "huma"
[07:54:04] !log aborrero@cloudcumin1001 huma START - Cookbook wmcs.vps.add_user_to_project for user 'ladsgroup' in role 'member' (T364509)
[07:54:05] aborrero@cloudcumin1001: Unknown project "huma"
[07:54:11] !log aborrero@cloudcumin1001 huma END (PASS) - Cookbook wmcs.vps.add_user_to_project (exit_code=0) for user 'ladsgroup' in role 'member' (T364509)
[07:54:11] aborrero@cloudcumin1001: Unknown project "huma"
[07:56:52] cloud-services-team, Cloud-VPS (Project-requests): Request creation of Huma VPS project - https://phabricator.wikimedia.org/T364509#9782577 (aborrero) Open→Resolved a:aborrero done.
[08:50:11] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#9782629 (fnegri) @Marostegui I was thinking if we could take the chance and upgrade to an even more recent version. ToolsDB could potentially run a newer version...
[09:22:26] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#9782670 (Marostegui) @fnegri that is really up to your team - in production such big jumps in releases need to be carefully studied and benchmarked and that's wh...
[10:17:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:28:44] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#9782802 (fnegri) > I'd advise you capture and replay some of the traffic against that 10.11 and test if it is acceptable. This sounds promising. Is there any to...
[10:34:43] Toolforge: `php` not on toolforge? - https://phabricator.wikimedia.org/T364530 (Magnus) NEW
[10:39:07] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#9782852 (Marostegui) You can use a combination of: https://docs.percona.com/percona-toolkit/pt-query-digest.html https://docs.percona.com/percona-toolkit/pt-upgr...
[10:41:07] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Upgrade to MariaDB 10.6 - https://phabricator.wikimedia.org/T352206#9782854 (fnegri) Thanks, I'll give this a go!
[10:46:19] Toolforge: `php` not on toolforge? - https://phabricator.wikimedia.org/T364530#9782892 (Magnus) Also no `php` as user: ` magnus@tools-bastion-13:~$ php -bash: php: command not found `
[10:57:26] Toolforge: `php` not on toolforge? - https://phabricator.wikimedia.org/T364530#9782954 (fnegri) Related: https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.org/thread/UAMLGQ42CVHLRZ5W2CZBJDJFRNSBT4DC/ You can still access the old bastion at login-buster.toolforge.org
[11:04:42] Toolforge: TODO makr php available on new bastion (was: `php` not on toolforge?) - https://phabricator.wikimedia.org/T364530#9782962 (Magnus)
[11:05:02] Toolforge: TODO make php available on new bastion (was: `php` not on toolforge?) - https://phabricator.wikimedia.org/T364530#9782963 (Magnus)
[11:06:20] Toolforge: TODO make php available on new bastion (was: `php` not on toolforge?) - https://phabricator.wikimedia.org/T364530#9782965 (Magnus) Thanks @fnegri I changed the ticket title. Seems like an oversight not to add PHP?
[11:19:57] cloud-services-team, Cloud-VPS (Project-requests): Request creation of Huma VPS project - https://phabricator.wikimedia.org/T364509#9782980 (Ladsgroup) Thank you!!
[12:07:08] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/intuition] - https://gerrit.wikimedia.org/r/1029514 (owner: L10n-bot)
[12:22:42] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1029524 (owner: L10n-bot)
[12:56:16] cloud-services-team, Toolforge: toolforge: introduce some logic to backfill maintain-kubeuser resources (like per-tool kyverno policies) - https://phabricator.wikimedia.org/T364312#9783204 (aborrero) We can have maintain-kubeusers to inject a couple of labels to namespace resources: * `app.kubernetes.io/...
[13:02:08] cloud-services-team, Toolforge: toolforge: introduce some logic to backfill maintain-kubeuser resources (like per-tool kyverno policies) - https://phabricator.wikimedia.org/T364312#9783211 (taavi) The basic idea sounds good to me. Using the Git hash means that all tools will be processed on the first boo...
[13:05:04] (PS1) Elukey: Add fake TLS keystore password for Cassandra clusters [labs/private] - https://gerrit.wikimedia.org/r/1029538 (https://phabricator.wikimedia.org/T352647)
[13:39:51] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Migrate eqiad1 cloudnets to Neutron OVS agent - https://phabricator.wikimedia.org/T364459#9783300 (taavi)
[13:55:47] (PS1) Elukey: Delete the Cassandra directory in secrets [labs/private] - https://gerrit.wikimedia.org/r/1029567 (https://phabricator.wikimedia.org/T352647)
[14:02:55] Tool-refill: Recurrent API worker failures - https://phabricator.wikimedia.org/T310754#9783364 (Ponor) Can we keep one of the "Waiting for an available worker" reports open so we can ping you without opening up a new task every time it happens? (it seems dead atm)
[14:17:56] FIRING: [3x] CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:51:40] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q3:rack/setup/install cloudcephosd10(3[5-9]|40) - https://phabricator.wikimedia.org/T324998#9783587 (Andrew) Stalled→Invalid I'm closing this as invalid since those hosts have come and gone :)
[14:54:40] Toolforge (Software install/update): TODO make php available on new bastion (was: `php` not on toolforge?) - https://phabricator.wikimedia.org/T364530#9783608 (JJMC89)
[14:55:29] Toolforge (Software install/update): php-cli for dev.toolforge.org - https://phabricator.wikimedia.org/T360511#9783612 (JJMC89)
[14:56:08] Toolforge (Software install/update): TODO make php available on new bastion (was: `php` not on toolforge?) - https://phabricator.wikimedia.org/T364530#9783610 (JJMC89) →Duplicate dup:T360511
[15:19:00] Cloud-VPS (Debian Buster Deprecation), Moderator-Tools-Team, The-Wikipedia-Library: Replace deprecated Buster VMs in Cloud VPS - https://phabricator.wikimedia.org/T364399#9783661 (jsn.sherman)
[15:46:23] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Migrate eqiad1 cloudnets to Neutron OVS agent - https://phabricator.wikimedia.org/T364459#9783744 (taavi)
[16:07:11] (CR) Eevans: [C:+1] Add fake TLS keystore password for Cassandra clusters [labs/private] - https://gerrit.wikimedia.org/r/1029538 (https://phabricator.wikimedia.org/T352647) (owner: Elukey)
[16:09:21] (CR) Eevans: [C:+1] Delete the Cassandra directory in secrets [labs/private] - https://gerrit.wikimedia.org/r/1029567 (https://phabricator.wikimedia.org/T352647) (owner: Elukey)
[16:15:20] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: Migrate eqiad1 cloudnets to Neutron OVS agent - https://phabricator.wikimedia.org/T364459#9783849 (taavi)
[16:23:54] cloud-services-team, Infrastructure-Foundations, netops, ops-codfw: Create (or teach Andrew how to create) private connections+dns entries for new cloudcontrols - https://phabricator.wikimedia.org/T364559 (Andrew) NEW
[16:33:55] cloud-services-team, Infrastructure-Foundations, netops, ops-codfw, SRE: Create (or teach Andrew how to create) private connections+dns entries for new cloudcontrols - https://phabricator.wikimedia.org/T364559#9783893 (cmooney) Hey Andrew, Yeah this is on me, I'd not completed the work to ma...
[16:52:42] FIRING: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:57:41] RESOLVED: [3x] CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:00:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance metricsinfra-puppetserver-1 in project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[17:42:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:50:57] (CR) Nikerabbit: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/intuition] - https://gerrit.wikimedia.org/r/1029514 (owner: L10n-bot)
[17:51:43] (CR) Nikerabbit: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1029524 (owner: L10n-bot)
[17:52:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:32:37] Wikibugs: Stop sending CodeReviewBot comments to IRC - https://phabricator.wikimedia.org/T364575 (taavi) NEW
[18:46:08] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[18:49:58] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[18:55:24] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[18:57:47] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[19:02:57] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[19:05:13] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[19:18:42] cloud-services-team, decommission-hardware: decommission cloudcontrol2001-dev.codfw.wmnet - https://phabricator.wikimedia.org/T364577 (Andrew) NEW
[19:24:04] cloud-services-team, Infrastructure-Foundations, netops, ops-codfw, SRE: Create (or teach Andrew how to create) private connections+dns entries for new cloudcontrols - https://phabricator.wikimedia.org/T364559#9784420 (Andrew) Reimaging cloudcontrol2006-dev works now, thanks! Bonus points: I...
[19:28:38] cloud-services-team, decommission-hardware, Patch-For-Review: decommission cloudcontrol2001-dev.codfw.wmnet - https://phabricator.wikimedia.org/T364577#9784437 (Andrew)
[19:29:01] cloud-services-team, decommission-hardware, Patch-For-Review: decommission cloudcontrol2001-dev.codfw.wmnet - https://phabricator.wikimedia.org/T364577#9784438 (Andrew)
[19:29:45] cloud-services-team, decommission-hardware, Patch-For-Review: decommission cloudcontrol2001-dev.codfw.wmnet - https://phabricator.wikimedia.org/T364577#9784440 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by andrew@cumin1002 for hosts: `cloudcontrol2001-dev.codfw.wmnet` - cloudcontr...
[19:40:01] Wikibugs: Stop sending CodeReviewBot comments to IRC - https://phabricator.wikimedia.org/T364575#9784480 (bd808) I should probably get {T364490} figured out first, but once that is done then it does seem reasonable to add @CodeReviewBot to the [[https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/blob/5...
[19:48:41] Wikibugs: Wikibugs' gitlab connector stops working without a strong sign of why - https://phabricator.wikimedia.org/T364490#9784512 (bd808) Open→In progress p:Triage→High a:bd808
[19:50:54] Wikibugs, Wikimedia-Hackathon-2024: Report GitLab merge request events to IRC - https://phabricator.wikimedia.org/T362500#9784519 (bd808) In progress→Resolved https://www.mediawiki.org/wiki/Wikibugs has been updated to describe the new architectural changes this feature introduced.
[19:52:47] Wikibugs, GitLab (Integrations): Automate setup of comment, pipeline, and job webhooks for all GitLab projects - https://phabricator.wikimedia.org/T362940#9784524 (bd808) >>! In T362940#9732453, @bd808 wrote: > I haven't yet spotted where and when `configure-projects` gets run. Is it on a systemd timer s...
[20:05:16] FIRING: [3x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:08:22] FIRING: HAProxyBackendUnavailable: HAProxy service keystone-public-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[20:10:16] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:13:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service keystone-public-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[21:24:52] (PS1) Andrew Bogott: creation workflow: hard code 'nova' availability zone [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1029714 (https://phabricator.wikimedia.org/T325774)
[21:24:58] (PS1) Andrew Bogott: creation workflow: don't provide __DEFAULT_ as a volume type option [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1029715 (https://phabricator.wikimedia.org/T325774)
[21:25:00] (PS1) Andrew Bogott: creation workflow: Replace semi-incoherent help panel with a doc link [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1029716 (https://phabricator.wikimedia.org/T325774)
[21:25:38] (PS1) Andrew Bogott: creation workflow: hard code 'nova' availability zone [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1029717 (https://phabricator.wikimedia.org/T325774)
[21:25:40] (PS1) Andrew Bogott: creation workflow: don't provide __DEFAULT_ as a volume type option [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1029718 (https://phabricator.wikimedia.org/T325774)
[21:25:42] (PS1) Andrew Bogott: creation workflow: Replace semi-incoherent help panel with a doc link [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1029719 (https://phabricator.wikimedia.org/T325774)
[21:28:10] (CR) Andrew Bogott: [V:+2 C:+2] creation workflow: hard code 'nova' availability zone [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1029714 (https://phabricator.wikimedia.org/T325774) (owner: Andrew Bogott)
[21:28:15] (CR) Andrew Bogott: [V:+2 C:+2] creation workflow: don't provide __DEFAULT_ as a volume type option [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1029715 (https://phabricator.wikimedia.org/T325774) (owner: Andrew Bogott)
[21:28:20] (CR) Andrew Bogott: [V:+2 C:+2] creation workflow: Replace semi-incoherent help panel with a doc link [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1029716 (https://phabricator.wikimedia.org/T325774) (owner: Andrew Bogott)
[21:28:34] (CR) Andrew Bogott: [V:+2 C:+2] creation workflow: hard code 'nova' availability zone [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1029717 (https://phabricator.wikimedia.org/T325774) (owner: Andrew Bogott)
[21:28:38] (CR) Andrew Bogott: [V:+2 C:+2] creation workflow: don't provide __DEFAULT_ as a volume type option [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1029718 (https://phabricator.wikimedia.org/T325774) (owner: Andrew Bogott)
[21:28:43] (CR) Andrew Bogott: [V:+2 C:+2] creation workflow: Replace semi-incoherent help panel with a doc link [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1029719 (https://phabricator.wikimedia.org/T325774) (owner: Andrew Bogott)
[21:51:15] Horizon: Improve UI text and content for "Launch [database] instance" dialogue box in Horizon UI - https://phabricator.wikimedia.org/T325774#9784762 (Andrew) Rivers change course, civilizations rise and fall, and I have finally done some work on this task. [x] Remove: "Specify the details for launch...