[01:41:30] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services
[01:43:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment codfw1dev for all services
[02:25:51] (open) andrew: magnum.tf: use fcos38 in codfw1dev, same as in eqiad1 [repos/cloud/cloud-vps/tf-infra-test] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tf-infra-test/-/merge_requests/2
[02:26:00] (merge) andrew: magnum.tf: use fcos38 in codfw1dev, same as in eqiad1 [repos/cloud/cloud-vps/tf-infra-test] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tf-infra-test/-/merge_requests/2
[07:15:13] Tool-extjsonuploader: Validate Composer names against Packagist - https://phabricator.wikimedia.org/T370729#10756677 (Samwilson) Open→Resolved This all looks fine now.
[09:14:09] (update) aborrero: eqiad1: codfw1dev: default_sg_rules: allow SSH from VXLAN/IPv4-only CIDR [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/203
[09:14:11] (open) aborrero: eqiad1: codfw1dev: default_sg_rules: allow SSH from VXLAN/IPv4-only CIDR [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/203
[09:28:49] cloud-services-team, Toolforge (Toolforge iteration 19), Patch-For-Review: [toolforge] [redis] Prometheus exporter logging errors - https://phabricator.wikimedia.org/T366471#10756692 (taavi) a:taavi
[09:29:01] cloud-services-team, Toolforge (Toolforge iteration 19), Patch-For-Review: [toolforge] [redis] Prometheus exporter logging errors - https://phabricator.wikimedia.org/T366471#10756693 (taavi)
[09:33:26] (update) aborrero: eqiad1: codfw1dev: default_sg_rules: allow SSH from VXLAN/IPv4-only CIDR [repos/cloud/cloud-vps/tofu-infra] -
https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/203
[09:34:04] cloud-services-team, Toolforge: toolforge: Investigate ingress-nginx replacements - https://phabricator.wikimedia.org/T392356 (taavi) NEW
[09:49:48] cloud-services-team, Toolforge: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T385400#10756707 (taavi) a:taavi
[09:52:26] (update) aborrero: eqiad1: add support for operations in the deployment [repos/cloud/cloud-vps/networktests-tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/15 (https://phabricator.wikimedia.org/T391325)
[09:54:41] cloud-services-team, Toolforge: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T385400#10756719 (taavi) Open→Resolved
[10:05:03] cloud-services-team, Toolforge: Cannot delete directory from incolabot project on Toolforge - https://phabricator.wikimedia.org/T357342#10756734 (taavi) Open→Resolved a:taavi
[10:09:15] cloud-services-team, Toolforge: [infra] Fix the mis-named k8s service in tools and toolsbeta projects - https://phabricator.wikimedia.org/T262562#10756752 (taavi) a:taavi >>! In T262562#9175833, @taavi wrote: > I'm still planning to do this when we next refresh the toolforge control plane nodes. Tur...
[10:12:55] cloud-services-team, Toolforge: [infra] Fix the mis-named k8s service in tools and toolsbeta projects - https://phabricator.wikimedia.org/T262562#10756756 (taavi) Apparently maintain-kubeusers loads the name from the `cluster-info` config map in the `kube-public` namespace. I updated that in toolsbeta,...
[10:29:07] (merge) aborrero: eqiad1: add support for operations in the deployment [repos/cloud/cloud-vps/networktests-tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/networktests-tofu-provisioning/-/merge_requests/15 (https://phabricator.wikimedia.org/T391325)
[10:43:31] (update) aborrero: eqiad1: codfw1dev: default_sg_rules: allow SSH from VXLAN/IPv4-only CIDR [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/203
[10:44:08] cloud-services-team, Cloud-VPS: Get rid of cloud-cumin VMs in cloudinfra project - https://phabricator.wikimedia.org/T367725#10756775 (taavi) a:taavi
[10:50:09] (PS1) Majavah: Remove root keys for former staff [labs/private] - https://gerrit.wikimedia.org/r/1137729
[10:50:33] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [labs/private] - https://gerrit.wikimedia.org/r/1137729 (owner: Majavah)
[10:51:53] (CR) Majavah: [V:+2 C:+2] Remove root keys for former staff [labs/private] - https://gerrit.wikimedia.org/r/1137729 (owner: Majavah)
[10:54:07] cloud-services-team, Cloud-VPS, Bitu: Find or create .deb package for mwclient 0.11.0 (or mwclient 0.10.0 with writeapi dependency removed) - https://phabricator.wikimedia.org/T372345#10756779 (taavi) Open→Resolved Let's say that this is done. If there's need to get this updated more quic...
[10:54:28] FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 2 deleted instances on cloudinfra-internal-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[10:58:25] (open) aborrero: eqiad1: enable VXLAN/dualstack network [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/204 (https://phabricator.wikimedia.org/T380174)
[11:00:47] (update) aborrero: eqiad1: enable VXLAN/dualstack network [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/204 (https://phabricator.wikimedia.org/T380174)
[11:02:40] (approved) taavi: eqiad1: codfw1dev: default_sg_rules: allow SSH from VXLAN/IPv4-only CIDR [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/203 (owner: aborrero)
[11:03:17] (merge) aborrero: eqiad1: codfw1dev: default_sg_rules: allow SSH from VXLAN/IPv4-only CIDR [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/203
[11:03:28] !log aborrero@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[11:05:52] !log aborrero@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[11:10:11] (update) aborrero: eqiad1: enable VXLAN/dualstack network [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/204 (https://phabricator.wikimedia.org/T380174)
[11:13:04] cloud-services-team, Cloud-VPS, IPv6: openstack: network problems when introducing new networks - https://phabricator.wikimedia.org/T380728#10756801 (aborrero) Open→Resolved a:aborrero we think all
problems have been addresses. Among other things: * we have made a number of changes to...
[11:35:41] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:35:55] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: CloudVPS: IPv6 in eqiad1 - https://phabricator.wikimedia.org/T380174#10756811 (aborrero) merging https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/204 has been scheduled for 2025-04-23 @ 09:30 UTC.
[11:45:03] (open) aborrero: eqiad1: cloudinfra: introduce PTR zones for 2a02:ec80:a000:: [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/205 (https://phabricator.wikimedia.org/T380746)
[11:59:54] cloud-services-team, Toolforge: Create separate k8s cluster for admin-owned applications - https://phabricator.wikimedia.org/T219076#10756830 (taavi) Open→Declined Boldly declining.
[12:02:02] cloud-services-team, Tools, Mobile: tools.wmflabs.org landing page ("admin" tool) has poor layout on devices with width <550px - https://phabricator.wikimedia.org/T218606#10756833 (taavi) Open→Invalid admin tools is now a redirect to striker which works properly on mobile, closing.
[12:07:19] Tools, All-and-every-Wikisource, Wikidata: Build a Scholia like website for Wikisource - https://phabricator.wikimedia.org/T344328#10756836 (TiagoLubiana) Fun task! It is related to something I was discussing with @waldyrious at the Lusophone WikiTech channel. There are some different ways works/edit...
[12:21:46] (update) aborrero: eqiad1: cloudinfra: introduce PTR zones for 2a02:ec80:a000:: [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/205 (https://phabricator.wikimedia.org/T380746)
[12:25:05] (open) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[12:25:08] (update) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[12:25:17] (update) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[12:25:55] (PS1) Majavah: vps: Drop support for .wmflabs hostnames [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137738
[12:26:49] (PS1) Majavah: wmcs_libs: k8s: Update example API server URL [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137739 (https://phabricator.wikimedia.org/T262562)
[12:27:00] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/35
[12:28:33] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net.
[labs/tools/massmailer] - https://gerrit.wikimedia.org/r/1137741 (owner: L10n-bot)
[12:28:43] cloud-services-team, Toolforge, Patch-For-Review: [infra] Fix the mis-named k8s service in tools and toolsbeta projects - https://phabricator.wikimedia.org/T262562#10756851 (taavi)
[12:32:33] (update) taavi: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30
[12:32:33] (open) taavi: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30
[12:32:33] (update) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] (main-I7c2d294db1fe3046105c5f1e0865e59601f9a232) - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[12:32:41] (update) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] (main-I7c2d294db1fe3046105c5f1e0865e59601f9a232) - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[12:32:44] (update) taavi: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30
[12:39:27] (update) taavi: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30
[12:39:28] (update) taavi: tests: Update example Kubernetes API URL [repos/cloud/toolforge/alerts] (main-I7c2d294db1fe3046105c5f1e0865e59601f9a232) - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/29 (https://phabricator.wikimedia.org/T262562)
[12:41:06] (open) taavi: gitlab: Ignore Gerritlab artifacts in target branch [toolforge-repos/wikibugs2] -
https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/54
[12:41:09] (update) taavi: gitlab: Ignore Gerritlab artifacts in target branch [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/54
[12:41:27] (update) taavi: gitlab: Ignore Gerritlab artifacts in target branch [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/54
[12:42:35] (update) taavi: Set custom User-Agent to avoid 429 during tests [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/30
[12:45:41] RESOLVED: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:46:39] (update) taavi: tests: Update for Phorge project rename [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/55
[12:46:40] (open) taavi: tests: Update for Phorge project rename [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/55
[12:46:40] (update) taavi: gitlab: Ignore Gerritlab artifacts in target branch [toolforge-repos/wikibugs2] (main-I4473512c1e65dd70208244c51c0cfffba390c37a) - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/54
[12:46:48] (update) taavi: gitlab: Ignore Gerritlab artifacts in target branch [toolforge-repos/wikibugs2] (main-I4473512c1e65dd70208244c51c0cfffba390c37a) - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/54
[12:46:49] (update) taavi: tests: Update for Phorge project rename [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/55
[12:46:57] (update)
taavi: tests: Update for Phorge project rename [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/55
[12:51:02] (update) taavi: gitlab: Ignore Gerritlab artifacts in target branch [toolforge-repos/wikibugs2] (main-I4473512c1e65dd70208244c51c0cfffba390c37a) - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/54
[12:51:02] (update) taavi: tests: Update for Phorge project rename [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/55
[12:56:09] (update) aborrero: Start [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (owner: chuckonwumelu)
[12:56:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:58:44] (update) aborrero: Start [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (https://phabricator.wikimedia.org/T390057) (owner: chuckonwumelu)
[12:59:28] RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 2 deleted instances on cloudinfra-internal-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[13:00:58] (update) aborrero: Start [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (https://phabricator.wikimedia.org/T390057) (owner: chuckonwumelu)
[13:01:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:05:06] (update) aborrero: Start [repos/cloud/toolforge/tofu-provisioning] -
https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (https://phabricator.wikimedia.org/T390057) (owner: chuckonwumelu)
[13:07:29] (update) aborrero: tofu-provisioning: bootstrap opentofu code and gitlab CI/CD pipeline [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (https://phabricator.wikimedia.org/T390057) (owner: chuckonwumelu)
[13:08:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:08:33] (approved) chuckonwumelu: tofu-provisioning: bootstrap opentofu code and gitlab CI/CD pipeline [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (https://phabricator.wikimedia.org/T390057)
[13:08:46] (merge) aborrero: tofu-provisioning: bootstrap opentofu code and gitlab CI/CD pipeline [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/1 (https://phabricator.wikimedia.org/T390057) (owner: chuckonwumelu)
[13:23:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:27:59] (open) aborrero: tofu-proviosinig: introduce DNS module for toolsbeta [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/2 (https://phabricator.wikimedia.org/T390057)
[13:28:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:33:07] cloud-services-team, Toolforge: [k8s, cookbooks] Transient error during Toolsbeta k8s 1.25 -> 1.26 upgrade - https://phabricator.wikimedia.org/T373533#10756925 (taavi) Open→Invalid Closing since we haven't seen this since.
[13:36:01] cloud-services-team, Toolforge: Webservice should produce better diagnostic message when attempting to nest sessions - https://phabricator.wikimedia.org/T332508#10756941 (taavi) Open→Invalid I believe this is no longer an issue as the webservice containers no longer include the `webservice` b...
[13:36:30] cloud-services-team (FY2024/2025-Q3-Q4), DC-Ops, ops-eqiad, SRE: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10756944 (Jclark-ctr) No alerts for 4 days and temps and fan speeds have dropped closing this ticket for Temp The system inlet tempera...
[13:37:36] cloud-services-team, Toolforge: Create /shared symlink within Kubernetes images - https://phabricator.wikimedia.org/T327034#10756947 (taavi) Open→Declined `/shared` is no longer deployed on new bastions either. Closing per above.
[13:47:51] cloud-services-team, Cloud-VPS, wikitech.wikimedia.org: Reimage cloudweb hosts to bookworm - https://phabricator.wikimedia.org/T376277#10756952 (taavi)
[13:48:06] cloud-services-team, Horizon, Striker, wikitech.wikimedia.org: Reimage cloudweb hosts to bookworm - https://phabricator.wikimedia.org/T376277#10756954 (taavi)
[14:12:08] cloud-services-team, Cloud-VPS: Cloud VPS mail servers should drop mail sent from non-supported domains - https://phabricator.wikimedia.org/T366935#10757005 (taavi)
[14:16:43] cloud-services-team, DC-Ops, ops-codfw, SRE: Service implementation for cloudcephosd2004-dev - https://phabricator.wikimedia.org/T392366 (Andrew) NEW
[14:17:29] cloud-services-team, Cloud-VPS: Reject outbound traffic to port 25 (SMTP) from instances without public IPs - https://phabricator.wikimedia.org/T366936#10757022 (taavi) a:taavi
[14:18:37] cloud-services-team, Cloud-VPS, Patch-For-Review: Cloud VPS mail servers should drop mail sent from non-supported domains - https://phabricator.wikimedia.org/T366935#10757028 (taavi)
a:taavi
[14:20:32] (open) taavi: build: Remove hardcoded repository name [repos/cloud/cloud-vps/terraform-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps/-/merge_requests/4
[14:23:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:25:49] (open) taavi: build: Add GitLab CI to run tests [repos/cloud/cloud-vps/go-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/go-cloudvps/-/merge_requests/3
[14:25:55] (update) taavi: build: Add GitLab CI to run tests [repos/cloud/cloud-vps/go-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/go-cloudvps/-/merge_requests/3
[14:28:18] (update) taavi: build: Remove hardcoded repository name [repos/cloud/cloud-vps/terraform-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps/-/merge_requests/4
[14:28:33] cloud-services-team, DC-Ops, ops-codfw, SRE, Patch-For-Review: Service implementation for cloudcephosd2004-dev - https://phabricator.wikimedia.org/T392366#10757040 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudcephosd2004-dev.codfw.wmn...
[14:48:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:53:21] (merge) taavi: build: Add GitLab CI to run tests [repos/cloud/cloud-vps/go-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/go-cloudvps/-/merge_requests/3
[14:53:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[14:53:40] (update) taavi: proxies: handle 400 return code from proxy API [repos/cloud/cloud-vps/go-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/go-cloudvps/-/merge_requests/2 (owner: andrew)
[14:55:19] (update) aborrero: tofu-proviosinig: introduce DNS module for toolsbeta [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/2 (https://phabricator.wikimedia.org/T390057)
[14:57:39] cloud-services-team, DC-Ops, ops-codfw, SRE: Service implementation for cloudcephosd2004-dev - https://phabricator.wikimedia.org/T392366#10757083 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudcephosd2004-dev.codfw.wmnet with OS bullseye execut...
[15:02:06] (update) aborrero: tofu-proviosinig: introduce DNS module for toolsbeta [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/2 (https://phabricator.wikimedia.org/T390057)
[15:03:25] cloud-services-team, DC-Ops, ops-codfw, SRE, Patch-For-Review: Service implementation for cloudcephosd2004-dev - https://phabricator.wikimedia.org/T392366#10757108 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudcephosd2004-dev.codfw.wmn...
[15:03:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:03:48] (update) aborrero: tofu-proviosinig: introduce DNS module for toolsbeta [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/2 (https://phabricator.wikimedia.org/T390057)
[15:05:28] (merge) aborrero: tofu-proviosinig: introduce DNS module for toolsbeta [repos/cloud/toolforge/tofu-provisioning] - https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/2 (https://phabricator.wikimedia.org/T390057)
[15:08:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:09:04] (update) taavi: build: Remove hardcoded repository name [repos/cloud/cloud-vps/terraform-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps/-/merge_requests/4
[15:09:06] (update) taavi: build: Fix builds with forks [repos/cloud/cloud-vps/terraform-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps/-/merge_requests/4
[15:09:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:11:39] (update) taavi: build: Fix builds with forks [repos/cloud/cloud-vps/terraform-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps/-/merge_requests/4
[15:12:43] (update) taavi: build: Fix builds with forks [repos/cloud/cloud-vps/terraform-cloudvps] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps/-/merge_requests/4
[15:14:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:14:58] FIRING: InstanceDown: Project cvn instance cvn-app10 is down -
https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:19:58] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:25:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:39:58] (Abandoned) Arturo Borrero Gonzalez: wmcs.openstack.migrate_server_to_vxlan_and_ipv6: introduce cookbook [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1080755 (https://phabricator.wikimedia.org/T377346) (owner: Arturo Borrero Gonzalez)
[15:42:13] cloud-services-team, DC-Ops, ops-codfw, SRE, Patch-For-Review: Service implementation for cloudcephosd2004-dev - https://phabricator.wikimedia.org/T392366#10757237 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudcephosd2004-dev.codfw.wmnet w...
[15:53:01] cloud-services-team, Cloud-VPS, IPv6, Patch-For-Review: CloudVPS: IPv6 in eqiad1 - https://phabricator.wikimedia.org/T380174#10757259 (aborrero) we need to allocate this in netbox: * 2a02:ec80:a000:fe04::1003:1 (cloudgw1003 virt leg) * 2a02:ec80:a000:fe03::1003:1 (cloudgw1003 wan leg) * 2a02:ec80...
[16:10:43] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[16:16:50] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99)
[16:30:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:32:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[16:43:02] (PS1) Andrew Bogott: wmcs.ceph.osd.bootstrap_and_add: support overriding number of expected drives [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137802 (https://phabricator.wikimedia.org/T392366)
[16:43:51] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[16:46:25] (CR) CI reject: [V:-1] wmcs.ceph.osd.bootstrap_and_add: support overriding number of expected drives [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1137802 (https://phabricator.wikimedia.org/T392366) (owner: Andrew Bogott)
[16:57:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[17:01:28] FIRING: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[17:20:08] (open) tburmeister: Revert "Add custom styling and file upload form UI" [toolforge-repos/tech-doc-metrics] (design) - https://gitlab.wikimedia.org/toolforge-repos/tech-doc-metrics/-/merge_requests/2
[17:20:18] (merge) tburmeister: Revert "Add custom styling and file upload form UI" [toolforge-repos/tech-doc-metrics] (design) - https://gitlab.wikimedia.org/toolforge-repos/tech-doc-metrics/-/merge_requests/2
[17:26:28] RESOLVED: InstanceDown: Project cvn instance cvn-app10 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[17:58:39] FIRING: QuarryDown: Quarry application is unreachable -
https://prometheus-alerts.wmcloud.org/?q=alertname%3DQuarryDown
[18:45:43] Horizon, Openstack-Magnum, Upstream: magnum dashboard shows clusters across all projects - https://phabricator.wikimedia.org/T392384 (taavi) NEW
[18:58:39] RESOLVED: QuarryDown: Quarry application is unreachable - https://prometheus-alerts.wmcloud.org/?q=alertname%3DQuarryDown
[19:04:23] Quarry: Quarry test suite is not being run anymore - https://phabricator.wikimedia.org/T392385 (taavi) NEW
[19:19:55] Quarry: Quarry test suite is not being run anymore - https://phabricator.wikimedia.org/T392385#10757629 (taavi) a:taavi
[20:07:54] supertassu opened https://github.com/toolforge/quarry/pull/78
[21:24:04] Tool-erinnermich: [ErinnerMichBot] Possible support for other languages and projects? - https://phabricator.wikimedia.org/T384842#10757882 (Mr_Tortue) No problem, it's not urgent anyway. However, is it possible to have access to the repository in order to help at least a bit ?
[21:53:48] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99)
[22:06:45] cloud-services-team, Toolforge: toolforge: Investigate ingress-nginx replacements - https://phabricator.wikimedia.org/T392356#10757996 (bd808) https://github.com/kubernetes/ingress-nginx/issues/13002 seems to state that ingress-nginx will not enter maintenance mode until there is a stable release of InGa...
[22:09:27] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[22:09:28] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=97)
[22:09:39] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add
[22:09:44] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=0)
[22:10:56] cloud-services-team, Toolforge: toolforge: Investigate ingress-nginx replacements - https://phabricator.wikimedia.org/T392356#10757997 (bd808) There are a few tools that I know of which use ingress-nginx specific features like the `nginx.ingress.kubernetes.io/permanent-redirect` annotation described in h...
[22:11:02] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[22:41:05] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=99)
[23:04:12] VPS-project-Codesearch, collaboration-services: Graduate codesearch to production - https://phabricator.wikimedia.org/T268199#10758145 (Dzahn) In progress→Open
[23:17:59] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[23:18:05] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=97)
[23:18:10] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[23:34:34] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=97)
[23:34:44] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.drain_node