[08:24:03] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (Jelto) Last Friday we've done some troubleshooting and tested a lot of different configurations, thanks @SLyngshede-WMF again! In...
[09:43:51] SRE-tools, Infrastructure-Foundations: Add GraphQL support to wmflib - https://phabricator.wikimedia.org/T341968 (ayounsi)
[09:50:55] SRE-tools, Infrastructure-Foundations: Add GraphQL support to wmflib - https://phabricator.wikimedia.org/T341968 (ayounsi)
[10:13:23] (SystemdUnitFailed) firing: cadvisor.service Failed on ganeti2026:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:14:07] godog: ^
[10:15:57] XioNoX: thank you! yeah puppet is currently racy on first installing cadvisor, it looks like
[10:18:28] (SystemdUnitFailed) resolved: cadvisor.service Failed on ganeti2026:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:34:33] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Move cloud vps ns-recursor IPs to host/row-independent addressing - https://phabricator.wikimedia.org/T307357 (aborrero)
[10:55:52] SRE-tools, Spicerack: Spicerack: add distributed locking support - https://phabricator.wikimedia.org/T341973 (Volans) p:Triage→Medium
[12:26:53] SRE-tools, Infrastructure-Foundations, Prod-Kubernetes, SRE-Sprint-Week-Sustainability-March2023, and 2 others: Write a cookbook to set a k8s cluster in maintenance mode - https://phabricator.wikimedia.org/T277677 (JMeybohm)
[12:27:05] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, and 2 others: Agree strategy for Kubernetes BGP peering to top-of-rack switches - https://phabricator.wikimedia.org/T306649 (JMeybohm)
[12:57:43] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (Jelto) As requested, @jbond your two users on the replica and production dumped via API (`curl "https://gitlab-replica.wikimedia.o...
[13:10:28] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (jbond) I did some more testing today and can confirm that the required config is `cas.authn.oidc.id-token.include-id-token-claims=...
[13:31:54] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (jbond) > from our side we will need to check if cas.authn.oidc.id-token.include-id-token-claims=true is ok to enable globally or i...
[13:54:57] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (Jelto) Thanks for troubleshooting this more! I can confirm existing users have `cas3` in the `identities` section. This leads to a...
[13:56:04] Hi folks. I'm seeing this DHCP-relay issue happening again, I believe, but with a different host. analytics1072 isn't getting a DHCP address https://gerrit.wikimedia.org/r/c/operations/homer/public/+/936036
[13:57:15] Oh, scratch that message. Sorry. It did get a DHCP response.
[14:27:53] Puppet, Infrastructure-Foundations: Nuyaml_backend does not allow binary Hiera data - https://phabricator.wikimedia.org/T113328 (jbond) Open→Resolved a:jbond no update
[14:28:30] Puppet, Infrastructure-Foundations, Wikimedia-IRC-RC-Server, User-jbond: Ensure puppet sends the correct ircd signals to update config and motd - https://phabricator.wikimedia.org/T284052 (jbond) Open→Resolved a:jbond fixed with last patch
[14:33:37] netbox, Infrastructure-Foundations, Puppet-Infrastructure, SRE: Netbox missing physical device in PuppetDB when Puppet disabled for too long - https://phabricator.wikimedia.org/T254986 (joanna_borun)
[14:33:53] Puppet, Cloud-VPS, Infrastructure-Foundations, Patch-For-Review: role::puppetmaster::puppetdb uses nginx as reverse proxy and cannot be used together with Apache applications - https://phabricator.wikimedia.org/T154105 (jbond) Open→Declined going to close this as declined. [[ https://ger...
[14:38:17] Puppet, netops, Infrastructure-Foundations, good first task: Routinator: use tmpfs - https://phabricator.wikimedia.org/T300955 (jbond)
[14:40:05] netbox, Infrastructure-Foundations, Puppet-Infrastructure, Puppet (Puppet 7.0): puppet lookup causes spurious puppetdb entries - https://phabricator.wikimedia.org/T303170 (jbond)
[14:40:19] netbox, Infrastructure-Foundations, Puppet-Infrastructure, Puppet (Puppet 7.0): puppet lookup causes spurious puppetdb entries - https://phabricator.wikimedia.org/T303170 (jbond) We should validate if this is fixed by puppet7
[14:42:43] netops, Infrastructure-Foundations, SRE, cloud-services-team: Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero) I think we are ready for this cloudweb2002-dev move today, assuming no IP change, just a poweroff-poweron oper...
[14:51:35] Puppet, Infrastructure-Foundations: Consider alternative configuration management tooling - https://phabricator.wikimedia.org/T321874 (joanna_borun) Open→Declined There are no specific actions we can take regarding this ticket. If additional discussion is needed, we can schedule a dedicated meeting.
[15:04:40] Puppet, Infrastructure-Foundations, SRE, User-jbond: in puppet 6 some core types have been moved to external modules. check and confirm our exposure - https://phabricator.wikimedia.org/T265143 (jbond) Open→Resolved a:jbond This has been handled as part of the puppet7 migration
[15:04:49] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: Work required to prepare for puppet 7 - https://phabricator.wikimedia.org/T265138 (jbond)
[15:10:02] Puppet, Infrastructure-Foundations, SRE, Performance Issue: Investigate mysterious_sysctl settings and figure out what to do with them - https://phabricator.wikimedia.org/T118812 (jbond) Open→Resolved a:jbond
[15:10:17] Puppet, Infrastructure-Foundations, Technical-Debt: "Setting templatedir is deprecated" warning issued on self-hosted puppetmaster - https://phabricator.wikimedia.org/T95158 (jbond) Open→Resolved a:jbond templatedir setting is now removed
[15:25:14] Puppet, Infrastructure-Foundations: Bashisms in various /bin/sh scripts - https://phabricator.wikimedia.org/T95064 (jbond)
[15:26:15] Puppet, Infrastructure-Foundations, Puppet-Core, SRE, and 4 others: Deprecate `base::service_unit` in puppet - https://phabricator.wikimedia.org/T194724 (jbond)
[15:32:14] Puppet, Infrastructure-Foundations, Puppet-Core, User-jbond: puppetlabs: create puppet 7 environment in WMCS to test code - https://phabricator.wikimedia.org/T294841 (jbond)
[15:37:48] netops, Infrastructure-Foundations, SRE, cloud-services-team: Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (Papaul) server move complete
[15:40:20] Puppet, Infrastructure-Foundations, Puppet CI, SRE, Release-Engineering-Team (Radar): Integrate the puppet compiler in the puppet CI pipeline - https://phabricator.wikimedia.org/T166066 (jbond)
[15:40:52] Puppet, SRE, User-Joe: Prepare for Puppet 4 - https://phabricator.wikimedia.org/T169548 (jbond)
[15:41:42] Puppet, Infrastructure-Foundations, Puppet CI, SRE, Release-Engineering-Team (Radar): Integrate the puppet compiler in the puppet CI pipeline - https://phabricator.wikimedia.org/T166066 (jbond) Open→Resolved a:jbond im going to close this, I think with the `auto` keyword this is c...
[15:53:25] Puppet, Infrastructure-Foundations, SRE: puppet lint check for resource names - https://phabricator.wikimedia.org/T93231 (jbond) @fgiunchedi I'm tempted to close this as invalid as i don't see any issue with having spaces in resource titles and in some cases (e.g. notify, exec) it can be desirable....
[15:54:55] CFSSL-PKI, Infrastructure-Foundations, Puppet-Infrastructure, SRE, Puppet (Puppet 7.0): Create dynamic CRL - https://phabricator.wikimedia.org/T340543 (jbond)
[15:54:58] CFSSL-PKI, Infrastructure-Foundations, Puppet-Core, SRE, Puppet (Puppet 7.0): puppet7: drop instances of :undef in erb files - https://phabricator.wikimedia.org/T341071 (jbond)
[17:59:43] netbox, Infrastructure-Foundations: Netbox report test_mgmt_dns_hostname - rq.timeouts.JobTimeoutException - https://phabricator.wikimedia.org/T341843 (Volans) I wonder if has anything to do with T321704 and if the patch there (not merged) could help. Also T339133 is on a similar topic although I think u...
[19:06:00] Puppet, Beta-Cluster-Infrastructure: Puppet failure on Beta Cluster role::beta::docker_services boxes - https://phabricator.wikimedia.org/T342038 (Jdforrester-WMF)
[19:06:33] Puppet, Beta-Cluster-Infrastructure: Puppet failure on Beta Cluster role::beta::docker_services boxes - https://phabricator.wikimedia.org/T342038 (Jdforrester-WMF)