[08:20:20] netops, Ganeti, Infrastructure-Foundations, SRE, Patch-For-Review: Investigate Ganeti in routed mode - https://phabricator.wikimedia.org/T300152 (ayounsi) The above patches should get us as far as DHCP. DHCP is going to be the next big challenge to solve, partly because of the setback of Opti...
[12:48:52] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: Revert dbstore migration from puppet7 to puppet5 - https://phabricator.wikimedia.org/T354411 (MoritzMuehlenhoff) @Marostegui @ABran-WMF With https://gerrit.wikimedia.org/r/c/operations/puppet/+/991082/ deployed, these are good...
[13:01:56] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 2 others: puppet7 on cumin breaks database connections - https://phabricator.wikimedia.org/T352974 (Marostegui)
[13:02:35] SRE-tools, DBA, Infrastructure-Foundations, Puppet-Core, and 3 others: Revert dbstore migration from puppet7 to puppet5 - https://phabricator.wikimedia.org/T354411 (Marostegui) Stalled→Declined Good to decline! We can always reopen if needed. Thank you Ben for the help you've provided tro...
[13:14:16] (SystemdUnitFailed) firing: prometheus-ganeti-exporter.service Failed on ganeti1035:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:29:31] netops, Infrastructure-Foundations, SRE: Verify and Configure ECMP operation for EVPN switches - https://phabricator.wikimedia.org/T334658 (cmooney) Open→Resolved Closing this. It's a global setting and as per the description we need to keep ports in play to get a load-balance for VXLAN traf...
[13:30:25] (SystemdUnitFailed) resolved: prometheus-ganeti-exporter.service Failed on ganeti1035:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:38:20] netops, Infrastructure-Foundations, SRE: Create single Homer BGP group template to cover all variants - https://phabricator.wikimedia.org/T349116 (cmooney)
[13:42:10] netops, Infrastructure-Foundations, SRE: Firewall filter blocking traceroute in underlay QFX5120 EVPN - https://phabricator.wikimedia.org/T348120 (cmooney) >>! In T348120#9224531, @ayounsi wrote: > Nice rabbit hole! I found this: https://www.reddit.com/r/Juniper/comments/g12qxh/the_right_way_to_allow...
[13:46:47] netops, Infrastructure-Foundations, SRE: Re-IP hosts on codfw row A and B to new per-rack vlans/subnets - https://phabricator.wikimedia.org/T354869 (cmooney)
[14:27:12] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[14:29:16] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on debmonitor2003:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[14:29:16] (SystemdUnitFailed) firing: (3) debmonitor-maintenance-gc.service Failed on debmonitor2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:47:02] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[15:49:35] jhathaway, slyngs: FYI for the git hooks in spicerack I preferred to suggest the simplest tox usage instead of other tools like pre-commit or similar. See https://doc.wikimedia.org/spicerack/master/development.html#code-style
[15:50:46] It's also a tool we're already comfortable with and that people knows how to use. I think that's a good option
[15:53:07] thanks volans I'll take a look at that as well
[15:53:19] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Clement_Goubert) `lang=bash cgoubert@kubestage2002:~$ sudo calicoctl node status Calico process is running. IPv4 BGP status +---...
[15:53:23] yes but for the puppet repo is not running everything
[15:53:38] I do use utils/hooks/pre-push
[15:53:45] (as I've opt-in to use it)
[15:54:04] pre-push -> ../../utils/hooks/pre-push
[15:54:55] we do have a pre-commit hook there too
[15:55:19] XioNoX: good to see more stuff behind bird, but also a bit worrying to see more stuff behind bird :)
[15:55:55] sukhe: yeah :) that's why we need to keep the code clean/lean
[15:58:36] yep. like I mentioned in the CR, happy to take care of moving the bits to bird in a separate commit and then we can rebase the ganeti patch on that
[16:17:49] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Clement_Goubert) No-op on these nodes, proceeding with the rest.
[16:20:00] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Clement_Goubert) >>! In T352883#9469622, @Clement_Goubert wrote: > `lang=bash > IPv6 BGP status > +-------------------+----------...
[16:40:08] netops, Infrastructure-Foundations, SRE, Traffic, and 2 others: Move lvs2011 primary uplink and connect to new row A/B vlans - https://phabricator.wikimedia.org/T352912 (cmooney)
[16:41:34] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Clement_Goubert) No-op on the rest of the infra.
[16:43:04] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, and 2 others: Update puppet's topology.kubernetes.io/zone logic to take into account the new setup - https://phabricator.wikimedia.org/T352893 (Clement_Goubert) Summary of deployment from {T352883}: - No-op on all nodes except kubestage200...
[17:53:43] netops, Infrastructure-Foundations, SRE, Traffic, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney)
[17:53:52] netops, Infrastructure-Foundations, SRE, Traffic, ops-codfw: Move lvs2011 primary uplink and connect to new row A/B vlans - https://phabricator.wikimedia.org/T352912 (cmooney) Open→Resolved Alll done!
[17:54:41] netops, Infrastructure-Foundations, SRE: Re-IP hosts on codfw row A and B to new per-rack vlans/subnets - https://phabricator.wikimedia.org/T354869 (cmooney)
[17:54:49] netops, Infrastructure-Foundations, SRE, Traffic, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney)
[17:54:57] netops, Infrastructure-Foundations, SRE, Traffic, and 2 others: Move lvs2012 from private1-b-codfw (row) to private1-b2-codfw (rack) vlan - https://phabricator.wikimedia.org/T352918 (cmooney)
[17:55:09] netops, Infrastructure-Foundations, SRE: Re-IP hosts on codfw row A and B to new per-rack vlans/subnets - https://phabricator.wikimedia.org/T354869 (cmooney)
[17:55:17] netops, Infrastructure-Foundations, SRE, Traffic, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney)
[17:55:25] netops, Infrastructure-Foundations, SRE, Traffic, and 2 others: Move lvs2011 from private1-a-codfw (row) to private1-a2-codfw (rack) vlan - https://phabricator.wikimedia.org/T352920 (cmooney)
[17:55:40] netops, Infrastructure-Foundations, SRE: Codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[17:55:50] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B migration - non-standard device moves - https://phabricator.wikimedia.org/T348128 (cmooney) Open→Resolved
[17:56:02] netops, Infrastructure-Foundations, SRE, Traffic, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney) Open→Resolved
[17:56:10] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B migration - non-standard device moves - https://phabricator.wikimedia.org/T348128 (cmooney)