[08:25:26] netops, Ganeti, Infrastructure-Foundations, SRE: Investigate Ganeti in routed mode - https://phabricator.wikimedia.org/T300152 (MoritzMuehlenhoff) >>! In T300152#9437438, @ayounsi wrote: > On naming I didn't use `private1-ganeti-codfw` as I didn't want to tie the IPs to a specific tool. On the ot...
[09:05:30] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade asw1-eqsin - https://phabricator.wikimedia.org/T332395 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=6bec1528-7372-478d-856a-a08325eb04f0) set by ayounsi@cumin1002 for 2:00:00 on 35 host(s) and their services w...
[10:47:01] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade asw1-eqsin - https://phabricator.wikimedia.org/T332395 (ayounsi) Open→Resolved All done. ~10min downtime.
[10:52:09] Traffic, netops, Infrastructure-Foundations: Network issues for users in the UK and Ireland - https://phabricator.wikimedia.org/T354065 (cmooney) Open→Resolved a:cmooney Great @Sideswipe9th thanks for the feedback. Definitely was a strange one, glad you could shed a bit more light on it...
[13:00:32] hi, I want to remove the wikireplica LVS balancers, and would appreciate a review on https://gerrit.wikimedia.org/r/c/operations/puppet/+/978539 for that
[14:53:06] netops, Ganeti, Infrastructure-Foundations, SRE: Investigate Ganeti in routed mode - https://phabricator.wikimedia.org/T300152 (cmooney) >>! In T300152#9440438, @MoritzMuehlenhoff wrote: >>>! In T300152#9437438, @ayounsi wrote: >> On naming I didn't use `private1-ganeti-codfw` as I didn't want to...
[15:22:07] netops, Ganeti, Infrastructure-Foundations, SRE: Investigate Ganeti in routed mode - https://phabricator.wikimedia.org/T300152 (MoritzMuehlenhoff) >>! In T300152#9441778, @cmooney wrote: >>>! In T300152#9440438, @MoritzMuehlenhoff wrote: >>>>! In T300152#9437438, @ayounsi wrote: >>> On naming I d...
[15:37:52] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Clement_Goubert)
[15:38:10] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Clement_Goubert) Open→In progress a:Clement_Goubert→Papaul Host is now drained and cordoned. It is in codfw rack...
[15:50:23] Traffic, Patch-For-Review: tcp-mss-clamper doesn't work on bullseye / kernel 5.10 - https://phabricator.wikimedia.org/T353657 (CodeReviewBot) vgutierrez opened https://gitlab.wikimedia.org/repos/sre/tcp-mss-clamper/-/merge_requests/12 clamper: support bullseye kernels
[15:51:56] Traffic: Cookbook to depool a site in AuthDNS - https://phabricator.wikimedia.org/T334048 (joanna_borun)
[16:01:07] sukhe: thanks for the +1! would now be a good time to deploy it?
[16:01:19] taavi: works for me! I am around
[16:01:31] yep
[16:01:43] Traffic, Infrastructure-Foundations, SRE, Patch-For-Review: GeoIP mapping experiments - https://phabricator.wikimedia.org/T332024 (joanna_borun)
[16:02:14] cool, I'm following https://wikitech.wikimedia.org/wiki/LVS#Remove_a_load_balanced_service and will ask if anything's unclear
[16:02:37] thanks, primary/secondary are in modules/profile/manifests/lvs/configuration.pp
[16:03:56] this is high-traffic2 so lvs1018 and lvs1020
[16:04:09] looks good
[16:04:32] yep
[16:04:36] running puppet
[16:07:04] puppet run done
[16:07:35] when it says restart pybal is that just a literal 'systemctl restart pybal' or is there a cookbook or similar I should be using?
[16:07:51] on the host itself, a literal restart but there is a cookbook as well
[16:09:01] https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/loadbalancer/restart-pybal.py this is the cookbook
[16:09:46] restarted pybal on 1020
[16:11:40] netops, Infrastructure-Foundations, Observability-Metrics, SRE, observability: Investigate Junos Prometheus exporter - https://phabricator.wikimedia.org/T333210 (cmooney) p:Triage→Low
[16:14:30] that's 300 secs, restarting on 1018
[16:16:17] pybal restarts done, and as expected "PyBal IPVS diff check" is alerting for "Services in IPVS but unknown to PyBal" and "Hosts in IPVS but unknown to PyBal"
[16:16:28] yeah
[16:17:49] starting to remove the ipvs services on 1020
[16:20:05] (make sure to log the addr and port action)
[16:20:52] Traffic, Infrastructure-Foundations, SRE: NetworkProbeLimit cookie should set samesite attribute - https://phabricator.wikimedia.org/T342624 (joanna_borun) a:ayounsi
[16:22:53] thanks, done
[16:23:09] 1020 is clear on icinga, moving to 1018
[16:23:49] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Papaul) @Clement_Goubert thanks will work on it in a minute
[16:25:40] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (cmooney) @papaul let me know what port is used on lsw1-b8-codfw once done and I will make the Netbox changes and assign new IPs f...
[16:25:53] done, both hosts are now green.
[16:26:21] thanks for taking care of it!
[16:26:24] thanks!
[16:26:44] I can now remove the service::catalog stanza entirely, right?
https://gerrit.wikimedia.org/r/c/operations/puppet/+/988483
[16:28:03] yep
[17:58:06] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Papaul) @cmooney xe-0/0/26
[17:58:58] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE, serviceops: Test IP-renumbering on kubestage2002.codfw.wmnet - https://phabricator.wikimedia.org/T352883 (Papaul)
[19:35:26] Traffic, Data-Engineering, Movement-Insights, Patch-For-Review: Identify and label prefetch proxy data in our traffic - https://phabricator.wikimedia.org/T346463 (dr0ptp4kt) I'll amend the patch.