[04:53:22] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Migrate codfw servers in rows C & D from legacy ASW to LSW - https://phabricator.wikimedia.org/T370630#10098595 (Marostegui)
[06:33:53] netops, Infrastructure-Foundations: Apply egress Source Address Validation on the Wikimedia core routers - https://phabricator.wikimedia.org/T372158#10098689 (ayounsi) > I'm wondering if we can have Homer populate a prefix list instead. That's always an option, it of course comes down to how much more co...
[07:18:35] netops, Infrastructure-Foundations: Publish, and maintain ASPA records for valid AS14907 upstreams - https://phabricator.wikimedia.org/T372161#10098724 (ayounsi) > Ta da: https://wikitech.wikimedia.org/w/index.php?title=Adding_and_removing_transit_providers&diff=2218856&oldid=2042295. Can you verify this...
[07:53:50] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10098761 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jayme@cumin1002 from kubernetes200...
[07:54:46] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10098762 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jayme@cumin1002 for host wiki...
[08:08:40] FIRING: SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:46:10] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10098906 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host wikikube...
[09:06:10] RESOLVED: SystemdUnitFailed: generate_vrts_aliases.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:20:22] XioNoX: I think if we merge this first: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1066708 then it's just creating a new idp-test VM and adding it in site.pp
[10:20:50] XioNoX: The IDP servers do use public IPs, I don't know if that ruins the test for you?
[10:33:30] Anyone feeling up for a CAS review: https://gerrit.wikimedia.org/r/c/operations/software/cas-overlay-template/+/1064354 you can't break anything
[10:43:04] slyngs: https://phabricator.wikimedia.org/T362330 :)
[11:09:17] Always one step ahead of me :-)
[11:10:18] it would be a good test actually
[11:12:52] Good, until I get Redis working, the hosts are pretty much independent, so we can just create as many as we'd like and not affect anything
[11:15:08] If you'd give this a quick review: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1066708 then we can do the puppet change for the routed one
[11:35:17] netbox, Infrastructure-Foundations, Patch-For-Review: Netbox: basic change rollback - https://phabricator.wikimedia.org/T310589#10099380 (cmooney) Nice work! >>! In T310589#10090992, @ayounsi wrote: > I still think such script can be useful as long as the limitations are known, dry-run is used first...
[12:31:17] o/
[13:06:10] FIRING: [2x] SystemdUnitFailed: sync-puppet-ca.service on puppetserver1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:13:40] RESOLVED: [2x] SystemdUnitFailed: sync-puppet-ca.service on puppetserver1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:55:18] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10099911 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host w...
[15:33:34] topranks: fyi, after the ping from sukhe about the noisy alerting, I cleaned up and improved the RPKI dashboard: https://grafana.wikimedia.org/d/UwUa77GZk/rpki?orgId=1
[15:34:06] oh nice - looks great!
[15:34:07] one major change is that instead of showing all the rrdp and rsync failures, it shows a % of failure
[15:34:20] thanks XioNoX!
[15:40:51] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100199 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikik...
[15:51:18] XioNoX: I have a bunch of changes when running homer on lsw1-b6-codfw that are not just adding the new neighbour, can I trouble you for a look?
[15:51:44] it's scheduler-maps and dscp classifiers
[15:53:56] https://phabricator.wikimedia.org/P68064 is the additional diff
[15:58:08] oh looks like something topranks merged today on homer
[15:58:23] https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1052167
[15:58:28] can I apply?
[16:04:57] claime: hey
[16:05:26] yep feel free to apply that, I'm in the process of pushing it out across all our devices
[16:05:28] I think, reading the commit message, they're just definitions and not used and I can merge?
[16:05:31] All right :D
[16:05:32] claime: apply and blame topranks later. you have my +1. /me hides
[16:05:33] I'm at lsw1-b5 now actually :P
[16:05:50] you can skip b6 then :D
[16:05:53] and yes - they are just definitions, there is another toggle to actually apply them to ints
[16:06:02] claime: super, thanks :)
[16:13:38] Arzhel if you're still online I'm not sure why this graph isn't showing more connections on rpki2003
[16:13:42] https://grafana.wikimedia.org/goto/ocxhfT3Ig?orgId=1
[16:13:59] it spooked me a bit so I checked on all the CRs and the session is up on all of them, so I think nothing to worry about
[16:26:23] topranks: yeah it's on one of my tabs to check for later. I saw it increased and thought it was just taking its time
[16:36:11] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100393 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by cgoubert@cumin1002 pool fo...
[16:53:11] I restarted routinator just in case and will dig deeper if it still shows less than expected
[16:56:29] `routinator_rtr_current_connections 16` all good :)
[17:04:44] hmm yeah ok :)
[17:24:57] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100621 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host...
[18:04:39] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100785 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wiki...
[18:10:55] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100821 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2294 to...
[18:15:16] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100846 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host...
[18:59:58] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100972 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wiki...
[21:21:45] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10101297 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by swfrench@cumin2002 from kubernetes...
[21:23:44] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10101298 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by swfrench@cumin2002 for host w...
[22:11:53] netops, Infrastructure-Foundations, serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10101358 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by swfrench@cumin2002 for host wikik...