[06:37:06] netops, Infrastructure-Foundations, SRE: Automate L3 Switch to Core Router BGP peerings (and remove OSPF on drmrs switches) - https://phabricator.wikimedia.org/T349125 (ayounsi) Isn't OSPF required there to benefit from the end to end link cost calculations (eg. draining a transport link)?
[08:12:43] netops, Infrastructure-Foundations, SRE: Bring Juniper switches in eqiad racks E5-7 and F5-7 online and ready for servers - https://phabricator.wikimedia.org/T334230 (ayounsi)
[08:12:51] netops, Infrastructure-Foundations, SRE: Put Dell SONiC switches in production - https://phabricator.wikimedia.org/T335028 (ayounsi)
[08:12:59] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: Cabling for Eqiad racks E5-8 and F5-8 - https://phabricator.wikimedia.org/T334231 (ayounsi) Resolved→Open a:cmooney→Jclark-ctr I can't get the links to the Dell switches up, only looking at lsw1-e8 for now it seems li...
[08:16:07] netops, Infrastructure-Foundations, SRE: Put Dell SONiC switches in production - https://phabricator.wikimedia.org/T335028 (ayounsi)
[08:35:17] XioNoX: BTW, pwru (the tool to debug XDP by Cilium) is pretty nice
[08:36:05] XioNoX: I'm using it right now to debug some issues I'm seeing with my attempt to implement outer IP randomization for ipvs IPIP encapsulation
[08:37:00] vgutierrez: not sure it's worth the effort though. I don't think having a fixed source IP is an issue in our infra
[08:37:47] XioNoX: hmm we should check how broadcom NICs balance packets to several queues
[08:38:09] if it's using the L4 data.. cp servers could have some issues
[08:38:49] ah yeah, I was only talking about the middle part, the network :)
[08:38:56] LOL
[08:39:13] dunno how the servers behave
[08:39:30] at least katran does it to help NICs steer packets to multiple queues
[08:39:53] does it randomize it per flow or per packet?
[08:40:08] like multiple packets from the same session might end up on multiple queues?
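For reference, a per-flow scheme of the kind being asked about can be sketched as follows. This is only an illustration, assuming the outer source address is derived from a hash of the inner source IP and source port; the 172.16.0.0/12 pool, the function name and the hash choice are hypothetical and are not taken from the eBPF code discussed here or from katran.

```python
import hashlib
import ipaddress

# Assumed pool for outer source addresses (hypothetical, for illustration only).
OUTER_POOL = ipaddress.ip_network("172.16.0.0/12")


def outer_source_ip(inner_src_ip: str, inner_src_port: int) -> ipaddress.IPv4Address:
    """Map (inner source IP, inner source port) to a stable outer source IP.

    The same flow always hashes to the same address, so every packet of a
    session keeps the same outer header and lands on the same NIC queue.
    """
    # Build the hash key from the packed IP bytes plus the 16-bit port.
    key = ipaddress.ip_address(inner_src_ip).packed + inner_src_port.to_bytes(2, "big")
    digest = hashlib.sha256(key).digest()
    # Reduce the first 32 bits of the digest to an offset inside the pool.
    offset = int.from_bytes(digest[:4], "big") % OUTER_POOL.num_addresses
    return OUTER_POOL[offset]


if __name__ == "__main__":
    # Two packets of the same flow yield the same outer source address.
    print(outer_source_ip("192.168.42.1", 52138))
    print(outer_source_ip("192.168.42.1", 52138))
```

Because the mapping depends only on values that stay constant for the lifetime of a connection, it spreads distinct flows across outer addresses without reordering packets within a flow.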
[08:40:24] nope, per flow
[08:40:30] just based on source ip and source port
[08:40:41] so as those stay steady you get the same randomized IP
[08:43:43] ok, nice
[08:44:10] they could even just copy the inner source IP to the outer source IP
[08:44:21] for our usecase it would work fine
[08:54:42] LOL.. I'm wondering if my code is working but the real server is just ignoring the packet cause 172.16.0.0/0 isn't a valid source for them
[09:16:11] nah.. good old tcpdump showed the issue
[09:16:35] IP 172.16.99.165 > 192.168.42.100: IP 192.168.42.1.52138 > 70.213.10.10.80
[09:16:56] for some reason the destination IP of the inner IP header is being overwritten
[09:23:11] https://www.irccloud.com/pastebin/IqTToSDL/
[09:24:21] ^^ XioNoX sample of pwru output.. you can see how the kernel gets the original packet from curl.. requesting http://10.10.10.10:80/.. how ipvs encapsulates that in an IPIP packet towards the realserver (192.168.42.100) and then my tc action written in eBPF replaces the source IP
[09:25:23] that's neat
[09:29:54] kind request to review https://gerrit.wikimedia.org/r/c/operations/dns/+/965119/ when any of you has a couple of spare minutes
[12:48:11] Traffic, DNS, Infrastructure-Foundations, SRE, and 2 others: SVC DNS zonefiles and source of truth - https://phabricator.wikimedia.org/T270071 (ayounsi)
[12:57:12] Traffic, DNS, Infrastructure-Foundations, SRE, and 2 others: SVC DNS zonefiles and source of truth - https://phabricator.wikimedia.org/T270071 (ayounsi) Let's move all the A/AAAA SVC records to Netbox. And keep the CNAMEs in the DNS repo if we can't get rid of them. Then have follow up tasks to...
[13:54:12] hello! I have bad news and good news. The bad news is that I'm back with another change (please!)
https://gerrit.wikimedia.org/r/c/operations/puppet/+/966851 - the good news is that this is the second-to-last AQS2 service
[13:54:29] queries to the gateway look like `curl -H "Host: wikimedia.org" https://rest-gateway.discovery.wmnet:4113/wikimedia.org/v1/metrics/editors/top-by-edits/sw.wikipedia/all-editor-types/all-page-types/2018/01/01`
[13:57:55] I can handle it
[13:59:53] thanks!
[14:03:18] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1108.eqiad.wmnet with OS bullseye
[14:09:40] looks good to me
[14:10:43] fabfur: cool, thank you! I'll merge and do the usual rollout dance after I make a coffee
[14:11:05] godspeed and good coffee!
[14:31:53] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1114.eqiad.wmnet with OS bullseye
[14:36:25] looks okay to me in cp2037
[14:37:26] ack
[14:37:41] going to repool and enable puppet
[14:43:07] all done, thanks!
[14:43:37] :)
[14:52:03] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1108.eqiad.wmnet with OS bullseye completed: - cp1108 (**PASS**) - Removed from Puppet...
[14:56:02] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[15:03:15] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1111.eqiad.wmnet with OS bullseye
[15:05:02] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1110.eqiad.wmnet with OS bullseye
[15:07:18] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1114.eqiad.wmnet with OS bullseye completed: - cp1114 (**PASS**) - Removed from Puppet...
[15:08:03] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[15:10:11] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1107.eqiad.wmnet with OS bullseye
[15:26:23] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1106.eqiad.wmnet with OS bullseye
[15:29:49] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1105.eqiad.wmnet with OS bullseye
[15:45:50] netops, DC-Ops,
Infrastructure-Foundations, SRE, ops-eqiad: Cabling for Eqiad racks E5-8 and F5-8 - https://phabricator.wikimedia.org/T334231 (Jclark-ctr) Unsure if port is turned off or if fs dell optics are not compatible. I put loopback on optic in dell switch and link did not come up
[15:46:28] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1111.eqiad.wmnet with OS bullseye completed: - cp1111 (**PASS**) - Removed from Puppet...
[15:48:31] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[15:50:33] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1103.eqiad.wmnet with OS bullseye
[15:51:01] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1107.eqiad.wmnet with OS bullseye completed: - cp1107 (**WARN**) - Downtimed on Icinga/...
[15:51:35] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[15:57:32] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1110.eqiad.wmnet with OS bullseye executed with errors: - cp1110 (**FAIL**) - Removed f...
[16:02:30] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1106.eqiad.wmnet with OS bullseye completed: - cp1106 (**PASS**) - Removed from Puppet...
[16:07:05] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[16:07:34] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1105.eqiad.wmnet with OS bullseye completed: - cp1105 (**PASS**) - Removed from Puppet...
[16:08:26] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[16:19:53] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1110.eqiad.wmnet with OS bullseye
[16:22:54] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1100.eqiad.wmnet with OS bullseye
[16:23:23] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1101.eqiad.wmnet with OS bullseye
[16:25:34] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1102.eqiad.wmnet with OS bullseye
[16:28:39] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1103.eqiad.wmnet with OS bullseye completed: - cp1103 (**PASS**) - Removed from Puppet...
[16:29:57] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[16:56:09] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1102.eqiad.wmnet with OS bullseye completed: - cp1102 (**PASS**) - Removed from Puppet...
[17:00:53] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[17:01:22] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1101.eqiad.wmnet with OS bullseye completed: - cp1101 (**PASS**) - Removed from Puppet...
[17:04:07] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[17:04:11] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1100.eqiad.wmnet with OS bullseye completed: - cp1100 (**PASS**) - Removed from Puppet...
[17:05:14] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1104.eqiad.wmnet with OS bullseye
[17:12:22] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1110.eqiad.wmnet with OS bullseye executed with errors: - cp1110 (**FAIL**) - Removed f...
[17:25:15] netops, Infrastructure-Foundations, SRE: Automate L3 Switch to Core Router BGP peerings (and remove OSPF on drmrs switches) - https://phabricator.wikimedia.org/T349125 (cmooney) >>! In T349125#9260678, @ayounsi wrote: > Isn't OSPF required there to benefit from the end to end link cost calculations (...
[17:34:45] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1110.eqiad.wmnet with OS bullseye
[18:06:02] Traffic, DNS, SRE, Patch-For-Review: Update DNS records for Greenhouse - https://phabricator.wikimedia.org/T348335 (ssingh) Open→Resolved We have updated the DNS records for Greenhouse, confirmed email delivery including 'reply-to' and checklist on the Greenhouse web interface. Marking th...
[18:06:08] Traffic, DNS, SRE, Patch-For-Review: Update DNS records for Greenhouse - https://phabricator.wikimedia.org/T348335 (ssingh) For posterity: we are now using `gh-mail.wikimedia.org` for the Greenhouse mails.
[18:17:24] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host dns6001.wikimedia.org with OS bookworm
[19:00:16] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1104.eqiad.wmnet with OS bullseye completed: - cp1104 (**PASS**) - Removed from Puppet...
[19:00:20] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1110.eqiad.wmnet with OS bullseye completed: - cp1110 (**WARN**) - Removed from Puppet...
[19:00:50] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr)
[19:01:06] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (Jclark-ctr) Open→Resolved
[19:16:58] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host dns6001.wikimedia.org with OS bookworm completed: - dns6001 (**PASS**) - Downtimed on Icinga/Al...
[19:34:11] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)
[20:33:42] Traffic, SRE: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (Fabfur)
[20:34:15] Traffic, SRE: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (Fabfur)