[02:05:20] FIRING: [3x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in 6d 11h 49m 25s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[06:05:20] FIRING: [3x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in 6d 7h 49m 25s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[07:28:58] oh yes we know!
[08:06:05] netops, Infrastructure-Foundations, Patch-For-Review: Enable gNMI on SRX devices and fasw - https://phabricator.wikimedia.org/T390052#11859386 (ayounsi) Resolved→Open There is some hope that Junos 25.4R1 comes with gNMI support - https://apps.juniper.net/feature-explorer/select-platform.html?...
[08:43:08] netops, Infrastructure-Foundations: Upgrade netflow hosts to Trixie - https://phabricator.wikimedia.org/T424478 (ayounsi) NEW p:Triage→Low
[10:05:20] FIRING: [3x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in 6d 3h 49m 25s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[10:52:46] netops, Infrastructure-Foundations: Upgrade netflow hosts to Trixie - https://phabricator.wikimedia.org/T424478#11860419 (ayounsi)
[13:21:31] netops, Infrastructure-Foundations: mr1-eqiad: move from OSPF to BGP - https://phabricator.wikimedia.org/T421238#11860895 (cmooney) Open→Resolved
[13:46:48] netops, Infrastructure-Foundations, Patch-For-Review: Enable gNMI on SRX devices and fasw - https://phabricator.wikimedia.org/T390052#11861075 (cmooney) >>! In T390052#11859386, @ayounsi wrote: > Now we need to figure out if it's worth upgrading the management routers or not, as it's more recent that...
[13:53:24] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11861105 (Jclark-ctr) For Eqiad, I would choose A3, C1, and either E8 or F8. A3 is currently 1G, and C1 is pending the arrival of new switches. It was previously out for fundraising.
[14:05:20] FIRING: [3x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in 5d 23h 49m 25s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[14:51:25] netbox, netops, SRE-tools, bacula, and 2 others: netbox2003 backups (maybe others?) are missconfigured or failing to find the backup directory - https://phabricator.wikimedia.org/T423689#11861581 (jcrespo) Open→Resolved Resolving unless issues are seen after deployment- reopen if anyt...
[15:06:20] netops, Infrastructure-Foundations: Upgrade netflow hosts to Trixie - https://phabricator.wikimedia.org/T424478#11861669 (ayounsi)
[15:08:38] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11861674 (ayounsi)
[15:08:43] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11861676 (ayounsi)
[15:09:22] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11861677 (ayounsi) Great, thanks ! Task description updated.
[15:10:49] topranks: fyi, I've upgraded gnmic to 0.45.0 in ulsfo, magru and one of the two codfw servers, I'll roll it to all of them tomorrow if no issues
[15:11:54] ok nice!
[15:12:41] topranks: https://debmonitor.wikimedia.org/packages/gnmic for a quick overview
[15:23:34] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11861745 (ayounsi)
[15:52:22] topranks: XioNoX: quick question. what's your go-to command for checking for BGP sessions in a routed Ganeti setup? essentially I am running and parsing `show route detail` and then parsing the next-hop and all that. how do I quickly verify though that for example a VIP is being correctly advertised by the Ganeti?
[15:53:56] sukhe: I'd normally alternate between these commands
[15:54:03] show bgp summary group Ganeti[4|6]
[15:54:21] show route receive-protocol bgp
[15:56:30] sukhe: for juniper anyway that's it, Nokia isn't quite as neat to do, or at least couldn't find anything
[15:56:32] oh interesting, OK makes sense, the two step approach
[15:57:28] one shows the status of the hosts in the group, and their IP, next needs the IP hence I typically run it
[15:57:43] Nokia there is no good way to show the group peers alone. if you know the host IP you can run:
[15:58:02] show network-instance [default|PRODUCTION] protocols bgp neighbor received-routes [ipv4|ipv6]
[15:59:24] noted thanks
[15:59:43] interestingly enough, we brought up doh500[34] and while they seem to be announcing the IPs from the hosts themselves
[15:59:49] I don't see the traffic in the path
[16:00:15] sukhe@cr2-eqsin> show route 185.71.138.138 | match 103.102.166.14
[16:00:20] correct
[16:00:24] also .5 for doh5002
[16:00:42] but nothing for 103.102.166.98 [doh5003] or 103.102.166.99 [doh5004]
[16:00:46] so I wonder what's up there
[16:01:08] root@doh5003:~# cat /etc/bird/anycast-prefixes.conf
[16:01:11] 185.71.138.138/32
[16:01:30] anyway, not urgent sorry, but just putting this here in case I am missing something
[16:06:51] sukhe: the switches will prefer the route from doh5001 which it is learning direct (single ASN in the path), rather than those from routed ganeti nodes (two ASNs in the path)
[16:08:30] easiest is probably just to wait till we move the others over to routed ganeti too
[16:08:39] aah ok! thank you for clarifying
[16:08:45] moritzm: +1 on moving ahead with the decomm then
[16:08:49] otherwise we could do some custom policy - either outbound on the doh nodes on the older setup, or inbound for them on the CRs
[16:09:07] if you want to test/validate the new doh is ok before moving ahead we can add a static route so some traffic will hit it to check
[16:09:27] topranks: yeah not required I think, I just wanted to understand why I am not seeing a path to the new VMs at all
[16:09:32] but it makes sense now after you told me why
[16:09:49] I thought at least I would see the path in the output, even if it's not the path it will take
[16:11:58] you'll see it alright:
[16:12:04] https://www.irccloud.com/pastebin/5djfrOK9/
[16:12:17] "terse" is nice to see the as path clearly on JunOS
[16:12:18] aaah terse
[16:12:26] thanks for putting up with me topranks :)
[16:12:26] it's there without it but less clear
[16:12:28] * sukhe notes
[16:14:58] sukhe: thanks for confirming, I'll decom these tomorrow morning
[17:48:25] topranks: looks like your GRE tunnel was not done for nothing :) https://librenms.wikimedia.org/device/device=2/tab=port/port=83462/
[17:55:52] Puppet, SRE, Readers Essential Work (Simplify MobileFrontend): Certain mobile devices are (possibly) not being redirected to our mobile site - https://phabricator.wikimedia.org/T388032#11862498 (Jdlrobson-WMF) Stalled→Declined Declining for now.
[18:05:20] FIRING: [3x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in 5d 19h 49m 25s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[18:13:22] XioNoX: haha ok, I told them I wanted a longer outage though hardly seems worth it :)
[22:05:20] FIRING: [3x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in 5d 15h 49m 25s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry