[07:48:42] akosiaris: Do you have any idea what might cause etags to be missing "sometimes" from responses from the REST endpoints we just re-routed? I can't figure out the pattern...
[07:48:57] Please have a look at the conversation at the bottom of https://phabricator.wikimedia.org/T374683
[09:08:04] netops, Infrastructure-Foundations, observability, Prod-Kubernetes, and 2 others: Prevent BGP alerts triggering when K8s host maintenance is being done - https://phabricator.wikimedia.org/T384731#10495846 (JMeybohm)
[09:48:36] topranks: BTW, could T384774 be related to T374401?
[09:48:37] T384774: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774
[09:48:37] T374401: Transient DOWN alert on cr2-magru - https://phabricator.wikimedia.org/T374401
[09:49:04] netops, Infrastructure-Foundations: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10496019 (Vgutierrez) This looks quite similar to T374401
[09:49:55] vgutierrez: thanks that does add more context
[09:50:27] I don't think we isolated it to the link between the CRs in magru on that older ticket, but it seems fairly likely to me it was something similar
[09:50:43] I'm gonna update JunOS on them to the latest stable and reboot them
[09:50:57] monitor how it goes from then. it did flip once since Saturday again
[09:53:18] topranks: thx <3
[10:03:21] netops, Infrastructure-Foundations: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10496093 (cmooney) >>! In T384774#10496019, @Vgutierrez wrote: > This looks quite similar to T374401 Yes. We didn't isolate things to the link between the routers on that occas...
[10:04:05] netops, Infrastructure-Foundations: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10496097 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=5e91bf43-5eb3-4e7a-9b94-8ae6c366da3f) set by cmooney@cumin1002 for 2 days, 0:00:00 on 4 host(s) and th...
[10:34:48] Traffic, Data-Engineering, Patch-For-Review: Rollout haproxykafka on all hosts - https://phabricator.wikimedia.org/T378578#10496276 (Fabfur)
[10:56:48] netops, Infrastructure-Foundations: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10496377 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=400d1435-bc43-43ba-813d-df761b30f623) set by cmooney@cumin1002 for 2:00:00 on 2 host(s) and their serv...
[11:26:03] volans: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1114352 requires a spicerack.service.serviceLVS update, right?
[11:27:23] vgutierrez: in a meeting, will check in a few minutes
[11:27:31] sure, no rush
[11:29:38] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 23.4R2 - https://phabricator.wikimedia.org/T364092#10496529 (cmooney)
[11:30:26] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 23.4R2 - https://phabricator.wikimedia.org/T364092#10496532 (cmooney)
[11:30:58] vgutierrez: correct, you need something like https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/975273
[11:51:37] netops, Infrastructure-Foundations: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10496657 (cmooney) Ok both routers have been upgraded to JunOS 23.4R2 and reset. I've extended the downtime until Wednesday morning. Let's make a call in 24h on whether they...
[12:31:21] netops, Infrastructure-Foundations, ops-magru: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10496715 (RobH)
[14:52:19] Traffic: Consider using HTTPS RR for canonical domains - https://phabricator.wikimedia.org/T384839 (Vgutierrez) NEW
[14:55:34] Traffic: Consider using HTTPS RR for canonical domains - https://phabricator.wikimedia.org/T384839#10497257 (Vgutierrez) p:Triage→Medium
[14:59:39] Traffic: Consider using HTTPS RR for canonical domains - https://phabricator.wikimedia.org/T384839#10497281 (ssingh) An example of such a record looks like: ` dig defo.ie TYPE65 +short 1 . ipv4hint=213.108.108.101 ech=AED+DQA8SwAgACDVz+ZgifkNNlAAO8ptWxnOJWppLWLX3izYexys+rPSYgAEAAEAAQANY292ZXIuZGVmby5pZQAA i...
[20:21:34] Traffic, Commons, MediaWiki-Uploading, SRE: HTTP 503 error when uploading images on Wikimedia Commons - https://phabricator.wikimedia.org/T383274#10498666 (RLazarus)