[05:40:43] netops, Infrastructure-Foundations, SRE: Configure bgp-error-tolerance on Juniper routers - https://phabricator.wikimedia.org/T340111 (akosiaris) FYI, same mitigation applies to https://supportportal.juniper.net/s/article/2023-08-29-Out-of-Cycle-Security-Bulletin-Junos-OS-and-Junos-OS-Evolved-A-craft...
[06:33:51] netops, Infrastructure-Foundations, SRE: xe-3/2/1: down -> Transport: cr1-esams:xe-0/0/7 (Lumen, BDFS2448 80ms 10Gbps wave) {#2013} - https://phabricator.wikimedia.org/T345138 (ops-monitoring-bot) ===== Automated diagnostic for Netbox circuit ID 33 --- **Interface cr1-esams:xe-0/0/7** - admin-status...
[06:35:50] netops, Infrastructure-Foundations, SRE: xe-3/2/1: down -> Transport: cr1-esams:xe-0/0/7 (Lumen, BDFS2448 80ms 10Gbps wave) {#2013} - https://phabricator.wikimedia.org/T345138 (ayounsi) Open→Resolved a:ayounsi RFO sent by email.
[08:34:44] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Configure bgp-error-tolerance on Juniper routers - https://phabricator.wikimedia.org/T340111 (cmooney) Agreed this seems to make sense, and Juniper are advising it: https://supportportal.juniper.net/s/article/2023-08-29-Out-of-Cycle-Secu...
[09:58:11] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 2 others: Move cloud vps ns-recursor IPs to host/row-independent addressing - https://phabricator.wikimedia.org/T307357 (aborrero)
[10:08:28] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322 (ayounsi) @cmooney I came across https://www.juniper.net/documentation/us/en/software/junos/interfaces-telemetry/topics/ref/statement/...
[10:10:20] netops, Infrastructure-Foundations, SRE: Configure bgp-error-tolerance on Juniper routers - https://phabricator.wikimedia.org/T340111 (ayounsi) Open→Resolved a:ayounsi All done.
[10:10:24] netops, Infrastructure-Foundations, SRE: Configure bgp-error-tolerance on Juniper routers - https://phabricator.wikimedia.org/T340111 (ayounsi) Relevant: https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
[10:46:01] netops, Infrastructure-Foundations, SRE: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (ayounsi) Resolved→Open Re-opening as the fasw got upgraded since, so we can enable `mgmt_junos`
[10:56:47] Traffic, SRE: Varnish mobile redirection misses some sites - https://phabricator.wikimedia.org/T344175 (Fabfur) a:Fabfur
[12:03:09] netops, Infrastructure-Foundations, SRE: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (ayounsi) Open→Resolved Nevermind, still doesn't work on the fasw.
[12:12:47] netops, Infrastructure-Foundations, SRE, SRE-tools: Setup zero touch provisioning (ZTP) for network devices - https://phabricator.wikimedia.org/T336485 (ayounsi) Before running homer, the cookbook needs to call the `sre.network.tls` cookbook with the device's name as parameter to add the TLS cert...
[13:36:12] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host doh6001.wikimedia.org with OS bookworm
[13:38:16] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add per-output queue monitoring for Juniper network devices - https://phabricator.wikimedia.org/T326322 (ayounsi) I rolled the certificate to all the cloudsw, cr, and asw devices. I enabled gnmic on all the cloudsw and asw devices. I conf...
[14:14:45] Traffic, SRE: Varnish mobile redirection misses some sites - https://phabricator.wikimedia.org/T344175 (Fabfur) Hi, thanks for reporting this! Can you provide us a full list of sites that shows this behavior?
It would be extremely helpful to us to patch that regex(es)... Thanks a lot!
[14:24:15] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host doh6001.wikimedia.org with OS bookworm completed: - doh6001 (**PASS**) - Downtimed on Icinga/Al...
[15:08:22] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host doh4001.wikimedia.org with OS bookworm
[15:34:48] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ssingh)
[15:49:39] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host doh4001.wikimedia.org with OS bookworm completed: - doh4001 (**PASS**) - Downtimed on Icinga/Al...
[15:51:45] netops, Infrastructure-Foundations, SRE: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney) p:Triage→Medium
[15:52:04] netops, Infrastructure-Foundations, SRE: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney)
[15:52:09] netops, Infrastructure-Foundations, SRE, SRE-tools, Patch-For-Review: Setup zero touch provisioning (ZTP) for network devices - https://phabricator.wikimedia.org/T336485 (cmooney)
[15:53:20] netops, Infrastructure-Foundations, SRE: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney)
[15:59:02] netops, Infrastructure-Foundations, SRE: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney)
[16:02:48] netops, Infrastructure-Foundations, SRE: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney)
[16:12:15] netops, Infrastructure-Foundations, SRE: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney)
[16:57:06] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (ayounsi) Could we use `forward-only` everywhere once we move to DHCP option 97 with {T304677} ?
[17:01:41] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Juniper ZTP fails on certain devices due to DHCP binding on management router - https://phabricator.wikimedia.org/T345273 (cmooney) >>! In T345273#9131609, @ayounsi wrote: > Could we use `forward-only` everywhere once we move to DHCP opti...
[17:19:56] Traffic, Movement-Insights, SRE: Varnish mobile redirection misses some sites - https://phabricator.wikimedia.org/T344175 (nshahquinn-wmf) a:Fabfur→nshahquinn-wmf Yes, definitely. It might take me a few days since I accidentally deleted the code that I used to get the list 😅, but it won't be...
[17:20:24] Traffic, Movement-Insights, SRE: Varnish mobile redirection misses some sites - https://phabricator.wikimedia.org/T344175 (nshahquinn-wmf) p:Triage→Medium
[17:22:12] Traffic, Movement-Insights, SRE: Varnish mobile redirection misses some sites - https://phabricator.wikimedia.org/T344175 (Fabfur) If you want in the meantime we can start with this first list of domains and then add the others
[17:29:31] Traffic, Movement-Insights, SRE: Varnish mobile redirection misses some sites - https://phabricator.wikimedia.org/T344175 (nshahquinn-wmf) @Fabfur I don't think there's any reason to do that. It will be easier for you to do it all at once, and it's already been like this for years without causing any...
[17:46:10] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host doh2001.wikimedia.org with OS bookworm
[17:47:59] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ssingh)
[18:19:49] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host doh2001.wikimedia.org with OS bookworm completed: - doh2001 (**PASS**) - Downtimed on Icinga/Al...
[21:58:42] Traffic, Data-Engineering-Icebox, SRE, WMF-General-or-Unknown, Developer Productivity: Requests for /static get an invalid WMF-Last-Access cookie for wikipedia.org on non-Wikipedia requests - https://phabricator.wikimedia.org/T261803 (Krinkle)
[21:59:05] Traffic, Data-Engineering-Icebox, SRE, WMF-General-or-Unknown, and 2 others: Requests for /static get an invalid WMF-Last-Access cookie for wikipedia.org on non-Wikipedia requests - https://phabricator.wikimedia.org/T261803 (Krinkle)