[08:11:23] Traffic, SRE, envoy, serviceops: Remove tls_minimum_protocol_version from envoy config - https://phabricator.wikimedia.org/T337453 (JMeybohm)
[08:11:39] Traffic, SRE, envoy, serviceops: Remove tls_minimum_protocol_version from envoy config - https://phabricator.wikimedia.org/T337453 (JMeybohm) p:Triage→Low
[08:45:08] netops, Infrastructure-Foundations, SRE, SRE-tools: Add network devices fingerprints to known_hosts - https://phabricator.wikimedia.org/T327643 (jbond) > Netbox would be better. +1 this would also allow use to have them in the netbox-hiera pipeline which in turn makes it easier to add them all to...
[08:49:56] Traffic, SRE, envoy, serviceops: Remove tls_minimum_protocol_version from envoy config - https://phabricator.wikimedia.org/T337453 (Joe) It would be great if envoy fixed the TLS 1.3 to work well when two envoys talk to each other - we should check if that's been solved in the latest versions.
[09:32:24] Traffic, SRE, envoy, serviceops: Remove tls_minimum_protocol_version from envoy config - https://phabricator.wikimedia.org/T337453 (JMeybohm) >>! In T337453#8879233, @Joe wrote: > It would be great if envoy fixed the TLS 1.3 to work well when two envoys talk to each other - we should check if tha...
[09:44:39] netops, Infrastructure-Foundations, SRE, SRE-tools: DHCP traffic to install server is missing - https://phabricator.wikimedia.org/T337345 (cmooney) >>! In T337345#8878207, @Jclark-ctr wrote: > @ayounsi the provisioning script is still failing in row e/f. dbproxy1026 dbproxy1027 I tested there...
[15:05:02] netops, Infrastructure-Foundations, SRE: Migrate row E/F network aggregation to dedicated Spine switches - https://phabricator.wikimedia.org/T322937 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=03f7b2ab-bdea-4c56-ac41-3ec30004db4a) set by cmooney@cumin1001 for 0:30:00 on 2 host(s...
[15:17:29] Traffic, MW-on-K8s, SRE, serviceops, and 2 others: Migrate group0 to Kubernetes - https://phabricator.wikimedia.org/T337490 (Clement_Goubert)
[15:17:59] Traffic, MW-on-K8s, SRE, serviceops, and 2 others: Migrate group0 to Kubernetes - https://phabricator.wikimedia.org/T337490 (Clement_Goubert) p:Triage→High
[15:18:11] Traffic, MW-on-K8s, SRE, serviceops, and 2 others: Migrate group0 to Kubernetes - https://phabricator.wikimedia.org/T337490 (Clement_Goubert)
[15:21:10] netops, Infrastructure-Foundations, SRE: Migrate row E/F network aggregation to dedicated Spine switches - https://phabricator.wikimedia.org/T322937 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=cf76e0ba-8648-48a0-beed-fe7b60f79656) set by cmooney@cumin1001 for 0:30:00 on 2 host(s...
[15:28:02] Traffic, SRE, serviceops, Platform Team Initiatives (API Gateway): Handle edge cache invalidation for the api gateway - https://phabricator.wikimedia.org/T324200 (elukey) The ML team is serving its Lift Wing model servers via the API gateway, so we'd benefit as well to have edge caching :)
[15:34:05] netops, Infrastructure-Foundations, SRE: Migrate row E/F network aggregation to dedicated Spine switches - https://phabricator.wikimedia.org/T322937 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=8f44dd48-0cac-4bfd-907a-512dfa686d40) set by cmooney@cumin1001 for 0:30:00 on 2 host(s...
[15:56:09] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[15:56:18] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo) p:Triage→High
[15:57:18] netops, Infrastructure-Foundations, SRE: Migrate row E/F network aggregation to dedicated Spine switches - https://phabricator.wikimedia.org/T322937 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=c43be552-7ced-4f58-99c1-a10b5984bf3a) set by cmooney@cumin1001 for 0:30:00 on 2 host(s...
[15:57:34] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[15:58:20] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:08:40] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:09:38] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:10:02] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:11:38] netops, Infrastructure-Foundations, SRE: Migrate row E/F network aggregation to dedicated Spine switches - https://phabricator.wikimedia.org/T322937 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=37545969-c51e-450d-9ef0-5fadfd151520) set by cmooney@cumin1001 for 0:30:00 on 3 host(s...
[16:13:02] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:13:14] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:13:43] Traffic, PyBal, Release-Engineering-Team, Scap, and 3 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:29:23] Traffic, PyBal, Release-Engineering-Team, SRE, and 4 others: High rate of errors and increased latency on uncached MediaWiki requests due to infrastructure outage - https://phabricator.wikimedia.org/T337497 (jcrespo)
[16:48:53] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Create Quality of Service design for WMF internal networks - https://phabricator.wikimedia.org/T316358 (cmooney)
[16:49:46] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Expose sub-rated circuit speeds to Homer templates - https://phabricator.wikimedia.org/T328313 (cmooney) Open→Resolved Merged and shapers set on codfw to eqsin link.
[17:06:18] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: Q2:(Need By: TBD) Rows E/F network racking task - https://phabricator.wikimedia.org/T292095 (cmooney) We migrated a bunch of network <-> network links today without issue (crossed them out in above table). Didn't touch the LVS's aft...
[17:41:42] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Migrate row E/F network aggregation to dedicated Spine switches - https://phabricator.wikimedia.org/T322937 (cmooney) Step 2 - Move CR Uplinks has now been completed. We are also 50% of the way through steps 3 and 4. Will continue with...