[07:51:31] Traffic: puppet restarts nginx instead of reloading it on ncredir instances - https://phabricator.wikimedia.org/T383599#10456871 (Vgutierrez) p:Low→Triage thanks for working on this @BCornwall
[09:44:24] netops, Ceph, Infrastructure-Foundations, SRE, Patch-For-Review: Configure DSCP marking for cloudceph* hosts - https://phabricator.wikimedia.org/T371501#10457073 (cmooney) >>! In T371501#10453986, @dcaro wrote: > We still have to restart all the osd daemon processes to pick up the config chan...
[09:47:38] FIRING: [4x] LVSRealserverMSS: Unexpected MSS value on 208.80.153.240:443 @ cp2028 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=codfw&var-cluster=cache_upload - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[09:48:19] ^^ that's a side effect of upgrading haproxy
[09:52:38] RESOLVED: [4x] LVSRealserverMSS: Unexpected MSS value on 208.80.153.240:443 @ cp2028 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=codfw&var-cluster=cache_upload - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[09:59:23] netops, Infrastructure-Foundations: Publish, and maintain ASPA records for valid AS14907 upstreams - https://phabricator.wikimedia.org/T372161#10457124 (cmooney) Thanks for keeping up to date on this @Southparkfan! >>! In T372161#10449848, @Southparkfan wrote: > I understand hosting our own CA setup is...
[10:37:33] Traffic: Upgrade haproxy to 2.8.13 on cp hosts - https://phabricator.wikimedia.org/T383111#10457218 (Vgutierrez)
[10:39:13] Traffic, PyBal: check_pybal_ipvs_diff crashes if a pooled realserver is missing its PTR record - https://phabricator.wikimedia.org/T383661 (Vgutierrez) NEW
[10:53:53] Traffic, PyBal: check_pybal_ipvs_diff crashes if a pooled realserver is missing its PTR record - https://phabricator.wikimedia.org/T383661#10457286 (Vgutierrez) p:Triage→Medium
[11:09:23] Traffic, SRE, Patch-For-Review: varnishmtail metric loss due to mtail not reading from pipe fast enough - https://phabricator.wikimedia.org/T293879#10457334 (fgiunchedi) I'm untagging o11y here since things seem stable and there's no action ATM, please reach out if things change!
[11:25:39] netops, Infrastructure-Foundations: Multiple unreachable hosts in eqiad - https://phabricator.wikimedia.org/T382772#10457362 (cmooney) Open→Resolved I've tried to work out what went on here but wasn't really able to find anything. The common factor is //cr1-eqiad//, which connects to cloudsw...
[11:36:35] netops, Hiddenparma, Infrastructure-Foundations, Prod-Kubernetes, Kubernetes: Allow reaching services on the aux k8s cluster bypassing the CDN - https://phabricator.wikimedia.org/T382269#10457395 (cmooney) >>! In T382269#10421893, @akosiaris wrote: > Calico Open Source version doesn't support...
[15:24:32] netops, Hiddenparma, Infrastructure-Foundations, Prod-Kubernetes, Kubernetes: Allow reaching services on the aux k8s cluster bypassing the CDN - https://phabricator.wikimedia.org/T382269#10458292 (CDanis) I am wondering if we really need the ability to expose aux services with public IPs. In...
[18:18:31] Traffic, SRE, Patch-For-Review: Define a schema for analytics pipeline ingestion - https://phabricator.wikimedia.org/T383392#10459472 (nshahquinn-wmf) Viewing and editing this task is not actually restricted.
[18:20:49] netops, Ceph, Infrastructure-Foundations, SRE, Patch-For-Review: Configure DSCP marking for cloudceph* hosts - https://phabricator.wikimedia.org/T371501#10459488 (dcaro) Just finished restarting all the osd daemons, all the traffic should now being tagged correctly 👍
[19:57:11] Traffic, DNS, MediaWiki-Platform-Team, SRE, SUL3: Set up auth.wikimedia.org - https://phabricator.wikimedia.org/T377187#10460143 (Tgr)
[20:31:43] Traffic, DNS, MediaWiki-Platform-Team, SRE, SUL3: Set up auth.wikimedia.org - https://phabricator.wikimedia.org/T377187#10460255 (Tgr) Open→Resolved Working as expected: * https://auth.wikimedia.org/enwiki/wiki/Special:UserLogin, https://auth.wikimedia.org/dewiki/wiki/Special:Use...
[21:50:14] netops, Ceph, Infrastructure-Foundations, SRE, Patch-For-Review: Configure DSCP marking for cloudceph* hosts - https://phabricator.wikimedia.org/T371501#10460515 (cmooney) >>! In T371501#10459488, @dcaro wrote: > Just finished restarting all the osd daemons, all the traffic should now being t...
[22:18:33] netops, Infrastructure-Foundations, observability, Observability-Alerting, SRE: Alertmanager rule for network interface errors? - https://phabricator.wikimedia.org/T335350#10460558 (cmooney) Open→Resolved a:cmooney >>! In T335350#10456238, @andrea.denisse wrote: > Hi @cmooney,...
[23:15:05] Traffic, Data-Engineering, Experimentation Lab Radar: Cookie % has been rejected because it is foreign and does not have the "Partitioned" attribute - https://phabricator.wikimedia.org/T375256#10460720 (VirginiaPoundstone)
[23:15:13] Traffic, Data-Engineering, Data-Engineering-Radar, Observability-Logging, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#10460721 (VirginiaPoundstone)