[01:13:58] Traffic: Misleading error message when accessing an invalid URL at upload.wikimedia.org - https://phabricator.wikimedia.org/T381232#10573983 (Pppery)
[09:11:09] FIRING: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[09:16:09] RESOLVED: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[09:33:05] netops, Infrastructure-Foundations, SRE: gNMIc connection not working for cloudsw2-d5-eqiad - https://phabricator.wikimedia.org/T387018#10574276 (ayounsi) The switch is running a too old junos version for `analytics-agent`. I tried `cloudsw2-d5-eqiad> restart SDN-Telemetry gracefully` instead, but th...
[09:51:09] netops, Infrastructure-Foundations, SRE: gNMIc connection not working for cloudsw2-d5-eqiad - https://phabricator.wikimedia.org/T387018#10574369 (cmooney) >>! In T387018#10574276, @ayounsi wrote: > The switch is running a too old junos version for `analytics-agent`. I tried `cloudsw2-d5-eqiad> restar...
[09:59:04] netops, Infrastructure-Foundations, SRE: gNMIc connection not working for cloudsw2-d5-eqiad - https://phabricator.wikimedia.org/T387018#10574426 (ayounsi) Enabling traceoptions shows a `no shared cipher` error on the switch : ` Feb 24 09:33:58 ssl_transport_security.c:948: Handshake failed with fatal...
[12:12:48] FIRING: PuppetFailure: Puppet has failed on cp4037:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:54:18] o/ I have another migration to remove restbase from the path to hit mobileapps/page content service for hewiki I'd like to deploy today if suitable. testwiki is already migrated and looks fine but obviously gets a lot less API traffic :)
[12:54:22] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1117508
[12:57:48] RESOLVED: PuppetFailure: Puppet has failed on cp4037:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[13:39:13] * vgutierrez looking
[13:42:23] hnowlan: looks good from our side
[13:53:26] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru errors on xe-0/1/0 (EdgeUno Transit) - https://phabricator.wikimedia.org/T387006#10575129 (ayounsi) > Our datacenter engineering team has concluded the on-site activity, and no problems were found on our side. Could you please confirm...
[14:11:38] Dear lovely traffic team, I'm working on some major changes on thumbnailing on mediawiki that would make the values steps (T360589). Basically these will be the steps: [20, 40, 60, 120, 250, 330, 500, 960] and if a user requests a value in between in mediawiki, mediawiki parses and turns the upload.wikimedia.org url to the nearest value from the steps (that is larger) and then downsizes it in the browser (via width attr).
[14:11:38] T360589: De-fragment thumbnail sizes in mediawiki - https://phabricator.wikimedia.org/T360589
[14:12:00] That means the traffic to upload cluster will increase a bit but also cache hit ratio should go really up
[14:12:21] (and you can heavily throttle non-step values to upload.wikimedia.org)
[14:13:12] Do you have any concerns or notes?
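A minimal sketch of the width snapping described in the 14:11:38 message, assuming the step list quoted there. The helper name `snap_width` is hypothetical and not taken from MediaWiki, and the handling of widths above the largest step is an assumption, since the log does not say what happens in that case.

```python
from bisect import bisect_left

# Step widths as quoted in the 14:11:38 message.
THUMB_STEPS = [20, 40, 60, 120, 250, 330, 500, 960]


def snap_width(requested, steps=THUMB_STEPS):
    """Return the smallest step that is >= the requested width.

    Hypothetical helper for illustration only; not MediaWiki's code.
    """
    if requested <= 0:
        raise ValueError("requested width must be positive")
    # bisect_left finds the first step that is not smaller than `requested`.
    i = bisect_left(steps, requested)
    if i == len(steps):
        # Assumption: widths above the largest step pass through unchanged.
        return requested
    return steps[i]
```

For example, under these assumptions a request for a 100px thumbnail would fetch the 120px rendition from upload.wikimedia.org, and the browser would scale it down via the width attribute.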
[14:13:34] of course I'm going to roll it out gradually to avoid melting thumbor/swift/upload
[14:25:29] Amir1: looks good to me fwiw. vgutierrez, fabfur?
[14:26:34] looks nice!
[14:27:09] Amir1: nice one :D
[14:28:05] fingers crossed I won't see any major issues
[14:28:17] Amir1: whoever is on call will take care
[14:28:23] this is the perfect week to break stuff
[14:28:23] ^
[14:29:56] lol
[14:47:02] Traffic, Maps, SRE: Allow Wikimedia Maps usage on schoolwiki.in - https://phabricator.wikimedia.org/T383210#10575269 (ssingh) Open→Resolved @Gnoeee: This has been rolled out and should now be live. Please feel free to re-open this task if there are any issues. Thank you!
[14:51:36] vgutierrez: thanks - I think we're going to warm some caches first but I'll give a heads-up when we're rolling it out
[14:51:51] hnowlan: ack
[14:54:15] Traffic, Maps, SRE: Allow Wikimedia Maps usage on schoolwiki.in - https://phabricator.wikimedia.org/T383210#10575282 (Ranjithsiji) @ssingh Thank you for doing this. This will be helpfull to schoolwiki. I will check with the server engineer of school wiki to test this. And we will implement this c...
[15:32:40] netops, Infrastructure-Foundations, Patch-For-Review: cr2-esams:interface ae1 present under protocol ospf but not configure - https://phabricator.wikimedia.org/T386766#10575439 (cmooney) p:Triage→Low
[16:31:20] netops, Infrastructure-Foundations: cr2-esams:interface ae1 present under protocol ospf but not configure - https://phabricator.wikimedia.org/T386766#10575784 (Papaul) Open→Resolved Complete
[17:41:46] Traffic, DC-Ops, ops-eqiad: Q3:test NIC for lvs1019 - https://phabricator.wikimedia.org/T387145 (RobH) NEW
[17:41:50] Traffic, DC-Ops, ops-eqiad: Q3:test NIC for lvs1019 - https://phabricator.wikimedia.org/T387145#10576084 (RobH)
[17:53:47] Traffic, DC-Ops, ops-eqiad: Q3:test NIC for lvs1019 - https://phabricator.wikimedia.org/T387145#10576115 (Vgutierrez) per T381118#10436453 it should be lvs1017 or lvs1018, not lvs1019