[07:56:31] Traffic, SRE: pontoon.traffic.eqiad1.wikimedia.cloud unable to run puppet agent due to certificate mismatch - https://phabricator.wikimedia.org/T310303 (fgiunchedi) I took a look at the puppet master at `pontoon.traffic.eqiad1.wikimedia.cloud` and got puppet to run, however now a self-signed error is sho...
[10:27:04] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Finalise design extension of WMCS networks to new cloudsw in Eqiad rows E/F - https://phabricator.wikimedia.org/T304989 (cmooney) No problem @nskaggs I'm off today but I can put some more verbose instructions together next week and link t...
[14:30:47] Traffic, Librarization, MediaWiki-extensions-CentralNotice, SRE, and 4 others: Split GeoIP into a new component - https://phabricator.wikimedia.org/T102848 (Krinkle)
[14:56:39] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (Andrew) @papaul, do you have interest in working on this more or should I take back the task? I'm thinking we should probably cut o...
[14:57:39] Traffic, Wikimedia-production-error: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (AlexisJazz)
[15:26:56] (HAProxyEdgeTrafficDrop) firing: (3) 26% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:31:16] Traffic, SRE, Wikimedia-Incident: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (TheresNoTime) > Visit any Wikimedia project 2 minutes ago, any page //unable to reproduce currently — time machine broken// ( **/j** )
[15:31:56] (HAProxyEdgeTrafficDrop) resolved: (6) 27% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:34:49] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (Papaul) @andrew agree. I think the same partman recipe can do it by just removing the section below ` # setup the SDB disk with a s...
[15:46:40] Traffic, SRE, Wikimedia-Incident: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (AlexisJazz) >>! In T310368#7995017, @TheresNoTime wrote: >> Visit any Wikimedia project 2 minutes ago, any page > //unable to reproduce currently — time machine broken// ( **/j** ) I tried to repo...
[15:48:13] Traffic, SRE, Wikimedia-Incident: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (fgiunchedi) Open→Resolved a:fgiunchedi There was indeed a brief moment of unavailability (retroactively-posted incident at https://www.wikimediastatus.net/incidents/5k90l09x2p6k) I'm op...
[15:58:14] Traffic, SRE, Wikimedia-Incident: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (AlexisJazz) >>! In T310368#7995052, @fgiunchedi wrote: > There was indeed a brief moment of unavailability (retroactively-posted incident at https://www.wikimediastatus.net/incidents/5k90l09x2p6k) >...
[16:07:41] Traffic, SRE, Wikimedia-Incident: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (fgiunchedi) >>! In T310368#7995061, @AlexisJazz wrote: >>>! In T310368#7995052, @fgiunchedi wrote: >> There was indeed a brief moment of unavailability (retroactively-posted incident at https://www....
[16:38:52] Traffic, SRE, Wikimedia-Incident: 503 Service Unavailable - https://phabricator.wikimedia.org/T310368 (AlexisJazz) >>! In T310368#7995101, @CDanis wrote: >>>! In T310368#7995061, @AlexisJazz wrote: >> That just says "From 14:55 to 15:01 UTC users have been experiencing slow/unavailable access to Wiki...
[18:34:22] Traffic, SRE, envoy, serviceops, Patch-For-Review: Upgrade Envoy to supported version - https://phabricator.wikimedia.org/T300324 (RLazarus)
[18:38:56] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp1089:9331 is unreachable - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[18:39:38] Traffic, DC-Ops, SRE, ops-eqiad: cp1089 memory errors on DIMM_B1 - https://phabricator.wikimedia.org/T310387 (ssingh)