[00:34:57] (VarnishTrafficDrop) firing: 68% GET drop in text@esams during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[00:44:57] (VarnishTrafficDrop) firing: (3) 61% GET drop in text@eqiad during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[00:48:34] Traffic: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (Nirmos)
[00:48:40] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (Ivi104) can confirm, all projects, all languages.
[00:49:16] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (JJMC89)
[00:49:29] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (Urbanecm) This is known and actively being investigated. Please stand by.
[00:49:57] (VarnishTrafficDrop) firing: (3) 68% GET drop in text@eqiad during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[00:50:31] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (Sakura_emad) Confirm on Meta, ckb, en [https://isup.me/]
[00:50:39] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (Tol) From T291312, some other errors I have experienced: * `upstream connect error or disconnect/reset before headers. reset reason: connection failure` * `upstream connect error or disconnect/reset before header...
[00:51:24] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (MSG17) Some stats (thanks TNT on the unofficial Wikimedia Discord: https://grafana.wikimedia.org/d/000000479/frontend-traffic?orgId=1&from=1631915084782&to=1631925884782&var-site=All&var-cache_type=text&var-cache...
[00:51:27] Traffic, SRE: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (Urbanecm) p:Triage→Unbreak!
[00:52:09] Traffic, SRE, Wikimedia-Incident: Wikimedia sites down 18 Sept 2021 - https://phabricator.wikimedia.org/T291311 (AntiCompositeNumber)
[00:52:30] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (Peachey88)
[00:55:06] (VarnishTrafficDrop) resolved: (3) 68% GET drop in text@eqiad during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[01:00:49] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (Tol) Enwiki, wikidata, and meta are all loading normally for me now (frontend & API).
[01:06:41] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (Thryduulf) en.wp (at least) was agonisingly slow shortly before it all went down, and even after en.wp came back up I wasn't able to allow Oauth access to login here.
[01:10:19] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (Ivi104) >>! In T291311#7363415, @Thryduulf wrote: > en.wp (at least) was agonisingly slow shortly before it all went down, and even after en.wp came back up I wasn't able to allow Oauth acce...
[01:11:26] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (LD) Can confirm : fr.wp went slow around 02:10 AM CEST, came back up around 03:00 AM CEST. No abnormal filtered detections from fr.wp's AbuseFilter.
[01:47:43] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (RLazarus) Open→Resolved a:RLazarus This should be fully resolved as of about 1:10 UTC, sorry for the trouble and thanks for all the reports. Because the root cause was a DoS vec...
[02:07:13] Traffic, SRE, MW-1.35-notes (1.35.0-wmf.40; 2020-07-07), Patch-For-Review, and 2 others: Harmonise the identification of requests across our stack - https://phabricator.wikimedia.org/T201409 (Krinkle)
[02:07:39] Traffic, SRE, MW-1.35-notes (1.35.0-wmf.40; 2020-07-07), Patch-For-Review, and 2 others: Harmonise the identification of requests across our stack - https://phabricator.wikimedia.org/T201409 (Krinkle)
[05:31:57] (VarnishTrafficDrop) firing: 36% GET drop in text@codfw during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[05:36:57] (VarnishTrafficDrop) resolved: 51% GET drop in text@codfw during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[09:27:25] Traffic, SRE, Wikimedia-Incident: 2021-09-18 Wikimedia sites down - https://phabricator.wikimedia.org/T291311 (Nehaoua) since 23:00 UTC i can't Save any modification and always this messages, i check abusefilter = 0 result {F34646785} {F34646789}
[16:36:57] (VarnishTrafficDrop) firing: 67% GET drop in text@codfw during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org
[16:41:57] (VarnishTrafficDrop) resolved: 68% GET drop in text@codfw during the past 30 minutes - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org