[01:34:56] (EdgeTrafficDrop) firing: 64% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[01:35:57] ^ expecte
[01:35:58] d
[01:44:57] (EdgeTrafficDrop) resolved: 69% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[02:26:56] (EdgeTrafficDrop) firing: 51% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[02:56:56] (EdgeTrafficDrop) resolved: 63% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[12:26:39] netops, Data-Engineering, Data-Engineering-Kanban, Infrastructure-Foundations, and 3 others: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (ayounsi) Documentation updated: https://wikitech.wikimedia.org/wiki/Netflow/sflow
[12:40:54] netops, Data-Engineering, Infrastructure-Foundations, Product-Analytics, and 2 others: Maybe restrict domains accessible by webproxy - https://phabricator.wikimedia.org/T300977 (BTullis) It's clear from the above that we have two distinct use cases that have emerged for the web proxies: | # | Na...
[16:15:45] Traffic, netops, Infrastructure-Foundations, SRE, Patch-For-Review: drmrs: initial geodns configuration - https://phabricator.wikimedia.org/T304089 (BBlack) Note - we've made a last-minute change of plans about the timeline of the experiment, and decided to shorten it by one hour. We'll be r...
[20:11:56] Traffic: Resolve issues with cp hosts and the reboot-single cookbook - https://phabricator.wikimedia.org/T305275 (ssingh)
[20:15:17] Traffic: Resolve issues with cp hosts and the reboot-single cookbook - https://phabricator.wikimedia.org/T305275 (ssingh) p:Triage→Medium
[22:11:57] (EdgeTrafficDrop) firing: 59% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=drmrs&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[22:19:28] ^ expected
[22:21:39] Traffic, netops, Infrastructure-Foundations, SRE, Patch-For-Review: drmrs: initial geodns configuration - https://phabricator.wikimedia.org/T304089 (BBlack)
[22:39:58] Traffic, netops, Infrastructure-Foundations, SRE, Patch-For-Review: drmrs: initial geodns configuration - https://phabricator.wikimedia.org/T304089 (BBlack) Test concluded, and esams is re-pooled. More analysis and planning to follow next week I'm sure, but the basic highlights are: * We we...
[22:56:57] (EdgeTrafficDrop) resolved: 69% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=drmrs&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop