[04:43:40] Traffic, Performance-Team, SRE, SRE-swift-storage, Patch-For-Review: Automatically clean up unused thumbnails in Swift - https://phabricator.wikimedia.org/T211661 (ori) @fgiunchedi and I spoke about this today. Some notes: #### Work queue When Swift receives an object with an expiration, the...
[08:48:56] (HAProxyEdgeTrafficDrop) firing: 51% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[08:53:56] (HAProxyEdgeTrafficDrop) resolved: 55% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[09:22:56] (HAProxyEdgeTrafficDrop) firing: 67% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[09:27:56] (HAProxyEdgeTrafficDrop) resolved: 67% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[13:35:50] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): More public IPs for codfw1dev - https://phabricator.wikimedia.org/T313977 (cmooney) @Andrew
I'm reluctant to allocate more space for WMCS in Codfw, when there is a /29 already allocated and not being used. So I've routed 185....
[14:01:23] Traffic, SRE, Patch-For-Review: per-backend-service concurrency limits in ATS-BE - https://phabricator.wikimedia.org/T306223 (CDanis)
[14:01:29] Traffic, SRE, Patch-For-Review: Package and deploy ATS 9.1.2 - https://phabricator.wikimedia.org/T309651 (CDanis)
[14:02:24] Traffic, SRE, Patch-For-Review: per-backend-service concurrency limits in ATS-BE - https://phabricator.wikimedia.org/T306223 (CDanis) Awaiting {T309651} to continue testing
[14:12:16] vgutierrez: thanks again!
[15:14:16] (VarnishTrafficDrop) firing: Varnish traffic in esams has dropped 69.17728928666305% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[15:14:56] (HAProxyEdgeTrafficDrop) firing: 69% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:15:10] ori: thank you for choosing Traffic CDN services ;P
[15:19:16] (VarnishTrafficDrop) resolved: Varnish traffic in esams has dropped 68.31469483843544% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[15:19:56] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:54:35] (PurgedHighBacklogQueue) firing: Large backlog queue for purged on cp4026:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=ulsfo%20prometheus/ops&var-instance=cp4026 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[15:56:06] ^ expected
[16:08:35] (PurgedHighEventLag) firing: High event process lag with purged on cp4026:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=ulsfo%20prometheus/ops&var-instance=cp4026 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[16:09:35] (PurgedHighBacklogQueue) resolved: (2) Large backlog queue for purged on cp4026:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=ulsfo%20prometheus/ops&var-instance=cp4026 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[16:13:35] (PurgedHighEventLag) resolved: (2) High event process lag with purged on cp4026:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=ulsfo%20prometheus/ops&var-instance=cp4026 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[16:24:35] (PurgedHighBacklogQueue) firing: Large backlog queue for purged on cp4026:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=ulsfo%20prometheus/ops&var-instance=cp4026 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[16:39:35] (PurgedHighBacklogQueue) resolved: (2) Large backlog queue for purged on cp4026:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=ulsfo%20prometheus/ops&var-instance=cp4026 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[16:42:01] netops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: upgrade row C and D uplinks from 4x10G to 1x40G - https://phabricator.wikimedia.org/T313463 (Cmjohnson) a:Jclark-ctr
[19:40:05] vgutierrez: The manifests for pontoon are compiling now, so you should be good to go - got cptext and cpupload to run the puppet agent
[19:41:40] Awesome
[19:42:02] It also has its own puppetmaster right?
[19:50:48] yep
[19:51:07] traffic-pontoon.traffic.eqiad1.wikimedia.cloud
[19:51:37] The pontoon-traffic-generaltesting branch is on there and ready to roll
[19:52:01] s/branch/origin/
[20:16:34] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): More public IPs for codfw1dev - https://phabricator.wikimedia.org/T313977 (Andrew) Open→Resolved This works! ` +----------------------+--------------------------------------+ | Field | Value...
[21:03:54] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): More public IPs for codfw1dev - https://phabricator.wikimedia.org/T313977 (Andrew) Resolved→Open These IPs are reachable from within codfw1dev but not from the greater Internet. @cmooney is that what you'd expect? It's...