[06:18:56] (EdgeTrafficDrop) firing: 57% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org
[06:23:56] (EdgeTrafficDrop) resolved: 63% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org
[06:55:56] (EdgeTrafficDrop) firing: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org
[07:05:56] (EdgeTrafficDrop) resolved: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org
[10:22:44] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: `netflow2001.codfw.wmnet` - netflow2001.codfw.wmnet (**FAIL**) -...
[10:36:46] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: `netflow4001.ulsfo.wmnet` - netflow4001.ulsfo.wmnet (**PASS**) -...
[12:01:34] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: `netflow2001.codfw.wmnet` - netflow2001.codfw.wmnet (**PASS**) -...
[12:24:11] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: `netflow1001.eqiad.wmnet` - netflow1001.eqiad.wmnet (**PASS**) -...
[12:44:53] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: `netflow3001.esams.wmnet` - netflow3001.esams.wmnet (**PASS**) -...
[12:57:56] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: `netflow5001.eqsin.wmnet` - netflow5001.eqsin.wmnet (**PASS**) -...
[13:56:13] netops, Infrastructure-Foundations, SRE, ops-codfw, cloud-services-team (Kanban): connect 2nd cloudcontrol200x-dev NIC to vlan 2105 - https://phabricator.wikimedia.org/T297588 (Papaul) @aborrero are we doing trunk so i can assign this task to netops?
[14:02:54] Traffic, SRE, Patch-For-Review: Test envoyproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T271421 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1001 for host cp4025.ulsfo.wmnet with OS buster
[14:09:56] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp4025:9331 is unreachable - https://alerts.wikimedia.org
[14:11:24] ^^host being reimaged
[14:11:45] volans: we got a warning on cp3064, WARNING: Puppet has 1 failures. Last run 8 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[renew certificate - debmonitor__cp3064_esams_wmnet], is that expected / known?
[14:14:16] Traffic: Frequent backend server errors (503), happened several times in the last 2 days - https://phabricator.wikimedia.org/T297544 (Marostegui)
[14:15:51] netops, Data-Engineering, Infrastructure-Foundations, SRE, Traffic-Icebox: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (ayounsi)
[14:15:58] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade netflow VMs to Bullseye - https://phabricator.wikimedia.org/T297595 (ayounsi) Open→Resolved a:ayounsi All done!
[14:19:31] netops, Data-Engineering, Infrastructure-Foundations, SRE, Traffic-Icebox: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (ayounsi) Tests are successful: I tested it by configuring sflow on the non-yet-prod asw1-b12-drmrs switch: `lang=diff [edit protoc...
[14:19:56] (VarnishPrometheusExporterDown) resolved: Varnish Exporter on instance cp4025:9331 is unreachable - https://alerts.wikimedia.org
[14:50:56] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp4025:9331 is unreachable - https://alerts.wikimedia.org
[14:55:56] (VarnishPrometheusExporterDown) resolved: Varnish Exporter on instance cp4025:9331 is unreachable - https://alerts.wikimedia.org
[14:56:45] Traffic, Browser-Support-Firefox: Firefox: Referrer Policy: Less restricted policies, including ‘no-referrer-when-downgrade’, ‘origin-when-cross-origin’ and ‘unsafe-url’, will be ignored soon for the cross-site request - https://phabricator.wikimedia.org/T293109 (MatthewVernon)
[15:09:37] Traffic, SRE, Patch-For-Review: Test envoyproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T271421 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1001 for host cp4025.ulsfo.wmnet with OS buster completed: - cp4025 (**FAIL*...
[15:09:42] Traffic, SRE, Patch-For-Review: Test envoyproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T271421 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1001 for host cp4025.ulsfo.wmnet with OS buster executed with errors: - cp40...
[15:14:35] Traffic: Image requests sending neither "Last-Modified" nor "ETag" HTTP headers. - https://phabricator.wikimedia.org/T295556 (MatthewVernon)
[15:23:51] vgutierrez: sorry was in a meeting
[15:23:52] looking
[15:23:59] np
[15:24:04] it seems to be autofixed
[15:24:12] but I would 301 that to jbond
[15:24:17] as debmonitor uses cfssl
[15:27:48] bblack: quick question that you might be able to help me with (and partially this may just be due to my own lack of puppet knowledge)
[15:28:02] I'm trying to work out how the vlan sub-interfaces get set up on the LVS hosts.
[15:28:14] I can see the data is here: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/hieradata/common/lvs/interfaces.yaml
[15:28:30] And it's processed by this: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/manifests/lvs/tagged_interface.pp
[15:28:48] But I'm missing the link that transforms that to the contents of /etc/network/interfaces on the host
[15:28:50] heh, I'm just asking that at the same time on -sre
[15:31:52] netops, Infrastructure-Foundations, SRE-tools, serviceops: Support services VIPs with not marked as VIP in Netbox - https://phabricator.wikimedia.org/T295793 (MatthewVernon)
[15:45:44] Traffic: Upgrade pybal-test200[23] from Stretch to Buster - https://phabricator.wikimedia.org/T297187 (MatthewVernon)
[17:24:32] netops, Data-Engineering, Infrastructure-Foundations, SRE, Traffic-Icebox: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (JAllemandou) Am I right in assuming that this data has the same schema as the original `netflow`?
[17:54:32] netops, Infrastructure-Foundations, SRE, ops-codfw, cloud-services-team (Kanban): connect 2nd cloudcontrol200x-dev NIC to vlan 2105 - https://phabricator.wikimedia.org/T297588 (aborrero) Open→Stalled Yes, we will be doing trunk. Thanks @Papaul I think we're fine here from DCops side f...
[17:54:54] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): cloud: decide on general idea for having cloud-dedicated hardware provide service in the cloud realm & the internet - https://phabricator.wikimedia.org/T296411 (aborrero)
[17:55:57] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): cloud: decide on general idea for having cloud-dedicated hardware provide service in the cloud realm & the internet - https://phabricator.wikimedia.org/T296411 (aborrero) Open→Stalled We just re-shifted team priorities...
[17:58:15] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (aborrero) Open→Stalled FYI network details for these servers are blocked on {T296411}, which is in turn stalled, so marking th...
[21:04:37] netops, Infrastructure-Foundations, SRE, ops-codfw, cloud-services-team (Kanban): connect 2nd cloudcontrol200x-dev NIC to vlan 2105 - https://phabricator.wikimedia.org/T297588 (Papaul) a:Papaul→None
[22:06:00] Traffic: Image requests sending neither "Last-Modified" nor "ETag" HTTP headers. - https://phabricator.wikimedia.org/T295556 (Ade56facc) Title of ticket is wrong, it should be: **Image responses without "Last-Modified" and "ETag" HTTP headers.**