[00:01:29] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by fabfur@cumin1001 for host cp1112.eqiad.wmnet with OS bullseye completed: - cp1112 (**PASS**) - Remo...
[04:06:45] (VarnishHighThreadCount) firing: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5027 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[04:11:45] (VarnishHighThreadCount) resolved: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5027 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[10:44:54] heyhey - I'd like to roll out the last (!!) AQS2 service to ATS https://gerrit.wikimedia.org/r/c/operations/puppet/+/970367
[10:45:14] Internal test: curl -H "Host: wikimedia.org" https://rest-gateway.discovery.wmnet:4113/wikimedia.org/v1/metrics/edits/aggregate/enwiki/all-editor-types/all-page-types/daily/20220101/20220103
[10:45:34] After this I'll clean up the /metrics/ regexes to something simpler
[11:36:28] Hello, we have recently onboarded new servers to the druid public cluster and thus need to add them to the existing druid-public-broker VIP.
[11:36:28] requesting some help/review with this here https://gerrit.wikimedia.org/r/c/operations/puppet/+/962250
[11:53:20] Traffic, netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney) >>! In T348178#9227051, @ayounsi wrote: >> Secondary Link Migration > Looking at link usage, it's fine to drop the seconda...
[13:36:16] hnowlan: the EU folks who typically take care of this are out today, but happy do it tomorrow :)
[13:36:34] stevemunene: what kind of help are you looking for? hth
[13:38:33] hi sukhe I need help with the deployment
[13:50:02] sukhe: ah, cool. thanks!
[13:50:12] stevemunene: ok, I will take a look after the meeting and roll it out. do you want to do it today I assume?
[13:52:45] (HAProxyRestarted) firing: HAProxy server restarted on cp1105:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqiad%20prometheus/ops&var-instance=cp1105&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[13:52:47] (VarnishPrometheusExporterDown) firing: (2) Varnish Exporter on instance cp1113:9331 is unreachable - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/000000304/varnish-dc-stats?viewPanel=17 - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[14:31:04] thanks sukhe, today or tomorrow works
[14:35:35] stevemunene: ok, let's do tomorrow just to be safe. please ping me again and happy to take care of it
[14:35:56] great thanks
[15:25:44] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bookworm
[15:26:22] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bookworm executed with errors: - cp1113 (**FAIL**...
[15:27:59] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bullseye
[15:35:28] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bullseye executed with errors: - cp1113 (**FAIL**...
[15:37:00] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bullseye
[15:59:28] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bullseye executed with errors: - cp1113 (**FAIL**...
[15:59:39] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bullseye
[16:36:40] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1113.eqiad.wmnet with OS bullseye completed: - cp1113 (**PASS**) - Remov...
[16:51:24] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1114.eqiad.wmnet with OS bullseye
[16:51:41] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (BCornwall)
[17:01:52] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1114.eqiad.wmnet with OS bullseye executed with errors: - cp1114 (**FAIL**...
[17:02:13] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1114.eqiad.wmnet with OS bullseye
[17:12:45] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1114.eqiad.wmnet with OS bullseye executed with errors: - cp1114 (**FAIL**...
[17:13:03] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1114.eqiad.wmnet with OS bullseye
[17:30:43] netops, Infrastructure-Foundations, SRE, ops-eqiad: Move asw2-c8-eqiad to spares - https://phabricator.wikimedia.org/T349798 (Jclark-ctr)
[17:50:41] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1114.eqiad.wmnet with OS bullseye completed: - cp1114 (**PASS**) - Remov...
[18:01:57] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (BCornwall)
[18:02:33] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1115.eqiad.wmnet with OS bullseye
[18:17:16] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1115.eqiad.wmnet with OS bullseye executed with errors: - cp1115 (**FAIL**...
[18:17:19] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host cp1115.eqiad.wmnet with OS bullseye
[18:55:53] Traffic, DC-Ops, SRE, ops-eqiad: Q1:Install cp11[00-15] and rotate into production - https://phabricator.wikimedia.org/T349244 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host cp1115.eqiad.wmnet with OS bullseye executed with errors: - cp1115 (**FAIL*...
[20:28:52] netops, Infrastructure-Foundations, SRE, Traffic-Icebox, Patch-For-Review: Create Generalised blocking strategy - https://phabricator.wikimedia.org/T270618 (BCornwall) Thank you for all the work on this ticket and for creating it. I notice that this is a very broad topic and think it would be...
[20:31:41] Traffic, netops, Infrastructure-Foundations, Patch-For-Review, User-jbond: varnish filtering: should we automatically update public_cloud_nets - https://phabricator.wikimedia.org/T270391 (BCornwall)
[22:03:19] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) Open→Resolved a:cmooney Patch is merged everywhere. Looks ok. For instance switch in esams connected to backup LVS now sends traffic to pr...
[22:09:46] Traffic, netops, Infrastructure-Foundations, SRE: Add new codfw private vlan sub-interfaces to lvs2013 and lvs2014 - https://phabricator.wikimedia.org/T348225 (cmooney) a:cmooney
[22:51:50] 805
[23:00:51] Traffic: Track WMF owned non-canonical domains - https://phabricator.wikimedia.org/T247618 (BCornwall) p:Medium→Low
[23:02:58] Traffic, WMF-Legal, Patch-For-Review, Privacy: Add no-transform to Cache-Control header - https://phabricator.wikimedia.org/T218618 (BCornwall) a:BCornwall→None
[23:55:24] Traffic, User-MoritzMuehlenhoff: Investigate Chrony as a replacement for ISC ntpd - https://phabricator.wikimedia.org/T177742 (BCornwall) Invalid→Open p:Medium→Low Looks like the desire to move to chrony is still there as it was recently discussed in traffic's circle