[00:35:23] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1080.eqiad.wmnet with OS bullseye
[01:31:41] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1080.eqiad.wmnet with OS bullseye executed with errors: - cp1080 (**FAIL**) - Removed from Pu...
[01:38:33] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp1081.eqiad.wmnet with OS bullseye
[02:23:11] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1080.eqiad.wmnet with OS bullseye
[02:27:33] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp1081.eqiad.wmnet with OS bullseye completed: - cp1081 (**PASS**) - Downtimed on Icinga/Alertm...
[02:28:45] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[03:09:52] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1080.eqiad.wmnet with OS bullseye completed: - cp1080 (**PASS**) - Removed from Puppet and Pu...
[03:20:49] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[03:21:48] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1083.eqiad.wmnet with OS bullseye
[03:22:04] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1082.eqiad.wmnet with OS bullseye
[04:11:15] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1082.eqiad.wmnet with OS bullseye completed: - cp1082 (**PASS**) - Downtimed on Icinga/Alertm...
[04:11:43] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1083.eqiad.wmnet with OS bullseye completed: - cp1083 (**PASS**) - Downtimed on Icinga/Alertm...
[04:25:05] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[04:25:35] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1085.eqiad.wmnet with OS bullseye
[04:25:38] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1084.eqiad.wmnet with OS bullseye
[04:47:16] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp1084:9331 is unreachable - TODO - TODO - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[05:13:22] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1085.eqiad.wmnet with OS bullseye completed: - cp1085 (**PASS**) - Downtimed on Icinga/Alertm...
[05:15:24] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1084.eqiad.wmnet with OS bullseye completed: - cp1084 (**WARN**) - Downtimed on Icinga/Alertm...
[05:16:58] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[07:45:28] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (ayounsi) Install servers are being migrated to Bullseye in {T327867} so even though the observed issue is probably not related, it would be better to...
[08:01:18] Traffic, netops, DBA, Data-Engineering-Planning, and 11 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (Marostegui)
[08:30:44] netops, Infrastructure-Foundations, SRE, serviceops: Optimize k8s same row traffic flows - https://phabricator.wikimedia.org/T328523 (ayounsi) >> However using Calico's numAllowedLocalASNumbers config knob will be needed, as all the nodes from a given cluster use the same AS#. > > You could als...
[08:34:14] netops, Infrastructure-Foundations, SRE: Allow managing drmrs DHCP settings with Homer - https://phabricator.wikimedia.org/T328737 (MoritzMuehlenhoff)
[08:47:16] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp1084:9331 is unreachable - TODO - TODO - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[09:37:16] (VarnishPrometheusExporterDown) resolved: Varnish Exporter on instance cp1084:9331 is unreachable - TODO - TODO - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[10:42:26] netops, Infrastructure-Foundations, SRE: Improve Homer output when Juniper device rejects config - https://phabricator.wikimedia.org/T328747 (cmooney) p:Triage→Low
[10:56:17] Traffic, netops, DBA, Data-Engineering-Planning, and 11 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (MoritzMuehlenhoff)
[11:07:18] Traffic, SRE, Patch-For-Review: Add DP cookie for pageview filtering - https://phabricator.wikimedia.org/T315676 (Vgutierrez) @jcross @Htriedman we had some issues after merging the Differential Privacy CR this morning and I reverted it shortly after. https://gerrit.wikimedia.org/r/c/operations/puppe...
[11:41:43] netops, Infrastructure-Foundations, SRE: Improve Homer output when Juniper device rejects config - https://phabricator.wikimedia.org/T328747 (cmooney)
[11:41:53] netops, Infrastructure-Foundations, SRE, serviceops: Calico and BFD - https://phabricator.wikimedia.org/T328338 (ayounsi)
[12:03:42] netops, Infrastructure-Foundations, SRE, serviceops: Calico and BFD - https://phabricator.wikimedia.org/T328338 (cmooney) > Unfortunately, as mentioned in https://blog.ipspace.net/2021/09/graceful-restart.html "BGP Graceful Restart (RFC 4724) looks like it’s been designed by cowboys" as there is...
[13:05:45] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp1087.eqiad.wmnet with OS bullseye
[13:11:50] Traffic, Prod-Kubernetes, PyBal, SRE, serviceops: Proposal: simplify set up of a new load-balanced service on kubernetes - https://phabricator.wikimedia.org/T238909 (akosiaris) Open→Declined I am gonna tentatively set this as `declined`. The Service IPs announcement path led to nowh...
[13:15:15] netops, Infrastructure-Foundations, SRE, serviceops: Calico and BFD - https://phabricator.wikimedia.org/T328338 (ayounsi) Open→Resolved a:ayounsi After a discussion with @akosiaris the initial BFD need was for an Anycast experiment and as explained in T238909#8585199 this is not in sc...
[13:15:23] Traffic, Prod-Kubernetes, PyBal, SRE, serviceops: Proposal: simplify set up of a new load-balanced service on kubernetes - https://phabricator.wikimedia.org/T238909 (ayounsi)
[13:52:01] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp1087.eqiad.wmnet with OS bullseye completed: - cp1087 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[13:55:17] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[16:47:28] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1089.eqiad.wmnet with OS bullseye
[16:47:42] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1086.eqiad.wmnet with OS bullseye
[17:34:29] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1086.eqiad.wmnet with OS bullseye completed: - cp1086 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[17:35:05] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[17:35:34] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1088.eqiad.wmnet with OS bullseye
[17:36:48] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1089.eqiad.wmnet with OS bullseye completed: - cp1089 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[17:39:38] HTTPS, Traffic, SRE, Tracking-Neverending: HTTPS Plans (tracking / high-level info) - https://phabricator.wikimedia.org/T104681 (BCornwall)
[17:39:45] HTTPS, SRE, Traffic-Icebox: Enable HSTS on store.wikimedia.org for HTTPS - https://phabricator.wikimedia.org/T128559 (BCornwall) Open→Stalled a:BCornwall Wow, seven years! Hello to those still around. :) Who is in charge of shopify/store.wikimedia.org nowadays? It would be nice if we cou...
[17:39:51] HTTPS, SRE, Traffic-Icebox: Enable HSTS on store.wikimedia.org for HTTPS - https://phabricator.wikimedia.org/T128559 (BCornwall) p:Medium→Low
[17:40:19] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[17:53:34] HTTPS, SRE, Traffic-Icebox: Enable HSTS on store.wikimedia.org for HTTPS - https://phabricator.wikimedia.org/T128559 (Dzahn) >>! In T128559#8586016, @BCornwall wrote: > Wow, seven years! Hello to those still around. :) > > Who is in charge of shopify/store.wikimedia.org nowadays? It would be nice if...
[18:00:22] HTTPS, SRE, Traffic-Icebox: Enable HSTS on store.wikimedia.org for HTTPS - https://phabricator.wikimedia.org/T128559 (Dzahn) @BCornwall found this "contact: Merchandise@wikimedia.org (response platform), Khansen-ctr@wikimedia.org (Store associate), or Shust@wikimedia.org (Store Manager) " on https:...
[18:19:26] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1088.eqiad.wmnet with OS bullseye completed: - cp1088 (**WARN**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[18:23:42] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[18:23:59] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp1090.eqiad.wmnet with OS bullseye
[18:55:59] HTTPS, SRE, Traffic-Icebox: Enable HSTS on store.wikimedia.org for HTTPS - https://phabricator.wikimedia.org/T128559 (SHust) Update: We expect to have more info mid-next week. Thanks, everyone for your patience!
[19:10:57] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp1090.eqiad.wmnet with OS bullseye completed: - cp1090 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[19:44:59] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[19:53:15] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney) I tried to enable the CR uplinks from the new cloudsw but there is a bit of a snag. The CR do...
[20:30:42] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[21:15:38] Varnish, Continuous-Integration-Infrastructure, SRE, Traffic-Icebox, Patch-For-Review: Make CI run Varnish VCL tests - https://phabricator.wikimedia.org/T128188 (BCornwall) Open→Stalled Hi, @hashar! It's been quite a while but is there still any intention to add the CI integration?
[21:36:59] Domains, SRE, Traffic-Icebox: Register wiki(m|p)edia.ro - https://phabricator.wikimedia.org/T222080 (CRoslof) a:BCornwall→CRoslof Thanks for flagging these domain names. I'll make sure they're in our queue to review for trademark enforcement. We should be able to use the [[ https://en.wikiped...
[22:15:02] Traffic, netops, DBA, Data-Engineering-Planning, and 11 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (colewhite)
[23:16:13] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (BCornwall) @ssingh, we can agree that this was a NIC issue, yeah? If so, this can be marked as resolved since upgrading the NIC firmware allowed us t...