[08:44:47] brett: thank you, I'll take a look
[08:54:16] https://gerrit.wikimedia.org/r/c/operations/alerts/+/866264/
[09:23:21] Hello we shall be proceeding with the varnishkafka certs T323771. Disabling puppet on all cp hosts for a while, kindly let me know if there's any issue
[09:23:21] T323771: Update varnishkafka client certificate for authenticating to kafka-jumbo - https://phabricator.wikimedia.org/T323771
[09:47:29] Hi, Heads up, today the train goes to group2 which changes default thumb size in search (basically in everything, mobile, Special:Search, new vector) and might increase cache misses in upload, if anything goes terribly there, ping me. It should actually improve cache hits afterwards (numbers to come later)
[10:21:12] Re enabling puppet on cp hosts T323771
[10:21:12] T323771: Update varnishkafka client certificate for authenticating to kafka-jumbo - https://phabricator.wikimedia.org/T323771
[10:35:57] !log batch restarting varnishkafka-eventlogging.service in batches of 3 30 seconds in between
[10:35:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:56:46] !log batch restarting varnishkafka-statsv.service in batches of 3 30 seconds in between T323771
[10:56:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:56:50] T323771: Update varnishkafka client certificate for authenticating to kafka-jumbo - https://phabricator.wikimedia.org/T323771
[11:10:18] !log batch restarting varnishkafka-webrequest.service in batches of 3 30 seconds in between T323771
[11:10:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:10:22] T323771: Update varnishkafka client certificate for authenticating to kafka-jumbo - https://phabricator.wikimedia.org/T323771
[12:08:04] Traffic, DC-Ops, SRE, ops-eqsin: ganeti500[567] implementation tracking for serviceops - https://phabricator.wikimedia.org/T324610 (MoritzMuehlenhoff) p:Triage→Medium
[15:28:00] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: `ganeti5002.eqsin.wmnet` - ganeti5002.eqsin.wmnet (**WARN**) - Downti...
[16:34:10] Traffic, SRE: Replace edge cache conftool entries 'varnish-fe' and 'ats-tls' with singular 'cdn' - https://phabricator.wikimedia.org/T324336 (BBlack) Open→Resolved a:BBlack This is completed now. AFAIK all relevant scripts/automations/etc were updated to match. The conftool `service` keys f...
[17:31:21] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[18:56:35] (PurgedHighEventLag) firing: High event process lag with purged on cp5018:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5018 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[19:01:35] (PurgedHighEventLag) firing: (10) High event process lag with purged on cp5018:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[19:06:35] (PurgedHighEventLag) firing: (14) High event process lag with purged on cp5018:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[19:11:35] (PurgedHighEventLag) resolved: (9) High event process lag with purged on cp5018:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag