[02:56:13] Traffic, DC-Ops, SRE, ops-ulsfo: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH) The elevation google sheet has been updated with 11 of the 16 new cp hosts wired up. We couldn't wire up the last 5 due to msws only being 24 port (oversight by me in planning t...
[09:42:08] Traffic, SRE, Upstream: Wikipedia on flow with no http request, still responds with a Bad Request 400 - https://phabricator.wikimedia.org/T323263 (Vgutierrez) Stalled→In progress We were missing one config option in our HAProxy setup: `option http-ignore-probes`, after enabling it, HAProxy be...
[10:03:44] Traffic, SRE, Patch-For-Review, Upstream: Wikipedia on flow with no http request, still responds with a Bad Request 400 - https://phabricator.wikimedia.org/T323263 (Vgutierrez) In progress→Resolved a:Vgutierrez fix has been merged and it's being deployed, it should be available fleet...
[10:19:28] XioNoX, topranks do we have insights on what's going on between eqsin and codfw? https://grafana.wikimedia.org/goto/jEGNrIOVk?orgId=1 we got some performance regression there during the last 11 days and ongoing
[10:20:17] vgutierrez: I guess related to https://phabricator.wikimedia.org/T322529
[10:20:26] and created congestion on the telia link
[10:21:09] I'll send an email to Arelion/telia
[10:27:14] ouch
[10:27:18] thanks for the update XioNoX <3
[10:33:59] Traffic, SRE, serviceops, Patch-For-Review: _etcd-client SRV record missing for conftool cluster - https://phabricator.wikimedia.org/T320397 (Vgutierrez) ping?
[10:44:48] Traffic, Data Pipelines, Data-Engineering-Planning, Foundational Technology Requests, and 2 others: Add a webrequest sampled topic and ingest into druid/turnilo - https://phabricator.wikimedia.org/T314981 (elukey) After a few unsuccessful tries (due to me+Friday combination), me and Filippo rolle...
[10:47:47] Traffic, MW-on-K8s, SRE, serviceops, and 3 others: Stop spamming SAL with helmfile on scap deployments - https://phabricator.wikimedia.org/T323296 (Clement_Goubert) Merge request on `scap` to pass the `SUPPRESS_SAL` variable https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/26
[10:50:02] vgutierrez: what's the impact?
[10:50:13] should I mention it in my email to Arelion?
[10:56:17] https://grafana.wikimedia.org/goto/7zyF3Sd4z?orgId=1
[10:56:51] the ramp up on TTFB on 2022-11-13 matches the one in the latency graph
[10:57:22] it's also noticeable at p50.. from ~210ms to ~280ms
[10:58:06] it's also noticeable on cache misses as expected, https://grafana.wikimedia.org/goto/PkuhqSO4k?orgId=1
[10:58:20] vgutierrez: email sent
[10:58:26] thx
[10:58:31] it's great to be able to link our dashboards to the vendors :)
[10:58:37] indeed
[11:10:44] I'm also glad the full mesh icmp probing is useful!
[11:14:58] Traffic, SRE: Improve handling/logging of HAproxy emergency log messages - https://phabricator.wikimedia.org/T306236 (Vgutierrez) In progress→Resolved a:Vgutierrez
[11:18:25] Traffic, SRE: Rename role::cache::(text|upload)_haproxy to role::cache::(text|upload) - https://phabricator.wikimedia.org/T323365 (Vgutierrez)
[11:19:01] Traffic, SRE: Rename role::cache::(text|upload)_haproxy to role::cache::(text|upload) - https://phabricator.wikimedia.org/T323365 (Vgutierrez) p:Triage→Medium
[11:26:37] FYI, starting sunset of apple-search
[11:36:30] claime: ack
[11:37:13] godog: prometheus::class_config::class_parameters needs to be exhaustive or could I list just one parameter that tells between instances?
[11:37:43] vgutierrez: IIRC the latter, one parameter to tell classes apart should be enough
[11:37:49] nice!
[11:37:51] there should be examples around too
[11:49:46] godog: hmm so I need to match profile::cache::haproxy::extra_certificates
[11:49:54] for upload is pretty easy as it's undef
[11:50:16] for text is a PITA
[11:50:19] https://www.irccloud.com/pastebin/clSLuQFB/
[11:50:56] I guess I need to dump it there :_)
[11:51:45] I'm about to switch apple-search to service_setup and remove it from the pools, I'm supposed to then "Ask on #wikimedia-traffic which are the backup LVS server for the LVS class of your service on both datacentres"
[11:51:46] So I am
[11:51:49] :)
[11:52:06] both DCs meaning eqiad & codfw, right?
[11:52:12] yep
[11:52:30] Well that's what I infer from https://wikitech.wikimedia.org/wiki/LVS#Remove_a_load_balanced_service
[11:52:34] so lvs1020 and lvs2010 claime
[11:52:45] ack
[13:16:00] vgutierrez: I think I'm missing context, what's the original problem ?
[13:18:15] cleaning up duplicated roles
[13:18:53] Just addressed it by targeting profile::cache::haproxy
[13:19:27] I'll rollback to role::cache::text|upload after completing the cleanup
[13:19:38] *nod* makes sense
[13:20:06] in the past we've also used dummy/placeholder classes for the same purpose
[13:20:18] not pretty but good enough™
[16:06:15] Traffic, DC-Ops, SRE, ops-ulsfo: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ssingh)
[16:07:43] Traffic, DC-Ops, SRE, ops-ulsfo: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ssingh)
[19:11:16] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[19:12:33] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[19:20:01] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[19:21:20] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[19:28:53] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[19:32:29] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[20:00:24] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[20:01:03] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[20:09:46] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[20:22:08] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[20:57:05] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp5017.eqsin.wmnet with OS buster
[22:06:06] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp5017.eqsin.wmnet with OS buster completed: - cp5017 (**PASS**) -...
[23:28:07] Traffic, SRE: strip non session cookies before cache lookup in ATS - https://phabricator.wikimedia.org/T316338 (Krinkle)