[09:20:42] wdqs1013 went down again during the weekend
[09:21:15] it's always (or super often) that host that does it
[09:21:22] yes, happens quite regularly sadly
[09:21:49] should we take it out of circulation and investigate?
[09:22:11] 1013 & 1012 yes, I bet it's because, being powerful, they run more queries, increasing the chance of being hit by a deadly one
[09:22:29] ah, I see
[09:22:31] makes sense
[09:22:47] would be nice to confirm it, though
[09:22:58] maybe we should have a different memory configuration for them
[09:23:12] wdqs1012 doesn't die nearly as often
[09:24:09] hmm, actually that might be an interesting dig - which hosts contribute most often to the general instability
[09:25:06] * zpapierski is off to play around with PromQL
[09:25:43] indeed, thanks for looking!
[10:23:50] and the winner is indeed wdqs1013
[10:28:17] https://grafana-rw.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&refresh=1m&forceLogin=true&from=1644748087771&to=1644834487771&viewPanel=36
[10:29:35] it's not exactly what I wanted to do (I want more specific per-host availability, instead of the SLO here), but I'm assuming it follows the same trend
[10:30:18] funnily enough, wdqs1012 is actually the best-performing host (SLO-wise) among the eqiad ones
[10:30:24] wdqs1013 is anything but
[10:30:46] which means the theory of those being the most exposed might not hold
[10:37:42] how do you sort this thing...
[11:10:12] lunch
[11:10:19] lunch
[12:16:02] lunch
[14:14:34] greetings
[14:16:59] o/
[14:57:28] o/
[16:02:10] ejoseph: triage meeting: https://meet.google.com/qho-jyqp-qos
[16:34:19] zpapierski: on thinking about it again, I wonder: did we actually need session return-to? Is w[cd]qs an SPA with only a single URL?
[16:42:15] I think it is, but I might be wrong
[17:11:29] dinner
[18:13:19] o/ dpm
[18:13:50] I don't see anything about boosters in https://ec.europa.eu/info/live-work-travel-eu/coronavirus-response/safe-covid-19-vaccines-europeans/eu-digital-covid-certificate_en - is it saying the EU requires a 2-dose course every 270 days? Currently no one is allowed a fourth shot here afaik
[18:56:54] The quantity of "200kg of sugar" (440 lbs) came up in the Ask a Language Nerd meeting today and someone (from Poland) said they didn't think they'd eaten that much sugar in their life. mpham helpfully suggested I might have (can't argue with that). I looked it up, though, and the average American eats 150 lbs of sugar per year (up from 125 lbs in the 70s and just 2 lbs two hundred years ago). So 200kg is ~3 years for an average American.
[18:56:54] In Poland, it's 40kg/88lbs per year - so 5 years' worth. (Since it's Valentine's Day, I'm working on getting my numbers up!)
[19:04:07] hmm, how do I keep forgetting to turn puppet back on on the wcqs hosts... I always wonder if it does anything, because `sudo enable-puppet foobar` emits nothing
[19:21:02] ahh, the problem is I never realized I'm supposed to provide the same message when disabling and enabling. No wonder I've left it disabled a few times by accident...
[19:21:30] (and it doesn't tell you it refused to turn back on; I'll put up a puppet patch that adds an echo)
[19:25:16] ryankemper: something I just noticed, in eqiad :9243 we have `persistent.cluster.remote.omega.seeds` and `persistent.search.remote.omega.seeds` in the cluster settings, one refers to new hosts and one refers to old.
[19:25:19] I suspect it's still using the old one, since `curl https://search.svc.eqiad.wmnet:9243/omega:sowiktionary_content` fails to find an index, suggesting it's still looking at the old masters
[19:25:51] not entirely sure, but one way or the other only one set of config should exist in the cluster settings :)
[19:27:30] ebernhardson: ah, that might explain some of the issues with the alert firing https://phabricator.wikimedia.org/T301511#7708316
[19:28:06] the check script looks at `cluster|search`, but I didn't think about the other side (the actual setting being wrong)
[19:31:27] ebernhardson: https://github.com/wikimedia/puppet/blob/58392896c66ea8669c3e52d27845be716c2c6c6a/modules/icinga/manifests/monitor/elasticsearch/cirrus_settings_check.pp#L13 here's what I meant about the check looking at either `cluster|search`
[19:35:52] ryankemper: I think the name changed depending on the Elasticsearch version, double checking
[19:36:24] the commit message says something to that effect, https://github.com/wikimedia/puppet/commit/d6bfc99435ef0b024599438565c09c903c69cbca
[19:40:22] ryankemper: I think once that's fixed these should decline; these should all be cross-cluster searches that are failing (fairly invisible to users, they just don't get a sidebar): https://logstash.wikimedia.org/goto/4743bc53e694043b26ff941b70d78c8a
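A minimal sketch (not from the log) of how the duplicated omega seeds could be checked and the stale legacy key cleared, assuming the standard Elasticsearch cluster-settings API and the host/port quoted above:

```bash
# Show persistent settings as flat keys; both cluster.remote.omega.seeds
# (the current name) and search.remote.omega.seeds (the pre-6.5 name)
# should show up if the old config is still hanging around.
curl -s 'https://search.svc.eqiad.wmnet:9243/_cluster/settings?flat_settings=true&pretty'

# Clear the legacy key by setting it to null, leaving only cluster.remote.*.
curl -s -XPUT 'https://search.svc.eqiad.wmnet:9243/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"search.remote.omega.seeds": null}}'

# Cross-cluster lookups through the omega alias should then resolve again.
curl -s 'https://search.svc.eqiad.wmnet:9243/omega:sowiktionary_content'
```

Elasticsearch also accepts wildcard resets (`"search.remote.omega.*": null`) if the single key doesn't clear cleanly.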
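Going back to the enable-puppet gotcha above, a minimal sketch of the disable/enable pair with a matching message, plus a stock-Puppet way to confirm the agent actually came back; the message string is made up, and the disable-puppet wrapper is assumed to take the same argument as enable-puppet:

```bash
# Disable with a reason; the exact same string has to be passed back to
# enable-puppet, otherwise the agent silently stays disabled.
MSG="wcqs investigation - ebernhardson"   # hypothetical message
sudo disable-puppet "$MSG"
# ... manual work ...
sudo enable-puppet "$MSG"

# Neither command prints anything, so check the agent lockfile directly:
# if it still exists, puppet is still disabled.
LOCK=$(sudo puppet agent --configprint agent_disabled_lockfile)
sudo test -e "$LOCK" && echo "puppet still disabled" || echo "puppet enabled"
```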
[20:01:29] ebernhardson: https://meet.google.com/stp-swkd-iho
[20:02:08] omw
[23:26:51] Trey314159: that is a horrifying amount of sugar! Do you think it includes corn syrup?
[23:29:56] I created this (placeholder) ticket to figure out what we need to do to kill ApiFeatureUsage: https://phabricator.wikimedia.org/T301724. There's a question about whether we want to undeploy it or sunset it, and I'm not sure what the difference is or which we want.