[08:25:08] we might want to re-shard ruwikinews, 60GB for the content shard, ~70GB for the general one
[08:43:06] not sure I understand why I can't see all the indices in elasticsearch_indices_store_size_bytes_primary{}, we might filter them in explicitly but I can't find where
[11:02:01] lunch
[12:57:25] greetings
[13:15:55] dcausse gehel this is probably nothing, but if y'all want to look over this and let me know if this is a real concern or not: https://phabricator.wikimedia.org/T313726
[13:20:52] o/
[13:20:55] inflatador: sure looking
[13:22:27] inflatador: it's a known issue, all security features are currently non-free so we can't install them. But once we're on opensearch we can revisit this
[13:23:49] dcausse was going to check if nginx would help, pretty sure it can restrict methods
[13:24:30] https://nginx.org/en/docs/http/ngx_http_core_module.html#limit_except
[13:25:01] inflatador: yes it could and that's what we do for cloudelastic but unfortunately we use non-GET requests for some read operations IIRC
[13:26:16] I'm sure we could do better than the default Allow-all we have atm tho
[13:26:43] and that could be done at the ferm level perhaps?
[13:27:48] we want to keep things readable from everywhere, so that has to be at the HTTP layer (I don't think that ferm talks HTTP). nginx is a better place to add such a restriction.
[13:28:07] There's an open question on whether we need more restrictive access or not...
[13:28:30] and if we go for more restrictions, we should really have authentication, which is a whole other story
[13:29:28] I don't really want to get into auth, just make sure we're aware of everywhere we allow certain methods
[13:31:04] if we are, and we don't need to make any changes, that's OK too
[13:31:39] inflatador: I think we should do something but it might be better to do it with the opensearch security features
[13:41:00] dcausse ACK, will also talk it over with e-bernhardson and ryan-kemper once they get in
[13:47:19] dropping off my son, back in ~15
[14:00:42] back
[15:01:36] triage is starting: https://meet.google.com/eki-rafx-cxi
[15:02:24] ejoseph: ^
[15:03:32] i'll pause cindy for the moment so it doesn't throw anyone off
[15:57:36] wow, it really is running slow through ruwikinews.
[16:01:23] wikidatawiki and commonswiki mostly completed, looks like commonswiki_file on eqiad has to be retried, but the rest finished
[16:02:21] also looks like i missed setting the doc type in archive indices :( at least those are fast to reindex everywhere
[16:04:03] ruwikinews is only managing 5k docs per 30s poll, i suppose we don't usually notice how much slower indexing into giant shards is since we usually have quite a few shards
[16:05:59] i'm going to kill the ruwikinews reindexes and intend to try again after merging the increased shard count
[16:08:40] scheduled patch for utc late backport window
[17:15:08] lunch, back in ~1h
[17:18:57] dropped the custom logging levels from cluster state in all clusters from eqiad and codfw, logstash doesn't seem to be yelling. Suggests the deprecation cleanup work so far was a success
[17:20:56] (only the org.elasticsearch.deprecation ones)
[17:21:33] \o/
[17:40:27] dcausse: follow up to magnus's free solo with alex honnold: https://www.youtube.com/watch?v=9eFFouLvEOI
[18:18:05] mpham: thanks!
will watch this tonight :)
[18:26:51] back
[18:36:43] hmm, found one more deprecation coming out of wikidata, only seems to trigger when sorting mode is set to random (since we put the whole query into filter context)
[18:37:17] also wonder if this user really intended to use sort mode of random, because they also sent pagination queries (which are ignored with random sort)
[19:02:14] do we use the nodes info API at all? https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-info.html . I'm seeking a way to get the most truthy list of nodes currently in the cluster
[19:02:55] it's all there, mainly just wondering if anyone knows any fancy ways to filter my request so I only get a few fields back
[19:02:58] inflatador: usually i use /_cat/nodes for that kind of info, i don't think we've used the nodes api much but it should give the same information with more flexibility and more available information
[19:04:10] ebernhardson thanks, _cat/nodes is probably enough info, also i missed the obvious examples listed on that page ;(
[19:04:30] inflatador: i think the only upstream filters are the set of metrics available, but haven't used it enough to say. I would probably pipe it through jq (or process in python) if fancier things are needed
[19:05:01] jq can do fancy things as well, but i suspect it's a write-once kind of language where editing an existing complicated jq parse is difficult :)
[19:05:25] yeah, that's exactly what I'm trying to avoid ;)
[19:19:56] lunch
[19:57:23] back
[20:59:21] hmm, this mediawiki-core repo was cloned without all the details necessary to `git pull --ff-only`, but i don't really want to delete the directory structure (has extensions, skins, etc.).
Wonder if i can clone into a new repo and move the .git over...let's find out :)
[21:20:49] turns out, yes you can (make sure you checkout the same commit in the "new" repo before moving over .git)
[21:45:13] oh nice, I was kinda curious
[21:45:25] * inflatador is sure I'll need to do that at some random time in the future
[22:29:03] heh, we closed the CirrusSearchIndexTooOld phab ticket this morning, so alertmanager helpfully reopened it for the one index that failed :P It's working on trying again, should be finished tonight hopefully
[22:29:19] or not really reopened, but opened a new one
[23:21:02] * ebernhardson now realizes part of the reason codfw is ahead of eqiad in reindexing is because it failed ~10 wikis in a row. oh well starting those back up
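
Note on the 19:02 question about getting only a few fields back for the node list: a minimal Python sketch of the "_cat/nodes, then process in python" route mentioned in the reply. It assumes the `h` (column list) and `format=json` query parameters of `/_cat/nodes`. The base URL, the column selection, and the helper names `cat_nodes_url`, `node_names` and `fetch_node_names` are hypothetical and not taken from this channel's tooling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def cat_nodes_url(base_url, columns=("name", "ip", "node.role", "master")):
    """Build a /_cat/nodes URL that asks for JSON output and only the given columns."""
    query = urlencode({"format": "json", "h": ",".join(columns)})
    return f"{base_url.rstrip('/')}/_cat/nodes?{query}"


def node_names(cat_nodes_rows):
    """Return the sorted node names from a parsed /_cat/nodes JSON response."""
    return sorted(row["name"] for row in cat_nodes_rows)


def fetch_node_names(base_url):
    """Query the cluster (assumed reachable at base_url) and return its node names."""
    with urlopen(cat_nodes_url(base_url)) as resp:
        return node_names(json.load(resp))
```

Usage would look like `fetch_node_names("http://localhost:9200")`, with the host being a placeholder. Keeping the URL building and row parsing in separate small functions avoids the write-once jq problem raised at 19:05, since each piece can be edited and checked without a live cluster.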