[03:49:00] something seems very wrong with the search on wikipedia. please have a look at https://phabricator.wikimedia.org/T393663 as soon as you can.
[08:20:12] search flowing to opensearch@codfw
[08:38:03] lesson learned, it's not good to leave completion index updates disabled for too long :/
[09:31:25] wdqs users are starting to notice the impact of a change we made to the rdf output ( https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#SPARQL_query_for_family_name_(P734)_of_Adam_(Q70899)_returns_both_%22no_value%22_and_%22unknown_value%22 )
[09:31:42] dcausse: any follow up on the completion issues? or are we good
[09:31:50] gehel: we're good
[09:32:37] I'll write a small script to reconcile affected wdqs items, but it might be good to schedule T386098 in the next couple of weeks
[09:32:38] T386098: Run a full data-reload on wdqs-main, wdqs-scholarly and wdqs to capture new blank node labels - https://phabricator.wikimedia.org/T386098
[09:40:53] err... the completion index problem might not be solely related to updates being disabled...
[09:41:14] eqiad has only 5000886, codfw 10336191
[09:43:43] could be a bad run and then we stopped updating... but leaving a partial index live is not good :/
[09:44:11] will file a task to investigate what happened
[09:48:28] sigh... not finding logs of the past eqiad run on mwmaint1002...
[09:51:39] we had weird behaviors in the past with scrolls in mixed clusters (https://github.com/elastic/elasticsearch/issues/25158), perhaps something similar happened and a bunch of pages got skipped?
[09:59:30] perhaps T363521?
[09:59:31] T363521: Completion suggester can promote a bad build - https://phabricator.wikimedia.org/T363521
[10:07:17] only seeing https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-mediawiki-1-7.0.0-1-2025.05.07?id=OJUeq5YBfOjk-Vo1yy77 but that should have stopped the script and not promoted the index
[10:21:33] lunch
[10:30:45] lunch 2
[13:17:31] o/
[13:54:36] Created T393709 to talk about hosting autocomplete indices somewhere else
[13:54:37] T393709: Consider hosting autocomplete indices in a separate OpenSearch cluster - https://phabricator.wikimedia.org/T393709
[14:06:25] \o
[14:07:10] o/
[14:10:54] .o/
[14:19:45] CR for fixing up conftool after it changes: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1143589
[14:20:14] err... with the changes for cirrussearch eqiad, that is
[14:39:00] trying to use pyspark as a quick&dirty script to extract a couple of lines to stdout... not really great, passing the script as stdin but it's still in a kind of interactive mode
[14:40:48] I'm not seeing our `cirrus_check_settings.json` files in `/etc/opensearch/production-${n}` anymore. Checked 4 hosts so far. Wonder if we goofed up https://gerrit.wikimedia.org/r/c/operations/puppet/+/1140519 somehow?
[14:41:48] oh wait, nm. That file's only on the masters
[14:44:59] realized that I can just run spark3-submit script.py...
[14:54:07] should probably use pyarrow and open the parquet files directly for these simple data extractions...
[15:05:00] yea spark3-submit is the ticket, but indeed parquet can be even easier, although sometimes HADOOP_CLASSPATH needs to be set for it
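For the pyarrow route mentioned at [14:54:07], a minimal sketch might look like the following; the file path and column names are hypothetical placeholders, not the actual dataset layout.

```python
# Quick data extraction from a parquet file without spinning up a Spark
# session. The path and column names below are hypothetical.
import pyarrow.parquet as pq

# Read only the columns we care about.
table = pq.read_table("hdfs_export/data.parquet", columns=["page_id", "title"])

# Dump the first few rows to stdout.
for row in table.slice(0, 10).to_pylist():
    print(row)
```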
[15:49:06] I finally got tired of typing `curl xyz` over and over again to get node status/health and made a few crappy bash functions. This is just a starting point, if y'all think of a better way to do this LMK. PRs welcome! https://gitlab.wikimedia.org/repos/search-platform/searchme#
[15:51:00] pondering https://gerrit.wikimedia.org/r/c/operations/puppet/+/1142693 ... it seems the intent is that we should have a discovery-dns record for each cluster? so search.discovery.wmnet, search-chi.discovery.wmnet, etc.
[15:55:33] yes.. port is not a thing apparently?
[15:56:27] i think joe is saying port is a thing, but discovery-dns should be per-cluster. If they are different clusters, they should use different names
[15:56:43] sure
[16:07:17] workout, back in ~40
[16:54:01] back
[16:58:53] re: the envoy patch, I think setting up multiple discovery records is a good idea as well. I'll get a ticket started for that
[16:59:27] inflatador: i made a patch already, the docs are pretty short and suggest this is all that's needed: https://gerrit.wikimedia.org/r/c/operations/dns/+/1143617
[17:00:11] {◕ ◡ ◕}
[17:01:07] * inflatador wonders if this means we'll need new SAN names
[17:01:57] sadly, probably yes. more names
[17:02:53] Yup, confirmed. No problem, CFSSL makes it easy
[17:05:35] i waffled, but in the end added search-chi as well, it seems like ideally we should move away from the unprefixed name
[17:13:56] Yeah, agreed
[17:35:15] 3k completion/s, 1.2k fulltext/s, not sure we can sustain that :)
[17:36:06] yeah I wish the vatican could have waited for the opensearch upgrade to finish up
[17:45:33] :P
[17:46:13] Looks like things are on the down-trend. Interestingly, we never got the hot spot problems on individual hosts like we've been seeing lately in eqiad
[17:55:41] dinner
[17:59:25] how odd... https://commons.wikimedia.org/wiki/Special:MediaSearch?search=deepcategory%3A%22Manufacturing+by+product%22&type=image works, but https://commons.wikimedia.org/w/index.php?search=deepcategory%3A%22Manufacturing+by+product%22&title=Special:MediaSearch&type=image does not
[18:08:06] hmm, it's not the url that matters, it can fail on the same url. I suspect it's failing and the warning isn't being forwarded to the user
[18:13:10] lunch, back in ~40
[20:30:39] inflatador: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1143670 patch to shift over some wdqs-full hosts to wdqs-main
[20:31:03] 👀
[21:53:10] ryankemper here's my handoff, I'm heading out for the day: https://etherpad.wikimedia.org/p/handoff-wdqs-T388134
[21:53:10] T388134: Drop support for the full Wikidata graph from query.wikidata.org - https://phabricator.wikimedia.org/T388134
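For the node status/health helpers described at [15:49:06], a rough illustration of the idea in Python; this is not the actual searchme code, and the host and port defaults are assumptions.

```python
# A stand-in for repeatedly typing `curl xyz`: fetch cluster health from an
# OpenSearch node. Not the searchme repo's code; host/port are assumptions.
import json
import urllib.request

def cluster_health(host: str = "localhost", port: int = 9200) -> dict:
    """Return the /_cluster/health document as a dict."""
    url = f"http://{host}:{port}/_cluster/health"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(json.dumps(cluster_health(), indent=2))
```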