[07:10:06] o/
[08:42:11] o/
[09:54:22] Hi folks - on this https://phabricator.wikimedia.org/T401590 ... I went poking around in `event.mediawiki_cirrussearch_request` and it looks like about 1/3 of relevant searches in May 2025 are fulltext searches, and 85% of those are via Special:Search.
[09:54:22] If we make the changes in the ticket it will affect a LOT of searches, and instrumentation via MediaSearch is going to be mostly irrelevant. Any thoughts? Do we still have a search satisfaction score that we could compare before/after?
[09:55:35] here's my "relevant searches" query fwiw
[09:55:47] https://www.irccloud.com/pastebin/0Sw38L34/
[10:37:01] cormacparle: looking, I think we have satisfaction but not a dashboard that allows filtering on commons
[10:37:40] actually we do
[10:37:51] should be "Search Metrics, Web" in superset
[10:38:39] looking at the default ns filter on Special:Search to see if this includes category by default
[10:40:30] yes, by default it searches for Main, File, Help, Category, Creator, Institution so yes it'll likely affect Special:Search quite a lot, but hopefully for the good?
[10:41:54] lunch
[11:50:55] hmmm ok so basically we could make the change and make sure that fulltext search abandonment for Commons doesn't spike?
[11:54:31] hmm looks like it's not exactly stable anyway - it went from ~32% to ~43% yesterday
[12:12:50] cormacparle: I think this is due to a data collection we had here (T400834), the drop on jul 17 corresponds to the bug, need to check when the fix was deployed but possibly yesterday with the train to group1
[12:12:51] T400834: SearchSatisfaction event msToDisplayResults validation error 2025-07 - https://phabricator.wikimedia.org/T400834
[12:13:01] *data collection issue
[12:13:39] so yes I'd be for trying it and seeing the impact in this dashboard
[12:16:18] yes T400834 is marked for mw 1.45.0-wmf14 which was deployed yesterday on commons I think
[12:26:32] ok cool - I guess let's wait til next week to allow the graph to settle down and then I'll make the config change and deploy it and we can keep an eye on the result
[12:30:32] sounds good
[13:15:28] ooh I just remembered I'm on holiday next week
[13:15:35] maybe the week after
[13:24:20] cormacparle: no worries, we can certainly take care of this if you want :)
[13:40:04] \o
[13:44:49] o/
[13:45:03] hmm, all reindexers i started up last night are having backfiller problems :S On the upside, this now cleanly picks up any reindexes that are in progress, even if they finish and just hang around in k8s state
[13:45:08] on the downside, something doesn't work :P
[13:46:10] i should just finish removing threads and change it to POST_VERIFY -> PENDING_BACKFILL -> BACKFILLING -> FINISHED
[13:46:55] i wonder if maybe somehow it has multiple backfill threads running...
[13:47:40] yea, from the way logs are printing, i have competing backfill threads in the same process :S
[13:49:30] separately, i wonder if the backfill release should be more like mw-script, with a hash in the name. The problem would be if i track the reindex pod in the state, there is nothing unique about it
[13:50:18] if it picks up the reindex pod on startup there wouldn't be many guarantees that it's the same reindex pod we were expecting
[13:56:59] but i guess i kinda liked the limit that we couldn't accidentally deploy 18 separate backfill pods at once :P
[14:10:04] a unique tag could help to bail out saying that a backfill is in a "stale" state or that someone else is running something I suppose, does not necessarily have to be in the release name tho?
[14:10:29] hmm, yea i suppose. if we can just inject a unique annotation like the mwscript comment that would work
[14:15:59] we don't have a keyword to filter on page edit timestamp, could be useful to add I think, growth is working on a new feature to flag tone problems in pages but they don't want to surface pages that have been edited recently in case they're just WIP
[14:16:53] context is T392283#11077823
[14:16:53] T392283: Q1 FY2025-26 Goal: Apply the Tone Check model to published articles, to learn whether we can build a pool of high-quality structured tasks for new editors - https://phabricator.wikimedia.org/T392283
[14:17:12] hmm, yea we should be able to support that
[14:19:27] extracted page_ids from our SUP error stream filtering on what seems to be "hard to parse pages", over 5 months I captured 461 unique page ids, I would have guessed much more
[14:19:42] interesting, that is indeed much less than i would have thought
[14:28:41] https://logstash.wikimedia.org/goto/98ef62b78a2292bd41b13257bfbd4562 does this link work for y'all? it should be showing log messages from today@14:12-14:20 UTC
[14:29:11] inflatador_: yup
[14:29:45] i wonder what we need to do to get the ECS fields
[14:30:16] ebernhardson I'm toying around with some changes now, context is the replies from c-white in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1178613
[14:31:56] yup, it's the same issue from when i got it going before. It has info, but not all the standard fields. It's not a biggie, but i use the `host` field a decent bit for filtering (ex: host: cirrussearch*)
[14:32:05] we could use type:opensearch i guess
[14:33:42] I'll try to get it working at least as well as it was before we migrated (fingers crossed)
[14:54:57] staff mtg or retro?
[14:55:06] I'm going to staff mtg
[14:55:25] Looks like log4j has some ECS bits built-in, not sure if it works w/our version https://logging.apache.org/log4j/2.x/manual/json-template-layout.html
[14:55:28] have a quick question for the retro then staff meeting?
[14:56:04] cc ebernhardson, pfischer, Trey314159 ^
[14:56:07] sure
[14:57:12] inflatador_: if i had to guess, i would look into the OpenSearchJsonLayout that we set as appender.ship_to_logstash.layout.type, that controls the output formatting and fields. It might be that there is some ECS variant of that class, or an ECS flag or something
[14:57:22] or maybe opensearch is bitter and doesn't want ECS since it has elastic in the name :P
[14:59:19] taking a quick look in the class, can specify extra fields but nothing mentioning ECS. And it's the only Layout class in that dir (org.opensearch.common.logging)
[14:59:29] yes... surprising...
[15:00:38] for flink we download co.elastic.logging ecs-logging-core
[15:01:07] ahh, maybe we would need to do the same to provide the "real" ecs layout
[15:01:39] pfischer: retro for 5min (https://meet.google.com/eki-rafx-cxi?authuser=0)?
[15:17:18] I think we might need to do that...it also looks like the log4j templated JSON thing is its own jar https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j-layout-template-json/2.17.2 ;(
[15:20:11] alternate methods include using rsyslogd to inject ECS fields, but i don't know we want to get into that
[15:24:21] yes, sounds fragile :/
[15:38:06] it also looks like puppetdb (which is also a java application) is using a `liblogstash-logback-encoder-java` package and there's a logback.xml (which we use too https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/query_service/templates/logback.xml.erb ). Could that be a piece of the puzzle?
[15:39:05] it's a bit awkward, logback is like a log4j successor
[15:40:18] i'm not 100% sure, but i think logback would require opensearch to be using logback, but it uses log4j
[15:40:53] i think we want the ecs-logging-java / ecs-logging-core david mentioned about
[15:40:55] above
[15:41:20] ACK...so would we just add that to our `wmf-opensearch-search-plugins` plugins package?
[15:42:54] i'm not sure :S It's not really a plugin, but the package could simply put the file in place anyways
[15:43:37] Ah, good point. We could also roll a separate package if that's more appropriate
[15:43:38] i would almost be tempted to make it a dedicated package, should be easy enough to re-use the ci bits from the plugins. but maybe even more debian packages is overkill
[15:44:27] I'd go with a separate pkg, that way other opensearch roles could use it
[15:45:26] yea makes sense
[15:45:55] i'd have to look around for exact commands, but there should be a maven command that will go out and fetch the appropriate library to be included
[15:49:09] this command works locally: JAVA_HOME=/usr/lib/jvm/default-java mvn org.apache.maven.plugins:maven-dependency-plugin:3.6.1:copy -Dartifact=co.elastic.logging:log4j2-ecs-layout:RELEASE -DoutputDirectory=$PWD -DdestFileName=log4j2-ecs-layout.jar
[15:49:33] yeah I saw it in the flink image...gerrit is unhappy or I'd link it
[15:51:08] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/production-images/+/ec3785527a1dada3e022a57a50c43da269afe82d/images/flink/flink/maven-download.sh
[15:51:52] ahh, that's better since it does the checksum and signature checks
[15:58:36] * ebernhardson wonders, while watching the demo, if we should be using a larger query size. We were always caching 3 for related articles, but it could just as easily always do 6 or 8 and just trim to 3 for related articles
[15:58:58] it's really not more expensive to emit a couple more results from the same query
[16:00:22] OK, created T401933 for the ECS logging stuff. Feel free to change the description or add any comments
[16:00:23] T401933: OpenSearch: emit ECS-compatible logs - https://phabricator.wikimedia.org/T401933
[16:01:08] workout, back in ~40
[16:49:53] back
[17:23:39] dinner
[17:57:45] lunch, back in ~45
[18:44:08] back
[19:42:37] break, back in ~20
[21:58:33] been back, but heading out for the day now
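
(note on the 15:58 idea of caching a larger result set) A rough sketch of "always fetch 6 or 8, trim to 3 for related articles", only to make the shape of it concrete. Everything here is made up for illustration: `ResultCache`, `fake_search`, `FETCH_SIZE` and `RELATED_LIMIT` are hypothetical names, not CirrusSearch or RelatedArticles code, and the in-memory dict just stands in for whatever cache is really used.

```python
from typing import Callable, Dict, List

FETCH_SIZE = 8      # hypothetical: how many results we always ask the backend for
RELATED_LIMIT = 3   # related articles only ever show 3


class ResultCache:
    """Cache one larger result list per query and trim it per consumer."""

    def __init__(self, search: Callable[[str, int], List[str]],
                 fetch_size: int = FETCH_SIZE) -> None:
        self._search = search
        self._fetch_size = fetch_size
        self._cache: Dict[str, List[str]] = {}
        self.backend_calls = 0  # counts real queries, to show the cache is shared

    def get(self, query: str, limit: int) -> List[str]:
        if limit > self._fetch_size:
            raise ValueError("limit is larger than the cached result size")
        if query not in self._cache:
            # One backend query at the larger size, whatever the caller asked for.
            self.backend_calls += 1
            self._cache[query] = self._search(query, self._fetch_size)
        # Trim the cached list down to what this consumer needs.
        return self._cache[query][:limit]


def fake_search(query: str, size: int) -> List[str]:
    """Stand-in backend that returns `size` fake titles for a query."""
    return [f"{query}-{i}" for i in range(size)]


cache = ResultCache(fake_search)
related = cache.get("Cat", RELATED_LIMIT)  # related articles: 3 results
wider = cache.get("Cat", 6)                # another consumer: 6, same cached query
```

With this shape both consumers share a single backend query, which is the point made at 15:58:58 about a few extra results being cheap.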