[06:48:55] o/
[07:13:38] o/
[08:27:10] cindy ran out of space, will try to clean up
[08:43:50] sigh... panic: EOF from gitlab.wikimedia.org/repos/releng/cli/internal/config.LoadFromDisk ...
[08:58:37] had to update mwcli
[08:59:09] cleaned up all docker artifacts and process accounting log
[09:05:43] going to drop the cirrus-integ03 instance
[09:47:28] errand+lunch
[13:06:21] o/
[13:23:33] \o
[13:23:56] sounds like cindy needs some extra daily cleanup crons or something?
[13:25:17] o/
[13:25:48] ebernhardson: possibly? the majority was I think docker left-overs
[13:26:31] the process accounting log (it was several gigs) should definitely be logrotated or something
[13:26:44] or disabled if we don't need it
[13:26:55] hmm, we probably don't need the process accounting indeed
[13:29:26] ack, will try to disable this
[13:36:32] surprisingly cron is not installed...
[13:37:21] * inflatador_ is using irccloud as a bouncer now
[13:54:50] Hi folks - I created https://phabricator.wikimedia.org/T401590 just now. Your thoughts would be welcome!
[13:57:07] is mjolnir-kafka-bulk-daemon.service safe to restart on the search-loader hosts?
[13:57:30] brouberol: yea, it will just not ack anything in-flight with kafka and do it again after restarting
[13:57:49] I'm in the process of decommissioning the kafka-jumbo1007->9 hosts, and their config has been updated. They need to be restarted to stop connecting to these brokers
[13:58:30] cormacparle: changing weights shouldn't be a big problem, do you want to run a test to see the results of the change?
[13:58:46] their = the mjolnir-kafka-bulk-daemon services, I mean
[13:59:32] brouberol: you can see in grafana if it's doing anything (https://grafana.wikimedia.org/d/000000591/elasticsearch-mjolnir-bulk-updates?orgId=1&from=now-7d&to=now&timezone=utc), but even if it's doing something it's safe to restart
[13:59:46] thanks, that's good to know!
[14:00:42] for the most part it gets urls from a kafka topic, pipes the content of the url into elasticsearch, then acks the url in kafka. If it restarts it just doesn't ack and imports it again
[14:00:49] s/elasticsearch/opensearch/
[14:03:44] understood
[14:32:40] inflatador: I haven’t yet finished the last data xfers, will kick them off in an hour or so
[14:34:58] ryankemper np, been focused on other stuff myself ;)
[14:57:23] ebernhardson: maybe? The ask from the community here is pretty nebulous, and I don't really know how to define "success", except for "if there's a category that matches the search term then show it"
[14:59:59] cormacparle: i'm guessing we would be looking at interaction rates with MediaSearch? Perhaps % of queries w/clicks, click position, scroll (if the pagination triggered). But not sure it would move enough, or how that is all instrumented
[15:05:55] inflatador_: meeting?
[15:06:42] Trey314159 oops, brt
[15:40:58] ebernhardson: hmmm ok ... so basically making sure we don't have any unintended side-effects. Will have a look at the instrumentation ...
[16:02:21] cormacparle: yea, i mean we can ship things without testing them in that way. Perhaps instead we run a simpler offline test that shows stats about result set changes, or we can just wing it
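A rough sketch of the kind of "simpler offline test" mentioned above: compare top-10 result sets for a handful of queries with and without the Category namespace included, and report how often the results change. This uses the public list=search fulltext API on Commons as a stand-in rather than MediaSearch's actual profile, and the sample queries, namespace choices, and overlap metric are all illustrative assumptions, not the team's actual plan.

```python
# Hypothetical offline comparison: does adding the Category namespace (14) to the
# File namespace (6) change the top-10 results, and by how much? Uses the standard
# MediaWiki fulltext search API as a stand-in for MediaSearch's own ranking profile.
import requests

API = "https://commons.wikimedia.org/w/api.php"
NS_FILE = "6"            # File: namespace
NS_FILE_AND_CAT = "6|14"  # File: plus Category: namespaces
SAMPLE_QUERIES = ["sunset", "steam locomotive", "van gogh"]  # made-up sample


def top_titles(query: str, namespaces: str, limit: int = 10) -> list[str]:
    """Return the top-N page titles for a fulltext search restricted to the given namespaces."""
    resp = requests.get(API, params={
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srnamespace": namespaces,
        "srlimit": limit,
        "format": "json",
    }, timeout=30)
    resp.raise_for_status()
    return [hit["title"] for hit in resp.json()["query"]["search"]]


changed = 0
for q in SAMPLE_QUERIES:
    baseline = top_titles(q, NS_FILE)
    candidate = top_titles(q, NS_FILE_AND_CAT)
    overlap = len(set(baseline) & set(candidate)) / max(len(baseline), 1)
    if baseline != candidate:
        changed += 1
    print(f"{q!r}: top-10 overlap {overlap:.0%}")

print(f"{changed}/{len(SAMPLE_QUERIES)} queries had a different top-10")
```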
[16:02:46] cormacparle: there is also the question of whether CirrusSearchNamespaceWeights gets used in MediaSearch, i don't see 'weight' anywhere in the php code, but it might still be getting it from a cirrus component
[16:03:08] break, back in ~15
[16:06:18] * ebernhardson is trying to decide how exactly the mwscript/kubectl/etc abstraction should go in the new setup... somehow not obvious
[16:16:00] ebernhardson: dcausse: When using Airflow connection IDs via macro, the tests that render the DAG complain that the connection ID is not defined. Is there a way to mock them? Obviously no other DAG tries to obtain the connection object and read data from it…
[16:17:44] pfischer: conftest.py is where mocks are set up
[16:18:32] seems like we mock the "datahub_kafka_jumbo" connection, we could perhaps add kafka-main connections too
[16:43:21] ebernhardson: mediasearch falls back to fulltext search unless we're searching only in NS_FILE
[16:44:29] Is there an easy way to see the proportion of searches that aren't on NS_FILE? And if it's small then we could justify just winging it?
[16:45:26] we're getting some saneitizer alerts for eqiad and cloudelastic, do we need to take a look at these?
[16:48:14] cormacparle: hmm, it's queryable from hadoop but not necessarily easily. Also there is some hadoop issue happening this morning
[16:48:46] the event.mediawiki_cirrussearch_request events should have the info
[16:49:40] ok cool, I'll have a look tomorrow to give hadoop a chance to settle down
[17:16:29] hmm, maybe it's easier since it's a self-contained state machine with no side effects, but claude.ai did a pretty decent job of turning my updated state machine into a test suite. tbh it's probably more comprehensive than the test i wrote for the previous impl
[17:16:49] i still have to read the whole thing and make sure it makes sense though :P But it did find a bug in my retry handling (off-by-1)
[17:17:30] and it covers all the cases i had in the test for the previous impl, plus a variety more
[17:49:27] dinner
[18:05:34] lunch, back in ~40
[19:25:50] sorry, been back awhile
[19:56:23] * ebernhardson sighs... patch is far too big. +1030,-695. And not done yet
[19:56:35] somehow i thought it was getting simpler, but the extra 400 lines of code suggests maybe not :P
[19:56:47] there are probably still bits remaining to rip out though
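On the Airflow connection-mocking question from the 16:16-16:18 exchange above: one generic way to keep DAG-rendering tests happy is to expose dummy connections through Airflow's AIRFLOW_CONN_<CONN_ID> environment variables from conftest.py. The sketch below is only an illustration under that assumption; the connection IDs and URIs are made up, and the repo's existing conftest.py (which already mocks "datahub_kafka_jumbo") may use a different mechanism.

```python
# conftest.py (sketch, not the repo's actual fixture setup).
# Airflow resolves connections from environment variables named AIRFLOW_CONN_<CONN_ID>
# (upper-cased), so exporting dummy URIs is enough for tests that only render the DAG
# and never actually talk to a broker. The IDs and URIs below are assumptions.
import os

import pytest

DUMMY_CONNECTIONS = {
    "kafka_main_eqiad": "kafka://kafka-main-placeholder.eqiad:9092",
    "kafka_main_codfw": "kafka://kafka-main-placeholder.codfw:9092",
}


@pytest.fixture(autouse=True, scope="session")
def mock_airflow_connections():
    """Expose dummy Airflow connections for every test in the session."""
    for conn_id, uri in DUMMY_CONNECTIONS.items():
        os.environ[f"AIRFLOW_CONN_{conn_id.upper()}"] = uri
    yield
    # Clean up so the dummy connections don't leak outside the test session.
    for conn_id in DUMMY_CONNECTIONS:
        os.environ.pop(f"AIRFLOW_CONN_{conn_id.upper()}", None)
```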