[07:46:22] was about to merge https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/804446/ but stopped when I saw that cindy had not voted and the comment on the old cindy vm
[07:47:25] keeping it around in case it's useful for testing, tried connecting to the new cindy host but I'm not sure I understand the logs well (can't spot where the tests are run yet)
[09:48:27] lunch
[10:26:57] lunch
[12:12:40] errand
[12:42:37] dcausse: I updated https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests/2 now with blubber’s copies and file:/ URIs for schema repo/config. Could you have another look, please?
[12:46:13] pfischer: sure!
[12:48:23] pfischer: when you have a couple minutes I have 2 trivial patches https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/908565 and https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/908839 for review
[12:57:09] ottomata: I'll be 5-10m late to the flink working session
[12:58:34] o/
[13:25:22] o/
[13:25:32] dcausse: sure.
[14:00:16] dcausse: ~5m late to pairing
[14:08:48] pfischer (and all): as we discussed a more European-friendly unmeeting, I've scheduled one for next week. I've also invited a few people from outside the team who have expressed interest.
[14:31:43] @team: could you please fill in the standup notes early today? I'll still have to summarize them this evening, as tomorrow is a holiday - https://etherpad.wikimedia.org/p/search-standup
[14:32:17] And take that as a reminder that you should NOT be working tomorrow!
[14:39:14] really?
[14:40:47] \o
[14:40:53] dcausse: yea, tomorrow is earth day
[14:41:00] o/
[14:41:15] cool :)
[14:41:32] also, it seems my cindy rework died in the middle of the night: `Bot password creation failed. Does this appid already exist for the user perhaps?`.
That's not supposed to happen when we delete the databases between runs :P
[14:42:16] ah ok, that's why I was not seeing anything related to the tests
[14:43:11] also it's not voting right now, i set 000 permissions for ~/.ssh/id_rsa which prevents the final connection from going through
[14:43:40] i haven't decided where the code goes... for now i have it in https://gitlab.wikimedia.org/ebernhardson/cirrus-integration-test-runner/-/tree/work/ebernhardson/initial-commit?ref_type=heads
[14:44:57] under /repos/search-platform/?
[14:45:19] i wanted to try getting CI in gitlab to build and run the test suite too, but it doesn't look like i can do docker-in-docker easily (addshore has it somehow for mwcli, but i think he's using custom cloud runners he maintains)
[14:45:24] yea i suppose
[14:46:37] yes perhaps using the digitalocean gitlab runners?
[14:46:48] ebernhardson: indeed, and I wrote it up in a markdown file for easy setup
[14:48:07] https://gitlab.wikimedia.org/repos/releng/cli/-/blob/main/CI.md
[14:48:19] https://wikitech.wikimedia.org/wiki/GitLab/Gitlab_Runner/Cloud_Runners
[14:48:30] but that'd mean triggering gitlab CI from gerrit
[14:49:24] i wasn't necessarily trying to run the gerrit checks in gitlab ci, but to have a ci for this repo that builds and runs the tests too to verify they work
[14:49:42] ah ok
[14:49:56] i suppose doing it in gitlab would be interesting though, maybe it would be possible to convert the barrybot.py script from running the thing to hitting some gitlab api url
[14:50:22] addshore: thanks, doesn't look too bad.
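The non-voting issue above comes down to file permissions: ssh will not use a private key it cannot read. A minimal sketch of the failure mode and the usual fix, run against a throwaway file rather than the real ~/.ssh/id_rsa:

```shell
# Demonstrate on a temp file; do not chmod a real key you still need.
key=$(mktemp)
chmod 000 "$key"     # the mode that currently blocks cindy's final connection
stat -c '%a' "$key"  # prints 0 -- unreadable, so ssh skips the key
chmod 600 "$key"     # standard owner-only mode that ssh accepts
stat -c '%a' "$key"  # prints 600
rm -f "$key"
```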
[14:51:32] dcausse: i dunno if you'll have time, but i'd be curious whether the ./create-env bit of that will work for cirrussearch development, this is essentially an extension of what i had hacked together for my own dev needs
[14:52:12] sure, will launch it
[14:52:38] note that you need a special version of mwcli, with https://gitlab.wikimedia.org/repos/releng/cli/-/merge_requests/401
[14:52:56] also was not sure if you wanted me to merge https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/804446 or still keep it for debugging?
[14:52:58] oh i guess that's the old MR
[14:53:10] https://gitlab.wikimedia.org/repos/releng/cli/-/merge_requests/402 is the new one
[14:53:18] ok
[14:53:41] dcausse: hmm, i think we can merge the cirrus patch but there might still be some changes. I've spent most of yesterday re-running individual features that seem to fail intermittently and trying to reduce them
[14:54:28] Africa test still fails intermittently and i've no idea why...
[14:54:35] ok shipping it then
[14:54:48] is it hitting the local env?
[14:55:16] Africa is the some test I thought it was enabled only for beta
[14:55:21] yea, but it gets the search results page instead of landing directly on the África page
[14:55:21] s/some/smoke
[14:55:51] it's in the smoke so it should run both for the beta tests and the all-features test
[14:55:57] ok
[14:56:27] patch up for WDQS/bullseye experimentation if anyone has time to look: https://gerrit.wikimedia.org/r/c/operations/puppet/+/910507/
[14:56:54] school pick-up back in 10min
[15:27:14] it's funny to see Makefiles again :)
[15:27:46] but when I see them I want to type make all but looks like that's no longer a standard :)
[15:28:54] inflatador: done
[15:29:21] I've never had fun reading Makefiles ;)
[15:31:01] gehel: cool, thanks
[15:31:41] re: makefiles, one of my friends really likes "just" as an alternative https://github.com/casey/just
[15:37:28] ebernhardson: quickly reading the script before running it I see MW="./mw --no-interaction" in env
[15:37:56] I see no ./mw in the root folder
[15:38:11] dcausse: yea from the readme you need to fetch a version of the mwcli; because it's not merged yet you'll need to download and compile the go project: https://gitlab.wikimedia.org/repos/releng/cli/-/merge_requests/402
[15:38:21] dcausse: or you can copy from cirrus-integ03 the one i compiled
[15:38:45] I built it locally, it made an image lemme see if I can find the bin
[15:39:28] the --no-interaction part just avoids it asking any questions
[15:42:13] have 3 warnings so far: WARNING: The NETWORK_SUBNET_PREFIX variable is not set. Defaulting to a blank string., WARNING: The PORT variable is not set. Defaulting to a blank string. and ERROR: The Compose file '/home/nomoa/.config/mwcli/mwdd/default/base.yml' is invalid because:
[15:42:21] networks.dps.ipam.config.subnet is invalid: should use the CIDR format
[15:42:54] it's cloning stuff now
[15:42:57] hmm, i think those warnings are normal for mwcli's first run.
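The CIDR error above is consistent with an unset NETWORK_SUBNET_PREFIX being interpolated into the subnet value. An illustrative fragment, not the actual base.yml (the network name `dps` and the variable name are taken from the log; everything else is assumed):

```yaml
networks:
  dps:
    ipam:
      config:
        # With NETWORK_SUBNET_PREFIX unset this expands to ".0/24",
        # which is not valid CIDR notation -- hence the compose error.
        - subnet: "${NETWORK_SUBNET_PREFIX}.0/24"
        # A compose-style default would avoid the bad expansion, e.g.:
        # - subnet: "${NETWORK_SUBNET_PREFIX:-172.20.0}.0/24"
```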
It has a bit of an odd startup, if i try and set env vars before doing anything it fails
[15:43:08] so i first do a `stop`, but that causes those warnings
[15:43:17] but running the stop makes it initialize enough things that we can set env vars
[15:43:38] it's all a bit fragile still sadly :(
[15:44:23] it did not fail tho it's still running
[15:44:33] doing composer stuff now
[15:45:01] once running it should be available on http://cirrustestwiki.mediawiki.mwdd.localhost:8080/
[15:47:20] ebernhardson: it worked!
[15:48:36] it even has wiki wow :)
[15:49:39] wikidata I mean
[15:54:31] nice! and yes it has a super bare wikidata install, i haven't fully gotten the phpunit test suite running here yet though. The hope was to be able to run WBCS tests against that one
[15:54:42] i mean it runs, but there are some fails i haven't looked into
[15:56:32] yes even on my mw-docker setup some wbcs tests are failing (language related iirc)
[15:56:46] how do I see what images are being used?
[15:57:17] the only special images are listed in the `env` file, our elastic image and a specific version of fresh to run the test suite
[15:57:47] otherwise it's the default ones from mwcli, `docker ps` should give their names
[15:58:45] changing ELASTICSEARCH_IMAGE requires running create-env again?
[15:59:04] yes, cindy's process is to checkout the code and ./create-env for each patch
[15:59:10] ok
[15:59:42] so we could even control what version of the image we want to use from CirrusSearch
[15:59:56] from the CirrusSearch codebase I mean
[16:00:00] yea, we would whitelist some images or some such and select from the commit message
[16:00:04] or branch name or something
[16:00:39] def way more flexible than vagrant :)
[16:00:43] i ripped a bunch of functionality out of barrybot.py btw, it should be down to about the bare minimum now
[16:01:06] indeed, and hopefully by deleting everything between runs we avoid space issues
[16:01:14] yes
[16:01:59] the overnight error was odd though ...
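A sketch of the kind of `env` file described above. ELASTICSEARCH_IMAGE and the MW wrapper line appear in the log; the other variable name and all values here are hypothetical placeholders, not the repo's actual contents:

```sh
# Hypothetical values -- replace the <...> placeholders with real image refs.
ELASTICSEARCH_IMAGE=<registry>/elasticsearch:<tag>   # cindy's special elastic image
FRESH_IMAGE=<registry>/fresh:<tag>                   # assumed name for the pinned fresh version
MW="./mw --no-interaction"                           # from the log: non-interactive mwcli wrapper
```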
destroy was claiming to destroy everything but then failing to delete the network. docker reported no containers exist, nothing attached to the network. starting a new instance got an old elasticsearch data volume. But once i restarted the docker daemon it was all happy again
[16:02:03] Dinner
[16:02:11] not a great sign if the docker daemon is freaking out after a day of runs :( but will see
[16:03:56] :/
[16:31:28] re: java8 on bullseye, looks like 'require ::profile::java::java_8' is no longer enough
[16:35:43] lunch, back in ~1h
[17:19:41] back
[17:20:24] dinner
[17:20:33] back
[17:20:39] oops! Double post
[19:11:46] ryankemper: I'm moving T333656 to our current work board (and to in progress)
[19:11:47] T333656: Decommission query-preview.wikidata.org - https://phabricator.wikimedia.org/T333656
[19:11:57] ack
[19:19:19] inflatador: is T331303 done? from the comment in etherpad, I would think so...
[19:19:19] T331303: Create tools for banning/unbanning Elastic nodes - https://phabricator.wikimedia.org/T331303
[19:22:10] gehel: if you want to add a ref to https://phabricator.wikimedia.org/T335066 and close it that's fine. I'd like to do a 'ban by row' feature eventually
[19:22:39] (linked phab ticket should satisfy the 'Create an alert for nodes that remain banned past a certain time threshold' criterion)
[19:23:41] done
[19:26:03] weekly status update published on https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2023-04-22 (and Asana)
[19:41:18] quick break, back in ~15
[19:54:04] back
[19:56:27] inflatador / ryankemper: I see a data transfer, I assume that the java deployment worked?
[19:56:52] Did you remove the condition and activate the java profile on all servers?
[19:57:01] gehel: Y, java is installed
[19:57:23] Cool!
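For the stuck-network symptom above (destroy succeeds, but the network lingers until a daemon restart), docker itself can at least report what it thinks is still attached before resorting to a restart. A sketch with an assumed network name, guarded so it is a no-op on hosts without docker:

```shell
# NET is an assumption; the real mwdd network will have a different name.
NET="${NET:-bridge}"
if command -v docker >/dev/null 2>&1; then
  # How many endpoints does the daemon think are still attached?
  docker network inspect "$NET" --format '{{len .Containers}} endpoints attached'
  # Remove every network no container uses (built-in networks are skipped):
  docker network prune -f
else
  echo "docker not installed; skipping"
fi
```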
[19:58:03] yes, some errors around blazegraph or whatever not being able to start up so we kicked off a data xfer to see if it'll work fully once that `data_loaded` flag is set
[19:58:16] gehel: we have not (yet) removed the condition for the non-bullseye servers
[19:58:36] on the other hand, the data transfer keeps failing
[19:58:49] I'm going to try a different host if this one fails
[19:59:19] Might be a good idea to not deploy that on all hosts just before a long weekend :)
[20:35:25] don't worry, not doing that ;)
[21:38:36] data xfer for wdqs2009.codfw.wmnet is still going... will check tomorrow but for the most part, see ya Monday!