[07:17:24] zpapierksi: Could you take a look? https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/710567
[07:17:40] zpapierski: ^
[07:17:52] sure
[07:25:25] two small comments
[07:25:38] let me know once you address them and I'll review again
[07:35:19] Thanks!
[07:36:57] dcausse, zpapierski: looks like wdqs2003 has a full disk. Can you depool and check with ryankemper for a data reset? See -sre for more
[07:37:15] looks like it's me
[10:08:01] zpapierski: uploaded patch. I created a list and used mkString on it.
[10:15:34] sorry for the delay, my laptop had crashed. IntelliJ does that sometimes.
[10:29:39] you don't really need to care (and I forgot to mention this earlier), but Scala has string interpolation
[10:29:44] https://docs.scala-lang.org/overviews/core/string-interpolation.html
[10:31:10] do you know which version of Scala you run those on?
[10:41:31] we are on 2.11.something, so interpolation is probably supported
[11:15:30] zpapierski: uploaded. Looks much neater. Thanks!
[11:27:18] and +2ed, have fun!
[11:28:09] :)
[12:02:40] ebernhardson1: https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/710567 was merged. Can we now deploy rdf-spark-tools and restart the 'process_sparql_query_hourly' job?
[12:20:06] break
[13:39:14] @ryankemper One of my main reasons for wanting to use AWS ES is that I've never worked with ES before and I have an urgent need to get it up and running quickly, since I need to get CirrusSearch deployed ASAP. I also want it to be HA, and AWS makes it easy to set up a multi-AZ configuration very quickly.
[13:41:26] Another issue I have is that upgrading MW, say 1.35 to 1.36 (which is my next project), is a real PITA for me because I have to create duplicates of basically the entire environment by hand (automation with Terraform is the next big project). That means creating EC2 instances, cloning db clusters, etc., and now duplicating (and then indexing) the ES cluster as well.
[13:43:21] That said, once it's working (and I can easily disable it for a time if needed), I can then also spend time learning it better and possibly spinning up my own, especially in a way that is easy to duplicate via Terraform, Helm, etc.
[13:44:47] Right now I'm just in the "get stuff working" phase, because search has been causing us serious performance problems ever since our MW 1.35 upgrade in March. This is just to get over that hurdle (on top of the resizing of the Aurora cluster for the affected database) and give me breathing room for the complete redesign of my wiki environment next year.
[14:57:27] \o
[14:57:40] tanny411: should be able to, i can do that later today if it's ready?
[14:57:58] tanny411: will just need to make sure the jar reference in the patch is correct
[14:59:09] ebernhardson1: sure, that works!
[15:00:16] it's been merged. I think rdf-spark-tools needs to be bumped and then re-do the jobs. You'd know better :p
[15:00:17] * ebernhardson1 was expecting a meeting now, but today is Tuesday. no meeting :)
[15:00:52] tanny411: ok, we probably need to do a release since this was just merged. I think that is done through a Jenkins job but i've never run it
[15:00:57] dcausse: know which button to push?
[15:02:01] yeah, he is out this week. He did the release for me last time, so I wouldn't know that part.
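A minimal Scala sketch of the two approaches discussed earlier in the log (mkString at 10:08, string interpolation at 10:29). The field names and query shape here are invented for illustration only and are not taken from change 710567; the snippet should compile on Scala 2.11 or later.

```scala
// Hypothetical example; not the actual code from change 710567.
object StringBuildingExample {
  def main(args: Array[String]): Unit = {
    val fields = List("wiki", "query_id", "query_time")

    // Approach 1: build the string from a list with mkString(start, sep, end),
    // as in the first version of the patch.
    val withMkString = fields.mkString("SELECT ", ", ", " FROM queries")

    // Approach 2: the s-interpolator substitutes expressions directly into a
    // literal, which usually reads more cleanly than manual concatenation.
    val table = "queries"
    val interpolated = s"SELECT ${fields.mkString(", ")} FROM $table"

    println(withMkString)  // SELECT wiki, query_id, query_time FROM queries
    println(interpolated)  // SELECT wiki, query_id, query_time FROM queries
  }
}
```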
[15:02:58] hmm, pretty sure it's a Jenkins job... I guess poke around and find something named appropriately :)
[15:15:06] should be https://integration.wikimedia.org/ci/job/wikidata-query-rdf-maven-release-docker/build/
[15:36:18] ebernhardson: dcausse is on vacation, something I can help with here?
[15:36:35] (catching up on chat)
[15:36:45] huh, not sure how this is released
[15:37:22] probably via the standard rdf deploy repo, lemme check
[15:37:45] zpapierski: i think it's just that job, i've run it and will update the jar in the analytics repo to match
[15:38:23] I meant the jar release, I'm assuming you need it somewhere
[15:38:54] can you point me to that repo? I can probably figure out what to do based on it
[15:39:45] zpapierski: we just need it on the analytics instance for hadoop jobs, that's done by updating the .jar in the repo
[15:40:04] in wikimedia/discovery/analytics. the tools came from wikidata/query/rdf
[15:40:16] ah, so it's a release then
[15:40:22] not from deploy
[15:40:31] yea
[15:42:29] I see that the release was already done, jars should be in archiva, version 0.3.81
[15:42:53] so somebody clicked the button :)
[15:43:17] (of course I mean the release plan we created for rdf)
[15:43:59] yea i clicked it :) I'm not sure how the jar is otherwise deployed, i imagine something uses it :)
[15:44:17] yeah, that plan is meant for releasing jars to archiva
[15:44:22] did it finish already?
[15:45:28] looks like it's there - https://archiva.wikimedia.org/#artifact-details-download-content/org.wikidata.query.rdf/rdf-spark-tools/0.3.81
[15:53:15] Thank you so much!!
[15:58:28] hmm, we have a Search Platform <> CTO meeting tomorrow. Are we expecting someone else to show up, or cancel?
[17:05:25] justinl: makes sense to me! aws es is totally fine for getting something up and running. longer term, using either terraform or k8s to spin up aws ec2 instances to run elasticsearch on directly is ideal (but to your point, that takes some serious effort)
[17:14:20] ebernhardson: Can you check out the last few replies here, please? https://www.mediawiki.org/w/index.php?title=Topic:We4sil15zp23xbxy I'm trying to give advice to an external MediaWiki user on calling UpdateSearchIndexConfig.php and I'd like a second set of eyes to double-check. Thanks!
[17:31:18] Trey314159: that's about right, the user didn't provide the output of the update script, but i would guess it told them they need --reindexAndRemoveOk or other relevant command line args.
[17:32:10] without the args they called the script with or the output of the script, it's hard to say for sure. The important parts are `--reindexAndRemoveOk --indexIdentifier now`
[17:32:53] Thanks, Erik. I pointed to the docs, which use those params, but I'll say it explicitly.
[17:38:07] ryankemper That's definitely the plan. I intend to move the wiki compute components into AWS EKS (never used it, and have barely used k8s at all, just a bit a couple of years ago), so that might also be a good place for spinning up our own ES, not sure though. It'll definitely take some serious thought and testing.
[22:08:50] Is the Error rate on the WDQS grafana dashboard the number of timeouts we have? https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&from=now-6M&to=now&refresh=1d
[22:19:38] mpham: hmm, the rate is derived from 'Number of queries in error since the start of the application'. Looking a little into the blazegraph code, they increment it for queries with 'abnormal termination'. It's probably more than just timeouts, but hard to say what exactly they consider abnormal
[22:19:51] i could imagine syntax errors in the source queries being counted there, but maybe not as well :S
[22:21:07] ah, ok thanks. Do we know if we're tracking WDQS timeouts anywhere? I found one on the WDQS UI dashboard, but that would just be for the UI
[22:23:01] hmm, not sure. Looking at the names/descriptions of the stats we export from blazegraph, it's not reported there at least :S
[22:24:15] glancing at the UI timeouts, indeed that looks to be something the ui is directly recording
[22:29:41] hmm, ok. thanks for checking