[02:35:30] Hi, heads up that we are moving regex evaluation for wdqs requests away to shellbox in a week or two. It'll take around 5K reqs/min (up to 50K reqs/min) off wdqs
[06:18:16] Amir1: nice! thanks for the heads up :)
[06:36:15] hello folks
[06:36:22] elastic2043 is down again :(
[06:36:31] I acked the alert and re-opened https://phabricator.wikimedia.org/T281327
[06:48:27] elukey: thanks!
[06:52:56] aand wcqs is up again
[06:53:04] dcausse: how best to verify skolemization?
[06:55:59] zpapierski: a query like select * { sdc:M106076433 wdt:P170 ?o } should return https://commons.wikimedia.org/.well-known/genid/XYZ
[06:56:15] and not things like: t8237469
[06:57:23] this will test that the munger was run with --skolemize
[06:59:28] then adding a filter like FILTER ( wikibase:isSomeValue(?o)) to the same query should work and will test that blazegraph has been launched with the proper settings
[07:00:44] ok, so bad news :(
[07:01:01] https://tinyurl.com/y3wax4nk
[07:01:08] I see those t* values
[07:01:16] :/
[07:01:18] ahh
[07:01:22] I know what I did
[07:01:25] my fault
[07:01:38] next time it will run correctly (next week)
[07:02:18] or actually
[07:02:23] no, I ran with --skolemize
[07:02:25] weird
[07:02:45] we can check the munged files
[07:03:18] what to look for?
[07:03:53] for ".well-known"
[07:04:09] or _:XYZ
[07:04:15] the latter are blank nodes
[07:04:51] and the former?
[07:04:54] opening the first chunk, I'm sure there will be either one of these very soon
[07:05:23] the former would mean that --skolemize did work properly but the import took some other dump files for some reason
[07:07:41] java.lang.IllegalStateException: [https://commons.wikimedia.org/.well-known/genid/] must be part of the vocabulary
[07:07:42] wdt:P170 _:73ca7d9dc9a9206336270d8276398bf6 ; ?
[07:07:52] these are blank nodes
[07:08:03] what did you paste?
[07:08:26] a query failing when using wikibase:isSomeValue(?o)
[07:10:23] the RWStore is probably still pointing at this old vocab, checking
[07:12:40] hm.. no
[07:14:51] ah /etc/wcqs/RWStore.wcqs.properties
[07:15:21] com.bigdata.rdf.store.AbstractTripleStore.vocabularyClass=org.wikidata.query.rdf.blazegraph.WikibaseVocabulary$V004 should be V005 :(
[07:15:32] so it is pointing to the old one
[07:15:39] config change should fix the issue?
[07:16:10] removing -DwikibaseSomeValueMode=skolem will fix the problem immediately
[07:16:35] but will also remove skolemization?
[07:16:40] yes
[07:16:59] I still haven't announced the service being back up, so we can experiment for a bit
[07:17:19] either we extend the downtime and get to the bottom of this skolemization problem or we cancel enabling skolem for now
[07:18:27] Let's give ourselves some time
[07:18:45] we need to get it running there eventually, otherwise no streaming updater for WCQS
[07:26:03] makes sense, we first need to understand why the munger did not skolemize the blank nodes
[07:26:29] might be a bug with federation
[07:27:36] zpapierski: I'm on some reviews and then on k8s, would you mind looking into this?
[07:27:45] I am
[07:27:48] thanks!
[07:33:33] but that might be a short investigation, I'm not seeing any obvious ways of changing properties of an existing namespace :(
[07:34:16] hmm, maybe a cli override...
[07:39:52] for changing the vocabulary?
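A quick way to sanity-check the vocabulary setting being discussed, a minimal sketch assuming the file path and the V004 -> V005 change quoted above:

    # Minimal check of the Blazegraph namespace vocabulary; path and expected
    # class are taken from the conversation above.
    grep 'AbstractTripleStore.vocabularyClass' /etc/wcqs/RWStore.wcqs.properties
    # Expected after the fix:
    #   com.bigdata.rdf.store.AbstractTripleStore.vocabularyClass=org.wikidata.query.rdf.blazegraph.WikibaseVocabulary$V005
    # A stale $V004 value is what makes the skolem IRI prefix "not part of the
    # vocabulary" (the IllegalStateException pasted earlier).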
[07:40:12] for changing anything other than defaults, but I keep looking
[07:41:17] funny how many of the things you find online about blazegraph are actually about us :)
[07:41:28] (also not really helpful :) )
[07:42:02] yes, I often encounter bugs created by Stas :)
[07:42:39] s/created/reported to be precise :)
[07:44:15] right :)
[07:48:00] API is a no-go, it assumes I want to update with a query, even when run on the properties endpoint
[07:50:49] ok, I'm probably overinvesting here - I'll remove the skolemization for now, but update the property store. Since the namespace is created anew each time, after the next update it should be fine
[07:51:22] and since a new dump arrived yesterday, I might as well start the next update right now - so we should be able to try it out on Friday
[07:53:41] ah, it WAS my fault, I didn't update WCQS after your change to RWStore.properties
[07:54:20] I thought that RWStore.wcqs.properties was managed by puppet
[07:55:44] ah, right, that's a different one than what WCQS uses
[07:55:51] got confused, too many of those
[07:57:52] I'm assuming inlineURIFactory is also required
[07:58:31] dcausse: I don't want to self-merge, care to do the honors? https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/698452
[07:59:07] sure
[07:59:20] thanks!
[07:59:44] (sorry I thought it was the deploy repo)
[08:00:31] yes, vocab and inlineURIFactory must be updated
[08:02:54] zpapierski: the way this import has been done, did it include the --skolemize option like in the patch above?
[08:08:57] yep, I did the manual update of the script before running it
[08:09:31] and apparently blank nodes are there
[08:20:58] gehel: when convenient? https://gerrit.wikimedia.org/r/c/operations/puppet/+/698949
[08:21:11] looking
[08:21:41] thanks!
[08:21:53] trivial enough: merged
[08:24:41] so something is not working in these scripts or in the code itself :/
[08:38:48] why? they had old settings in RWStore, isn't that the issue?
[08:39:23] anyway, for now, after an update (I hadn't updated in a while, it seems), blazegraph sees no namespaces :(
[08:44:18] puppet agent not working correctly either, weird
[08:45:36] zpapierski: need some help on puppet?
[08:45:53] sure
[08:46:06] https://www.irccloud.com/pastebin/1Jbid6CO/
[08:46:23] that's what I got after updating to the recent production branch
[08:46:42] yesterday it was fine, but I hadn't updated the puppet master for quite a while
[08:47:01] I have no idea what pki is
[08:48:05] interesting... on which server are you running?
[08:48:15] zpapierski: there are two issues I think
[08:48:56] gehel: puppetmaster is run from wdqspuppet.wikidata-query.eqiad.wmflabs
[08:49:17] wcqs-beta-01.wikidata-query.eqiad.wmflabs is the instance that uses it
[08:49:30] 1/ RWStore using the old vocab, 2/ the munger not skolemizing the blank nodes
[08:49:46] ah, so the _:XYZ is actually incorrect?
[08:50:09] strange, I can't SSH into that node
[08:50:13] * gehel is checking ssh config
[08:50:18] yes, it should be the skolem IRI: https://commons.wikimedia.org/.well-known/genid/XYZ
[08:51:14] _:XYZ is a labelled blank node that we want to transform to https://commons.wikimedia.org/.well-known/genid/XYZ
[08:51:16] gehel: it is weird, I'm having no such issues
[08:51:20] zpapierski: you just need a hiera config, let me recall what's needed
[08:51:20] I see
[08:51:47] the PKI is our internal certificate service and there is an installation in wmcs too IIRC
[08:52:13] yeah, but I can't (couldn't?) use WMCS because of our own secrets
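Looping back to dcausse's earlier suggestion to inspect the munged files, a rough sketch of that check; the /srv/wdqs-data/munged/ path is an assumption based on the data-reload command used on this host:

    # Rough check of a munged chunk for leftover labelled blank nodes.
    # The chunk location is an assumption; adjust the glob to the actual layout.
    chunk=$(ls /srv/wdqs-data/munged/wikidump-*.ttl.gz 2>/dev/null | head -n1)
    echo "skolem IRIs (expected when --skolemize worked):"
    zcat "$chunk" | grep -c 'well-known/genid' || true
    echo "labelled blank nodes (should be absent after skolemization):"
    zcat "$chunk" | grep -c '_:' || true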
[08:52:22] zpapierski: ssh config updated, I can get in
[08:53:03] volans: but if there's a configuration I can get into our puppetmaster, I can just store it in a local commit (like I do for the oAuth secret now)
[08:53:15] oh no, still another ssh issue
[08:54:10] zpapierski: check if the labs/private repo checkout auto-sync is working correctly
[08:54:17] that key is already defined there (with a dummy value)
[08:55:13] if you have local modifications it can't auto-rebase itself with prod
[08:55:18] *uncommitted local changes
[08:55:52] fwiw I can't get in with a global wmcs root key, getting a timeout, so I can't check it directly
[08:56:20] I'm not sure what you mean - we have a manually created puppetmaster that uses the puppet repo, the public one.
[08:57:10] I guess you followed https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster
[08:58:00] not sure, not my handiwork
[08:58:03] the puppet repo is composed of 2 repos, the public puppet repo and a "private" one that resides on the puppetmasters only in production. There is a dummy version of that (see labs/private.git in that page) that's used on all WMCS instances
[08:58:15] ah, I see
[08:58:20] *puppetmaster instances
[08:58:38] it should be in /var/lib/git/labs/private
[08:59:07] and there is a cron/timer every 10m or so that keeps the 2 repos in sync with master, rebasing them
[08:59:15] and it's there
[08:59:22] as long as there are no local modifications or rebase conflicts
[08:59:26] with commits as recent as June 8th
[08:59:34] so I guess it works...
[09:00:01] if they happen then the sync stops working and either one of the 2 repos might fall behind
[09:00:35] can you git grep profile::pki::client::auth_key there?
[09:00:43] there should be 2
[09:01:12] and they are
[09:01:18] s/they/there
[09:01:39] and the puppet repo too is in sync? /var/lib/git/operations/puppet
[09:02:28] yep, also around June 8th
[09:03:04] that should be now
[09:03:07] *that one
[09:03:18] we commit to puppet all the time
[09:03:26] * volans just merged one patch
[09:04:26] yep now it is here too
[09:04:34] on June 9th
[09:05:32] and now puppet runs correctly!
[09:06:38] huh? why?
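For reference, a small sketch of the spot checks volans walks through above, using the paths given in the conversation:

    # Spot-check the standalone puppetmaster's repo sync. Uncommitted local
    # changes or a rebase conflict in either repo can stop git-sync-upstream
    # from keeping them up to date.
    for repo in /var/lib/git/operations/puppet /var/lib/git/labs/private; do
        echo "== $repo =="
        git -C "$repo" status --short                  # anything here can block the auto-rebase
        git -C "$repo" log -1 --format='last commit: %ci %s'
    done
    # The dummy secret should already be defined in labs/private (volans expects two hits):
    git -C /var/lib/git/labs/private grep profile::pki::client::auth_key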
[09:07:04] I did rebase again, but that added a single, non-relevant commit
[09:07:10] yes, it does work
[09:07:13] check /var/log/git-sync-upstream.log
[09:07:20] it should tell you why the sync was not working properly
[09:07:29] the cron should be in /var/spool/cron/crontabs/root
[09:07:36] with
[09:07:38] */10 * * * * /usr/local/bin/git-sync-upstream >>/var/log/git-sync-upstream.log 2>&1
[09:08:01] if you had local modifications or rebase conflicts it would not have been able to update the copy
[09:08:11] and so you got an updated private repo but not the puppet repo
[09:08:13] no, the private repo wasn't touched
[09:08:20] in either of those 2
[09:08:25] even public was done with a commit on top, so rebase works properly
[09:08:35] unless there is a conflict
[09:08:43] true
[09:09:12] anyway, since that works, I need to verify wcqs
[09:28:23] early lunch
[09:54:20] next time we do a reindex we should drop the ores_articletopics and ores_articletopic fields
[10:02:21] Errand
[10:05:50] Trey314159, ebernhardson if you run the next round of reindexes please add --fieldsToDelete ores_articletopics,ores_articletopic to the list of UpdateSearchIndexConfig.php script options (updated T147505)
[10:05:51] T147505: [tracking] CirrusSearch: what is updated during re-indexing - https://phabricator.wikimedia.org/T147505
[10:17:59] lunch
[11:54:29] dcausse: I'll leave wcqs alone for now, I disabled the skolemization for blazegraph, but updated RWStore so it should create the next index with the proper settings
[11:54:47] munging itself doesn't require a live wcqs service, so no need to block it
[12:03:16] Hi, is it possible to configure the stemming filter from Cirrus or Elastica? I couldn't find any setting in CirrusSearch/docs/settings.txt
[12:38:36] dcausse: I've started another update - I have a feeling that the issue was that the old version of the munger was there before - now it's updated, we'll see soon if that's the issue
[12:53:01] zpapierski: sure, is it importing right now?
[12:53:36] to check if the system is running properly, this query must return results: select * where {wd:M106076433 wdt:P170 ?o . filter(wikibase:isSomeValue(?o))}
[12:55:20] absorto: a few things can be configured (ICU folding) but mainly you set the language of the wiki and if you have the right plugins installed you'll get the configuration made for this language. Do you have something specific you would like to configure?
[12:55:51] dcausse: it's not, munging is happening now, but the first file is available so I can verify
[12:57:42] there are still _:XYZ values :(
[12:57:45] my bad, I pasted the wrong query
[12:57:47] select * where {sdc:M106076433 wdt:P170 ?o . filter(wikibase:isSomeValue(?o))}
[12:57:49] :(
[12:58:01] so it wasn't that
[12:58:16] dcausse: yes, it is configured correctly. I am looking to make the queries return a little bit more results, and thought about maybe changing the stemming. e.g. searching for General should also return Generally
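For the record, roughly what dcausse's reindexing request above could look like on the command line. Everything other than the --fieldsToDelete flag (the wiki, the wrapper, and the reindex options) is a placeholder, not the exact invocation used for production reindexes:

    # Illustrative only: adding the field cleanup requested above to a reindex
    # run. Wiki name and the other options are placeholders.
    mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php \
        --wiki=enwiki \
        --reindexAndRemoveOk --indexIdentifier=now \
        --fieldsToDelete ores_articletopics,ores_articletopic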
[12:59:15] absorto: unfortunately there are no settings to configure synonyms
[12:59:45] dcausse: something's wrong with the command line actually
[12:59:58] java -cp /srv/wdqs-package/lib/wikidata-query-tools-0.3.70-jar-with-dependencies.jar org.wikidata.query.rdf.tool.Munge --from /srv/wdqs-data/latest-mediainfo.ttl.gz --to /srv/wdqs-data/munged/wikidump-%09d.ttl.gz --labelLanguage emize --singleLabel emize --skipSiteLinks --chunkSize 50000 --wikibaseHost commons.wikimedia.org --conceptUri http://www.wikidata.org --commonsUri
[13:00:00] https://commons.wikimedia.org
[13:00:09] I don't see the skolemize option being passed
[13:00:38] https://www.irccloud.com/pastebin/6xuHIUnO/
[13:00:49] that's the original command
[13:01:46] meh, thanks for your help, dcausse
[13:02:06] --labelLanguage emize --singleLabel emize - this looks somewhat weird
[13:02:27] --skol being expanded?
[13:02:33] it should probably be after --
[13:02:48] yes, --skolemize must be after --
[13:03:02] lemme check your patch
[13:03:27] this looks correct: https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/698452/2/dist/src/script/wcqs-data-reload.sh
[13:03:29] don't bother, I changed that
[13:03:38] yeah, I was experimenting and I left that there
[13:03:43] ah ok
[13:04:39] and we're off
[13:07:13] wdt:P170 .
[13:07:20] that looks promising :)
[13:08:30] ok, I'm leaving that going, once it's done, I'll restore the blazegraph option for skolemization
[13:09:55] \o/
[13:12:31] errand
[13:33:47] dcausse: another thing that crossed my mind, would it help to use dictionary stemming? what is Cirrus using by default, algorithmic stemming?
[13:36:24] absorto: for english it's using kstem (http://ciir.cs.umass.edu/pubfiles/ir-35.pdf)
[13:36:44] actually you can see what we configure on the wikis using: https://en.wikipedia.org/w/api.php?action=cirrus-settings-dump
[13:36:57] search for the "text" entry
[13:37:30] dcausse: perfect, thanks for the pointers
[13:38:56] absorto: if you want to customize this you can hack some php code; all this is being done in include/Maintenance/AnalysisConfigBuilder.php
[13:40:18] dcausse: wow! this is going to be very handy
[13:42:40] as for setting up dictionaries for synonyms, we have thought about that but never really started anything, as we don't know how to properly maintain such dictionaries, and since we have so many languages to support it's a bit unfair to focus solely on english
[13:42:51] Trey314159: might have more thoughts on this :)
[13:49:07] Makes sense, I would be an enormous task
[13:49:17] *it would
[14:33:53] ebernhardson: 1-on-1?
[14:34:06] gehel: oh, sorry i started responding to gerrit :P
[14:34:14] :)
[14:34:29] let's do our 1-on-1 as gerrit CRs :)
[17:57:41] hmm, random idea: we could probably completely avoid instrumenting mediasearch, beyond changing the api url, by hacking something into cirrus that turns on the test buckets even though it's an api request, since we can ensure it's uncachable
[19:27:08] meh... double-checking the mjolnir bulk daemon graphs for the weekly data load, it looks like we did manage to fix the problem with losing hourly updates, but didn't manage to fix the problem where the weekly dump blocks the hourly ones...
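As a side note on the analysis-chain question from earlier in the afternoon, one way to peek at what is actually configured is the cirrus-settings-dump API dcausse linked above; grepping for the stemmer avoids relying on any particular response layout, and format=json is just the standard MediaWiki API parameter:

    # Confirm the English stemmer (kstem) shows up in the live analysis config.
    # The URL is the one pasted above; format=json is the standard API parameter.
    curl -s 'https://en.wikipedia.org/w/api.php?action=cirrus-settings-dump&format=json' \
        | grep -o '"kstem"' | sort | uniq -c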
[19:27:42] stupid solution: run a separate daemon for the hourly topic
[20:04:44] less stupid solution after googling: Check lag before polling and pause non-priority topics
[20:53:37] ryankemper: I forgot why T270391 was related to T284479 (I have the tabs open as a reminder to do something but forgot what)
[20:53:37] T284479: Cirrussearch: spike in pool counter rejections related to full_text and entity_full_text queries - https://phabricator.wikimedia.org/T284479
[20:53:38] T270391: varnish filtering: should we automatically update public_cloud_nets - https://phabricator.wikimedia.org/T270391
[20:55:26] mpham: the latter (T270391) is an existing effort / discussion around having an automatically maintained list of public cloud IPs (aws, gcp, ibm etc), which is the piece we need in place before we can implement a proper / elegant solution for the former
[20:55:51] oh right, i remember now. thanks
[20:56:01] 👍🏿
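On the "check lag before polling" idea above: a rough way to see whether the hourly topic is falling behind while the weekly dump is being consumed, using the stock Kafka CLI. The broker, group, and topic names here are placeholders rather than the real mjolnir ones, and the pausing itself would have to happen inside the daemon's poll loop:

    # Show per-partition lag for the consumer group and keep only the hourly
    # topic's rows (plus the header). Names are placeholders, not the actual
    # mjolnir configuration.
    kafka-consumer-groups.sh --bootstrap-server kafka-broker:9092 \
        --describe --group mjolnir_bulk_daemon \
        | grep -E 'LAG|hourly'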