[10:39:57] lunch
[16:07:58] errand
[16:30:29] wcqs-beta being down seems to amount to: Wrapped by: org.openrdf.repository.RepositoryException: org.openrdf.sail.SailException: com.bigdata.rdf.sail.webapp.DatasetNotFoundException: namespace=wcqs20210914
[16:31:38] workbench only shows 20210831 and 20210907
[16:35:28] sed: couldn't flush /srv/wdqs-data/sedrZ7vQn: No space left on device
[16:35:47] so, fell over because it ran out of disk (plausibly during a reload)
[16:38:06] dcausse: i don't really know what to do to fix that, i assume something like delete wcqs20210831 and then promote wcqs20210907, not sure how to safely delete though
[16:38:31] but i also have some memory that it's not that easy
[16:49:45] i deleted the dumps and munged files, freed 50G which was enough to get going again. It's on the old data now though
[17:17:34] ebernhardson: yes it happens.. deleting the old journal and restarting the import is the way to go
[17:18:45] dcausse: ok, will do
[17:19:25] dcausse: do i have to stop blazegraph or anything before deleting the journal?
[17:22:02] shrug, it's going down anyways. can't hurt
[17:22:04] (maybe :P)
[17:23:20] Angie had a question: "Is there a way I can search for categories by number? e.g. I want to know which categories on a wiki have more than 100 articles?"
[17:24:25] mpham: not from cirrus, https://quarry.wmcloud.org/ should be able to query that out of the sql databases (but the user has to know sql)
[17:26:00] got it. thanks
[17:29:25] data reload now running for wcqs-beta. Also it seems the ones i started on prod to run over the weekend failed, kicked them off again as well (with 1 hr delay between machines so we don't hammer dumps.wikimedia.org)
[17:36:23] mpham: i dunno if it's exactly what they want, but here's a list: https://quarry.wmcloud.org/query/58761
[17:37:31] ebernhardson: oh thanks!
i'll forward it on
[20:58:44] hmm, comcast must be having fun today....100% packet loss to bast4003.wikimedia.org :P
[21:00:10] equally having problems reaching bast2002. At least 1003 is reachable (on 100ms delay)
[21:06:14] sigh...fun, it also means i can't access the wikis, because of course those all try and route through the same dc as bast4xxx :P
[21:08:10] * ebernhardson takes a 10 minute break and hopes it fixes itself :)
[21:17:24] seems better, hopefully
[22:24:04] Looks like wdqs1004 is lagging (again); also looks like a community member is helping us depool it? T290832
[22:24:04] T290832: wdqs1004 is lagging 5 hours more than all others - https://phabricator.wikimedia.org/T290832
[22:29:20] mpham: mutante/dzahn is an sre, mostly following the docs we put together for those alerts
[22:29:44] oh woops. haha, ok that makes a little more sense now
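The 16:35 failure above (a reload filling /srv/wdqs-data until even sed couldn't flush a temp file) is the kind of thing a pre-flight headroom check before kicking off a reload would catch. A minimal sketch, assuming GNU df; the 100G threshold and the idea of gating the reload on it are assumptions, not part of the documented wdqs/wcqs reload procedure:

```shell
# Hypothetical pre-flight check: succeed only if the given directory's
# filesystem has at least the requested number of gigabytes free.
# Threshold and gating are assumptions, not the documented reload steps.
check_headroom() {
  dir=$1
  need_gb=$2
  # df -BG reports sizes in whole gigabytes; keep only the digits
  free_gb=$(df -BG --output=avail "$dir" | tail -n 1 | tr -dc '0-9')
  [ "$free_gb" -ge "$need_gb" ]
}

# e.g. refuse to start a reload without ~100G of headroom:
# check_headroom /srv/wdqs-data 100 || { echo "not enough disk for a reload" >&2; exit 1; }
```

The log says freeing 50G of dumps and munged files was enough to restart; a check like this just moves that discovery before the multi-hour import instead of partway through it.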
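On Angie's category question (17:23): the MediaWiki `category` table keeps per-category counters (`cat_pages`, `cat_subcats`, `cat_files`), so "articles" can be approximated as pages minus subcategories minus files. Below is a sketch of the kind of SQL one could paste into Quarry; whether the linked query 58761 does exactly this is not shown in the log, and the pages-minus-subcats-minus-files definition of "articles" is an approximation:

```shell
# Build and print a Quarry-style SQL sketch for "categories with more than
# 100 articles". Column names are from the MediaWiki `category` table;
# the "articles" arithmetic is an approximation, not an exact article count.
sql=$(cat <<'SQL'
SELECT cat_title,
       cat_pages - cat_subcats - cat_files AS articles
FROM category
WHERE cat_pages - cat_subcats - cat_files > 100
ORDER BY articles DESC;
SQL
)
printf '%s\n' "$sql"
```

As noted at 17:24, Quarry runs such queries against the wiki replica databases through a web form, so the SQL itself is the portable part here.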
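On wdqs1004 lagging and being depooled (22:24, T290832): in Wikimedia production, pooled state is managed through conftool. A sketch of the shape of that operation, printed rather than executed; the confctl selector syntax follows the conftool convention but is an assumption here, and the SRE may equally have used a host-side `depool` wrapper:

```shell
# Sketch only: build (and print, rather than run) a conftool command that
# would depool a lagging WDQS host. The host name is from the log; the
# confctl syntax is assumed from the conftool convention, not quoted.
host="wdqs1004.eqiad.wmnet"
cmd="confctl select name=${host} set/pooled=no"
echo "would run: sudo -i ${cmd}"
# repooling later would flip the same flag back with set/pooled=yes
```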