[02:53:09] Some of them aren't merely spam; they're trolling (re @Jan_ainali: We had one a couple of days ago with about ten messages)
[05:56:02] https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-selenium-docker/80983/console
[05:58:35] How can this "script timeout" be solved? I'm currently working on FlaggedRevs, and it seems to be caused by a conflict with VisualEditor (very likely related to some backend deps (node_modules?)). This is the only remaining test that doesn't pass.
[08:53:59] I have a question about a database I found for Entity Linking.
[08:54:00] There are 5 million rows with QIDs, and I need to fetch the following from Wikidata for each one:
[08:54:02] sv label
[08:54:03] sv description
[08:54:05] sv aliases
[08:54:06] How do I best do that using Python?
[08:54:08] Using the dumps in PAWS?
[08:54:09] Which Wikidata dump reading library do you recommend?
[08:54:11] https://github.com/egerber/spaCy-entity-linker/tree/master
[10:38:54] the dumps are just json
[10:48:27] I would probably use quarry for something like that though, the dumps are *huge*
[10:49:33] (i.e. I'd generate a list of all Swedish labels, etc., and then use that subset to extract the data I actually want)
[10:57:13] if I were doing it from a dump anyway, I'd probably use `grep` (no point wasting time parsing an entity that doesn't have any Swedish), `jq` (to extract the data), and `join` (to take the data I'd extracted and the list of ids I'm interested in and only return the intersection of them), but that's not python
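To illustrate the "the dumps are just json" answer and the 08:54 question: the Wikidata JSON dump is one entity per line (wrapped in a single large array, with trailing commas), so plain streaming Python is enough and no dedicated dump-reading library is strictly required. A minimal sketch, assuming the compressed dump sits at `latest-all.json.gz` and `WANTED` holds the QIDs from the entity-linking database (both names are placeholders):

```python
import gzip
import json

# Hypothetical inputs: path to the compressed JSON dump and the QIDs to keep.
DUMP_PATH = "latest-all.json.gz"
WANTED = {"Q1", "Q34", "Q42"}  # in practice: the 5 million QIDs from the database

def sv_terms(dump_path, wanted):
    """Stream the dump, yielding (qid, label, description, aliases) in Swedish."""
    with gzip.open(dump_path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip().rstrip(",")
            # The dump is one big JSON array: skip the opening "[" and closing "]".
            if not line or line in ("[", "]"):
                continue
            # Cheap substring pre-filter before the expensive json.loads call;
            # this mirrors the `grep` idea from the 10:57 message and drops
            # entities that have no Swedish data at all.
            if '"sv"' not in line:
                continue
            entity = json.loads(line)
            qid = entity.get("id")
            if qid not in wanted:
                continue
            label = entity.get("labels", {}).get("sv", {}).get("value")
            description = entity.get("descriptions", {}).get("sv", {}).get("value")
            aliases = [a["value"] for a in entity.get("aliases", {}).get("sv", [])]
            yield qid, label, description, aliases

for row in sv_terms(DUMP_PATH, WANTED):
    print(row)
```

Since the loop never holds more than one entity in memory, it works even though the full dump is far too large to load at once; the main cost is simply decompressing and scanning ~100 GB of JSON.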
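The "generate a subset, then extract what you want" idea from 10:49 also translates to Python directly: once quarry (or the dump scan above) has produced a file of Swedish terms, intersecting it with the 5 million QIDs is a set lookup. A sketch, assuming two hypothetical files, `sv_terms.tsv` (columns: qid, label, description) and `wanted_qids.txt` (one QID per line):

```python
import csv

# Hypothetical file names; adjust to whatever quarry or the dump scan produced.
with open("wanted_qids.txt", encoding="utf-8") as f:
    wanted = {line.strip() for line in f if line.strip()}

# Stream the subset and keep only rows whose QID is in the wanted set:
# the Python analogue of the `join` step in the 10:57 pipeline.
with open("sv_terms.tsv", encoding="utf-8", newline="") as f_in, \
     open("linked_terms.tsv", "w", encoding="utf-8", newline="") as f_out:
    reader = csv.reader(f_in, delimiter="\t")
    writer = csv.writer(f_out, delimiter="\t")
    for row in reader:
        if row and row[0] in wanted:
            writer.writerow(row)
```

Unlike `join(1)`, this doesn't require either input to be sorted, since the QID set lives in memory; 5 million short strings fit comfortably in a Python set.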