[04:50:30] [Q] Does anyone know why my query at Quarry https://quarry.wmcloud.org/query/62008 took 551.95 seconds to find a single revision in the last month? What other alternatives do I have for getting the revision in the last month?
[04:51:44] What other alternatives do I have for getting all the revisions performed in the last month?*
[05:02:02] I'm going to go with "wikidata is big"
[05:03:54] though I'd say it's unlikely the query plan will change much between LIMIT 1 and no limit
[05:33:03] though for the last month, using recentchanges will work best https://quarry.wmcloud.org/query/62012
[05:38:54] https://quarry.wmcloud.org/query/62011 the query with revision ended up returning more rows, but that's because there are 31 days in January
[10:10:46] rdrg109: MySQL can only use an index for a column if the query doesn’t do any operations/calculations on that column, so don’t do that
[10:10:55] https://quarry.wmcloud.org/query/62021 finished in 0.09 seconds
[10:12:04] I believe JOINs are also recommended over IN (SELECT …) conditions, i.e. https://quarry.wmcloud.org/query/62022, but it looks like MySQL was able to optimize that in this case
[10:20:17] for some reason the query becomes slow when doing rev_timestamp >= DATE_FORMAT(…), I would’ve thought that should still be just as fast
[10:20:29] but https://quarry.wmcloud.org/query/62023 works if you don’t want to hard-code the timestamp in the query
[12:35:36] !log tools.lexeme-forms deployed c1d6a79ed2 (update Odia nongendered adjectives)
[12:35:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[14:22:29] !log tools creating a cluster of 3 bullseye redis hosts for T278541
[14:22:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:22:33] T278541: Toolforge: migrate redis servers to Debian Buster or later - https://phabricator.wikimedia.org/T278541
[14:41:12] !log tools created a neutron port with ip 172.16.2.46 for a service ip for toolforge redis automatic failover T278541
[14:41:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:41:16] T278541: Toolforge: migrate redis servers to Debian Buster or later - https://phabricator.wikimedia.org/T278541
[14:59:43] I'm confused about how read-access to toolforge elasticsearch works.
[15:00:26] It looks like it's keying off the HTTP request type (GET vs POST) to determine if you need to be authenticated or not.
[15:01:02] So, I can do:
[15:01:02] curl -s http://elasticsearch.svc.tools.eqiad1.wikimedia.cloud:80/spi-tools-dev-es-index/_count
[15:01:05] and that works fine.
[15:01:39] But if I try to do that inside python:
[15:01:40] es = OpenSearch('elasticsearch.svc.tools.eqiad1.wikimedia.cloud:80')
[15:01:40] print (es.count(index='/spi-tools-dev-es-index'))
[15:01:57] the client is sending a POST request and that's getting blocked.
[15:05:28] Assuming I'm understanding all that correctly, how do I do a search without giving my web server write access to elasticsearch?
[15:15:21] Or is this what "Elasticsearch does not offer multi-tenant access control in its open source version." is getting at in https://wikitech.wikimedia.org/wiki/Help:Toolforge/Elasticsearch?
[15:16:45] no multi-tenant access control means that every tool with read or write access has read or write access to everything
[15:17:37] I'm assuming the GET/POST filtering is being done at some HTTP proxy layer outside of elasticsearch itself?
[15:19:17] indeed, we have haproxy in front of elasticsearch doing load balancing + access control (= credentials required for any HTTP methods not in [GET HEAD OPTIONS])
[15:19:53] I believe search operations used GET in earlier ES versions, but moved to POST in ES7 I think
[15:20:09] * AntiComposite rolls eyes
[15:20:49] but, even if I figured out a way to force OpenSearch to use GET and ran my app with no credentials, anybody else who has write credentials could delete my data?
[15:21:04] yes
[15:23:38] old man shakes his fist at the sky and rants about how stupid HTTP is in that it conflates both "allows large requests" with "implies write access" into POST.
[15:24:04] * AntiComposite grumbles about people not using PUT
[15:24:55] in theory we could try some limiting based on the URL path but I'm sure someone would find a way around that if they wanted since you aren't intended to be able to have access control restrictions on the open elasticsearch versions
[15:25:23] Yeah, I get that. Freemium and all that.
[15:25:53] "Hey kid, wanna try a database? The first hit's free. Try it, you'll like it"
[15:26:28] opensearch might have some improvements in that regard, but I'm not sure when we're going to find time to upgrade to that
[15:26:39] BTW, full marks for picking haproxy. haproxy rocks.
[15:28:00] indeed! it's powering our kubernetes stack too, both web service and api server traffic
[15:28:40] I ran it for a bunch of years. I don't think I've ever seen a piece of software that was so rock-solid.
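The proxy rule described at 15:19:17 can be modelled in a few lines. This is only an illustration of the rule as stated in the log (credentials required for any method outside GET, HEAD, OPTIONS), not the actual haproxy configuration; the name `needs_credentials` is made up for the example.

```python
# Methods the log says pass through the proxy without credentials.
READ_ONLY_METHODS = {"GET", "HEAD", "OPTIONS"}


def needs_credentials(method):
    """Return True if a request with this HTTP method would require credentials."""
    return method.upper() not in READ_ONLY_METHODS


# curl defaults to GET, so the _count request goes through unauthenticated;
# a client that issues the same request as POST is asked for credentials.
for method in ("GET", "POST", "PUT", "DELETE"):
    print(method, needs_credentials(method))
```

Under this rule the decision depends only on the HTTP method, which matches the report that the same `_count` call works from curl but is blocked when the client sends it as POST.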
[15:31:23] I once attempted to get them to pick up a patch of mine, but they wouldn't. I was kind of annoyed that they blew me off, but I guess that's how you keep stuff stable :-)
[15:31:25] https://github.com/roysmith/haproxy-xuri
[15:41:37] AntiComposite, lucaswerkmeister: Thanks for the help!
[18:21:34] Hi guys, I'm having problems with my pywikibot. It doesn't login anymore from jsub job. When I run the code from console, it works just fine. Any hints please?
[18:26:09] I'm talking about pywikibot script running on toolforge...
[18:27:04] Hey Camel1cz, does the .err file have some output?
[18:27:40] (each grid job should create .out and .err files in your tool's home directory with logs; ideally, it will tell you what's wrong)
[18:29:57] I'm seeing just the login errors.
[18:30:20] Camel1cz: can you paste an example here please?
[18:32:04] WARNING: No user is logged in on site wikipedia:cs
[18:32:05] Logging in to wikipedia:cs as Camel1cz bot@Camel1cz_bot
[18:32:05] WARNING: API warning (main): Subscribe to the mediawiki-api-announce mailing list
[18:32:06] WARNING: API warning (login): Fetching a token via "action=login" is deprecated. Use "action=query&meta=tokens&type=login" instead.
[18:32:06] ERROR: Received incorrect login token. Forcing re-login.
[18:32:07] ERROR: Login failed (Failed)
[18:35:10] urbanecm: I recreated user config and password files with no luck... it does work only from interactive shell, not from the grid
[18:35:40] are you using a virtual environment or the shared pywikibot?
[18:36:17] and also, which exact command do you use to start the bot on the grid?
[18:37:25] AntiComposite: shared pywikibot
[18:38:01] The problem appeared on January 28th w/o any change from my side
[18:40:04] Camel1cz: can you confirm we're talking about the c19dataczbot toolforge tool (and Camel1cz_bot on wiki)?
[18:40:20] urbanecm: Yes
[18:40:53] * urbanecm digs into logs
[18:42:08] urbanecm: Thanks a lot!
[18:46:59] the logs say "Bot password restrictions prevent this login."
[18:47:25] Camel1cz: can you log in as the bot in your browser, go to special:BotPasswords and screenshot the configuration of the bot password you're using?
[18:49:19] (I'm specifically interested if there is any IP range restriction in use)
[18:50:12] (ie. the "Allowed IP ranges:" field)
[18:51:14] urbanecm: originally I used the main bot account to log in. It stopped working and I created the "sub account" (sorry for the terminology)
[18:51:25] I use the IP restriction to 185.15.56.48/32
[18:51:42] that's the issue :)
[18:51:46] 185.15.56.48 is a toolforge bastion
[18:51:56] the grid nodes all have separate IPs
[18:52:12] Give me few sec, Will try it w/o IP restriction
[18:52:30] sure
[18:53:19] (all of Wikimedia Cloud Services would be currently 185.15.56.0/25 and 172.16.0.0/21, if you want to keep an ip restriction on the bot password)
[18:53:55] * urbanecm waves to taavi
[18:54:33] p/
[18:54:34] o/
[18:54:37] oops :D
[18:55:44] to add to what taavi says, Toolforge is just one of the projects running within Wikimedia Cloud Services. In other words, there are non-Toolforge hosts in the two ranges. In practice, it shouldn't make a big security difference though.
[18:56:16] urbanecm: That's it, man! Thank you! (I'll never more try to be secure, I swear! :D
[18:56:53] happy to help Camel1cz. PWB should probably have more meaningful error messages than "ERROR: Received incorrect login token. Forcing re-login." and "ERROR: Login failed (Failed)"
[18:58:18] Hmm, would be nice. I had no idea it's something else than the login/password... anyway big thanks to you! Really, thanks! :)
[18:58:47] Glad I could help 🙂
[18:58:53] and thanks for maintaining the covid bot Camel1cz
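The bot-password failure above comes down to whether the connecting host's address falls inside the allowed range. A minimal sketch using Python's standard `ipaddress` module, taking the ranges quoted at 18:53:19 as given (they were described as current at the time and may have changed since). The grid-node address `172.16.1.50` and the helper name `is_allowed` are made up for illustration.

```python
import ipaddress

# Ranges quoted in the log for all of Wikimedia Cloud Services at the time.
WMCS_RANGES = [
    ipaddress.ip_network("185.15.56.0/25"),
    ipaddress.ip_network("172.16.0.0/21"),
]

# The restriction originally set on the bot password: a single bastion address.
BASTION_ONLY = [ipaddress.ip_network("185.15.56.48/32")]


def is_allowed(address, allowed_networks):
    """Return True if `address` falls inside any of the allowed networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in allowed_networks)


bastion = "185.15.56.48"
grid_node = "172.16.1.50"  # hypothetical address, not a real host from the log

print("bastion, /32 restriction:", is_allowed(bastion, BASTION_ONLY))
print("grid node, /32 restriction:", is_allowed(grid_node, BASTION_ONLY))
print("grid node, WMCS ranges:", is_allowed(grid_node, WMCS_RANGES))
```

With the `/32` restriction only the bastion matches, which is consistent with the login working from the interactive shell but failing from grid jobs.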