[09:39:49] Hi, I was directed here from #wikipedia-en-help
[09:39:50] Would like to check if it is okay to web scrape Wikipedia for a school project? Community member stw also pointed out to me that there is an API available and that scraping would consume more resources. If I am allowed to scrape, I would like to know how/what I can implement in my code to make it consume the least amount of resources possible.
[09:39:51] For context:
[09:39:51] I would be scraping a list of the song themes there are, and navigating to each theme's category to generate a random list of songs.
[09:42:44] why not use the API instead?
[09:54:16] Ah, for more context:
[09:54:17] I originally wanted to scrape YouTube, but YouTube does not allow scraping unless you have their written permission (which I am trying to get), and the chances of them getting back to me seem quite unlikely… To fulfill the web scraping criteria, I would need to scrape other websites. After looking through the websites needed for my project, Wikipedia seems to be the only one that does not specifically state that it disallows web scraping, other than "Engaging in automated uses of the Project Websites that are abusive or disruptive of the services, violate acceptable usage policies where available, or have not been approved by the Wikimedia community"
[09:59:50] Why scrape?
[10:02:43] If there are "web scraping criteria" from your school, I'd hope those criteria also allow using an API instead
[10:03:29] I will check that out in detail first. Thank you for the swift response!
[16:35:10] Amir1: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/976177 caused logins to break on beta wikis.
[17:28:54] Dreamy_Jazz: fixing it is easy, just create the tables in the beta cluster
[17:29:08] I don't have the rights to do that.
[17:29:15] give me a second
[17:29:25] Also, does it work differently because it's on a shared cluster?
[17:33:39] Dreamy_Jazz: try again
[17:34:01] Thanks. Working for me now.
[18:50:39] I'm pretty sure I know the answer to this, but there used to be a map of the database servers and which version of MariaDB they were on; that's no longer public, right?
[18:51:38] c: I think you mean "dbtree". That does not exist anymore, but there is a placeholder at https://dbtree.wikimedia.org/
[18:52:11] maybe https://noc.wikimedia.org/db.php is helpful?
[18:52:45] I did find the second link last night, and that does show the db names and how they're organized, but it doesn't show the versions used
[18:54:41] c: you can get this information by git cloning the operations/puppet git repo
[18:55:56] c: it's wmf-mariadb106 for most
[20:30:26] https://doc.wikimedia.org/mediawiki-core/master/php/ supports Dark Mode now!
[20:35:14] nice
[21:49:12] @TheresNoTime @legoktm for boosting on the relevant accounts https://wikimedia.social/@wikimediafoundation/111455516252024144
[23:10:44] I am registering my displeasure with the Phabricator 2FA implementation
[23:13:13] please take a number and wait for it to be called
[23:13:42] only after the next cooldown though
[23:13:48] current number is 1. waiting in line, ~2500
[23:14:18] I once tried to be smart and registered multiple 2FA devices there, with the idea that you could use TOTP device #2 while #1 was on cooldown. It didn't work; instead it asked for all of them at once when I tried to do anything, and I almost locked myself out of my account
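
For the API approach suggested at [09:42:44]: a minimal sketch of what the helpers were pointing at, assuming Python with the requests library and the MediaWiki Action API's list=categorymembers query. The starting category "Category:Songs by theme", the User-Agent string, and the contact address are illustrative placeholders, not names taken from the log.

    import random
    import time
    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"
    # Identify your client per the Wikimedia User-Agent policy;
    # the project name and contact address here are placeholders.
    HEADERS = {"User-Agent": "SchoolSongProject/0.1 (student@example.com)"}

    def category_members(category, member_type="page"):
        """Yield members of a category, following 'continue' tokens so
        large categories come back in 500-item batches (one request each)."""
        params = {
            "action": "query",
            "list": "categorymembers",
            "cmtitle": category,
            "cmtype": member_type,  # "page" for articles, "subcat" for subcategories
            "cmlimit": "500",       # maximum batch size keeps the request count low
            "format": "json",
        }
        while True:
            data = requests.get(API_URL, params=params, headers=HEADERS, timeout=30).json()
            yield from data["query"]["categorymembers"]
            if "continue" not in data:
                break
            params.update(data["continue"])  # resume where the last batch ended
            time.sleep(1)  # be polite: pause between requests

    # Pick one theme subcategory at random, then sample songs from it.
    themes = list(category_members("Category:Songs by theme", member_type="subcat"))
    theme = random.choice(themes)
    songs = list(category_members(theme["title"]))
    print(theme["title"])
    print([s["title"] for s in random.sample(songs, k=min(5, len(songs)))])

This needs only a handful of API requests instead of crawling rendered HTML pages, which is the resource saving the channel was describing.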