[03:24:35] This conversation about beta never being resourced makes me feel like it is 2014 again 😊
[03:53:18] !log codesearch Deploying fix for T321347
[03:53:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Codesearch/SAL
[03:53:23] T321347: Codesearch: "excludeFilePath" option should be preserved when switching - https://phabricator.wikimedia.org/T321347
[09:32:53] !log lucaswerkmeister@tools-sgebastion-10 tools.bridgebot Double IRC messages to other bridges
[09:32:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[09:43:42] !log taavi@tools-sgebastion-11 tools.wikibugs toolforge jobs restart irc
[09:43:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[10:35:07] !log qrank bump object storage quota: aborrero@cloudcontrol1005:~ $ sudo radosgw-admin quota set --quota-scope=user --uid=qrank\$qrank --max-size=40G (T360162)
[10:35:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Qrank/SAL
[10:35:12] T360162: Increase Object Storage quota for QRank - https://phabricator.wikimedia.org/T360162
[11:34:15] !log chlod@tools-sgebastion-10 tools.copyvios Blocked Arquivo-web-crawler UA to try solving intermittent lockups
[11:34:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.copyvios/SAL
[11:43:04] !log lucaswerkmeister@tools-sgebastion-10 tools.bridgebot Double IRC messages to other bridges
[11:43:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[11:51:03] !log hoo@tools-sgebastion-10 tools.stewardbots ./stewardbots/StewardBot/manage.sh restart # RC reader not reading RC
[11:51:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[15:17:10] !log bd808@tools-sgebastion-10 tools.wikibugs-testing Restarted irc task after yet another failed overnight test of redis client stability. Last message at 12:59Z.
[15:17:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs-testing/SAL
[15:20:24] On https://wsstats.toolforge.org/stats/all/alltime I'm hitting 429 status codes; however, I don't have the same issues locally. Any idea where this is coming from?
[15:21:04] Also, is there a way to whitelist this specific page? (Rearchitecting the whole app is going to be a pita.)
[15:24:51] that is coming from the toolforge front proxy, and no, there is currently no mechanism to exclude specific pages from it
[15:27:06] What are the limits (maybe I can just make it infinite scrolling 🤔)?
[15:41:02] andrewbogott: Can I get your +2 on https://gerrit.wikimedia.org/r/c/operations/puppet/+/1009784 ?
[15:41:25] @sohom_datta: I think the nginx front proxy ratelimit is set to 100 requests/second/origin ip, but maybe I am reading the config incorrectly.
[15:41:43] 50 per https://gerrit.wikimedia.org/g/cloud/instance-puppet/+/24c0ed9ba3b3a93960122a988d6e6d188faa12e8/tools/tools-proxy.yaml#2
[15:42:16] ah, thanks taavi. I was grepping the ops/puppet and not also that repo
[15:42:35] codesearch indexes both :-)
[15:42:50] Oh, that explains it :( (re @wmtelegram_bot: 50 per https://gerrit.wikimedia.org/g/cloud/instance-puppet/+/24c0ed9ba3b3a93960122a988d6e6d188faa12e8/tools/tools-proxy...)
[15:43:11] Time to rip out the code
[15:47:43] dancy: want me to merge it?
[15:47:51] Yes please!
[15:49:29] Thanks much
[16:16:19] !log paws upgrade jupyterlab T360193
[16:16:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[16:16:24] T360193: jupyterlab to 4.1.5 - https://phabricator.wikimedia.org/T360193
[16:21:22] jhathaway: can I enable puppet on vrt.ldap-dev.eqiad1.wikimedia.cloud ?
[16:21:28] "Puppet is disabled. postfix - jhathaway"
[16:22:45] Yes please do
[16:23:45] jhathaway: there are 7 other hosts in that project that I can't reach with cumin, presumably because of having broken or disabled puppet. Any chance you could go through that list?
[16:23:59] https://www.irccloud.com/pastebin/vH4FACwg/
[16:24:09] obviously deleting unused VMs is also an acceptable solution :)
[16:33:49] !log lucaswerkmeister@tools-sgebastion-10 tools.lexeme-forms deployed 8f4985e682 (improve tests; should have no production impact but I pulled+restarted anyway ^^)
[16:33:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[18:09:17] Hi all, bringing this question in from the WMF Slack for posterity in this channel. Initially I was having some trouble setting up an adequate python virtual environment in toolforge, trying to specify that I only want to use the open-source compliant/CPU accessible version of pytorch. I'm using an annoy embedding db
[18:09:18] (https://github.com/spotify/annoy) to quickly do a vector-based natural language search of a dataset of ~260,000 public sparql queries, and the non-CPU pytorch is ~750mb and also pulls in a bunch of closed source nvidia dependencies.
[18:09:18] I was eventually able to figure that out, but now I'm having some issues where the container for the tool doesn't have enough memory, so it dies during start-up. The venv is ~1gb, the annoy vector embedding db is ~750mb and the actual dataset is ~1gb. So far I've tried upping the memory to 1gb, upping to 3 cpus, etc. but still having a hard time
[18:09:19] getting the site to start.
[18:09:19] Next thing I'm trying is dropping all unnecessary fields from the dataset, which should hopefully take it down to <200mb.
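A minimal sketch of the field-dropping step mentioned at 18:09:19, assuming the dataset is stored as a CSV file. The log does not say what format the dataset uses, so the helper name `keep_columns` and the column names in the usage lines are hypothetical. Rows are streamed one at a time, so trimming the file does not itself need much memory.

```python
import csv
import io


def keep_columns(src, dst, wanted):
    """Copy CSV rows from src to dst, keeping only the columns in `wanted`.

    Rows are streamed one by one, so memory use does not grow with
    the size of the input. Returns the number of data rows written.
    """
    reader = csv.DictReader(src)
    # extrasaction="ignore" drops every column that is not in `wanted`.
    writer = csv.DictWriter(dst, fieldnames=wanted, extrasaction="ignore")
    writer.writeheader()
    rows = 0
    for row in reader:
        writer.writerow(row)
        rows += 1
    return rows


# Usage with in-memory files; the column names are made up for illustration.
source = io.StringIO("query,timestamp,user_agent\nSELECT 1,2024-01-01,bot\n")
trimmed = io.StringIO()
keep_columns(source, trimmed, ["query"])
```

With real files, `src` and `dst` would be opened with `newline=""`, as the `csv` module documentation recommends.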
[18:09:35] !log anticomposite@tools-sgebastion-10 tools.stewardbots SULWatcher/manage.sh restart # SULWatchers disconnected
[18:09:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[18:14:44] re "pulls in a bunch of closed source nvidia dependencies"
[18:15:01] this is prohibited by https://wikitech.wikimedia.org/wiki/Help:Toolforge/Rules
[18:15:14] All code in the Tools project must be published under an OSI approved open source license
[18:16:10] I know. I purposefully pip uninstalled the resource-hungry version of pytorch and all nvidia dependencies and reinstalled the CPU-optimized version, which has no closed-source dependencies
[18:19:43] as far as resources, you might be able to use the Grafana dashboard to see what the tool is actually using (dashboard linked from k8s status for the tool, e.g., https://k8s-status.toolforge.org/namespaces/tool-jjmc89-bot/)
[18:21:24] I find that that doesn't quite sample fast enough to catch sharp spikes on startup, especially if the pod gets killed
[18:22:00] yea, it isn't good with short spikes
[18:22:36] Update on the pod getting killed: making the dataset significantly smaller by dropping unnecessary columns has allowed it to actually start, which is great
[18:23:12] I do think the annoy db vector search is still too CPU-intensive, though. I'll try making it smaller
[19:41:56] !log lucaswerkmeister@tools-sgebastion-10 tools.lexeme-forms deployed 272a303c09 (Danish adverbs)
[19:41:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[20:42:26] !log jjmc89@tools-sgebastion-11 tools.eranbot bump enwiki eswiki frwiki job cpu to 1
[20:42:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.eranbot/SAL
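Regarding the 429 responses discussed between 15:20 and 15:43: a client that issues many requests can avoid tripping the front proxy limit by spacing them out. The sketch below assumes the limit is 50 requests per second per origin IP, which is what the 15:41 messages suggest but the log does not state outright. The `Throttle` class and its parameters are hypothetical illustrations, not part of any Toolforge tooling.

```python
import time


class Throttle:
    """Space out calls so that at most `rate` of them happen per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


# Assumed limit: 50 requests/second. Staying a little below it leaves headroom.
throttle = Throttle(rate=40)
for _ in range(3):
    throttle.wait()
    # ... issue one HTTP request here ...
```

The clock and sleep functions are injectable only so the class can be exercised without real delays. This approach helps a script or backend that makes the requests itself; it does not help when the requests come from many browser sessions behind one IP.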