[10:16:04] Hi! I'm running `vagrant up` for MediaWiki-Vagrant but the curl install in the VM is failing (curl : Depends: libcurl4 (= 7.74.0-1.3+deb11u13) but 7.88.1-10+deb12u6~bpo11+1 is to be installed). I was wondering how to fix this?? Thanks so much!!!
[10:23:58] Okay so I think it's fixed?? I just told it to downgrade libcurl (sudo apt-get install libcurl4=7.74.0-1.3+deb11u13) and it then let me install curl? Just putting this here in case anyone has the same problem and needs to know what to do. Thanks!
[18:40:15] Seddon: we've got account vanish locks happening on the incorrect wiki: T380527
[18:40:16] T380527: AccountVanishRequest locks should always be done on Meta-Wiki - https://phabricator.wikimedia.org/T380527
[18:50:13] Why are you poking him?
[18:55:56] his team did the work for vanishing
[21:27:35] naive question (related to my previous question in here on 2024-11-11 if anyone wants to look at the logs)
[21:27:44] is it possible to temporarily increase the job run rate of certain jobs on a wiki?
[21:28:08] Commons currently has a relatively large backlog of (I believe) refreshLinks jobs, as a result of edits to several of the CC-* license templates (which are used on millions of files)
[21:28:29] (and even more jobs will come in once some additional edit requests are fulfilled by admins – I was only able to fulfill the template-protected ones)
[21:29:38] I'm guessing there might be something in the job queue/runner config that SRE can tweak...
[21:29:39] Nikki estimated that, looking at how the number of results at https://commons.wikimedia.org/wiki/Special:Search/hastemplate:%22SDC_statement_has_value%22 is currently going down, it would take something like ten years at the current rate :/
[21:29:51] and I'm thinking of changes I've seen like https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/b3f06857b2a719995dc8ac51677bc307acc08669, but I'm not sure if that's the same system
[21:30:29] (obligatory nitpick that any Special:Search-based count will additionally need some other job to run as well, cirrusSearchRefreshLinks or whatever it's called)
[21:30:45] (but refreshLinks is the one that would drop millions of rows from the templatelinks table and make the DBAs happier ^^)
[21:32:33] ah, that's why some category changes are being slow
[21:32:46] changeprop-jobqueue should be the thing, I'm pretty sure, lucaswerkmeister.
[21:32:48] I'd suggest filing something under WMF-JobQueue, marking it maybe High (not UBN), and seeing if you can get some SRE help
[21:32:50] I think that folks like cdanis might be able to explain more / help reason about tuning
[21:33:29] alright, I'll make a task and CC them
[21:33:39] and AntiComposite, I wasn't aware that this had this impact :/ probably worth noting on the task as well then
[21:33:43] I picked cdanis to ping because they helped me with some job queue info not too long ago.
[21:34:38] I'm out of work time today unfortunately, but feel free to CC me and I'll try to get serviceops' attention tomorrow. (also please tag the task serviceops)
[21:35:08] will do, thanks!
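To expand on the libcurl workaround mentioned at the start of the log, here is a minimal sketch of the same fix, assuming a Debian bullseye guest where libcurl4 was upgraded from backports and now conflicts with the curl package from the main release. The apt-mark hold step is an extra precaution not mentioned in the log.

```
# Downgrade libcurl4 to the bullseye release build (the version curl depends on),
# then install curl against it -- this is the workaround described above.
sudo apt-get install libcurl4=7.74.0-1.3+deb11u13
sudo apt-get install curl

# Optional: hold libcurl4 so a later upgrade doesn't pull the backports build
# back in and recreate the version conflict.
sudo apt-mark hold libcurl4
```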
[21:36:14] * lucaswerkmeister accidentally closes the phabricator new task tab but is saved by firefox remembering form contents on ctrl+shift+t \o/
[21:36:57] yeah, that would explain why https://commons.wikimedia.org/wiki/Category:Johann_Baptist_Hops isn't appearing in https://commons.wikimedia.org/wiki/Category:Hops_(surname)
[21:39:26] job queue monitoring is at (linked from )
[21:41:54] filed https://phabricator.wikimedia.org/T380544
[21:44:29] bd808: I expect there's no total number of jobs to be seen anywhere? (I'm thinking also of T221224)
[21:44:29] T221224: showJobs.php maintenance script useless and misleading in production - https://phabricator.wikimedia.org/T221224
[21:44:55] I think I remember reading that some job (might not have been refreshLinks – deletion/undeletion maybe?) enqueues its own successor jobs to continue work
[21:45:09] so for all I know, this backlog of jobs might even be mostly "virtual" and not fully "materialized" yet
[21:45:19] (which would mean you couldn't meaningfully count it)
[21:46:24] yeah. there is some fancy magic in refreshLinks jobs where there is a "partitioner" job and a "partitioned" job, and I'm pretty sure there is also some deduping that happens at runtime
[21:47:42] makes sense
[21:48:01] we can't have the edit read millions of affected page rows and enqueue millions of jobs in one request
[21:48:06] I think it is probably more interesting to look at job backlog times and their trends. that's the time from job submission to the job starting to run
[21:49:30] that makes sense
[21:49:46] https://grafana.wikimedia.org/d/LSeAShkGz/jobqueue?orgId=1&var-dc=codfw%20prometheus%2Fk8s&viewPanel=65 has 5.41 days for refreshLinks-partitioner
[21:49:54] (that's the p99 panel… and across all wikis, I assume)
[21:50:35] also really smart of me to file that task on a Thursday evening, right
[21:50:38] but better than Sunday I guess
[21:50:43] The trend line on that is pretty flat for the last 30 days
[21:51:14] hm ok
[21:51:18] oh.. heh, log scale
[21:51:27] * bd808 looks more closely
[21:54:29] linear scale shows it to be quite variable over the last 30 days, ranging from about 1.72 to 6.97 days.
[21:55:02] log scale graphs can be deceiving :)
[21:56:36] In theory we are more able to scale hardware for things like this now too, because of the flexibility of Kubernetes
[22:00:54] hm, that still sounds like the issue isn't necessarily visible in the graph
[22:01:19] p99 is not always the best place to look for problems
[22:05:36] p50 was having a very bad time around the start of the month, but seems to be "normal" now (<3 minutes)
[22:06:10] I'm trying to look at some of the other graphs (normal job backlog time, p50) but Grafana is not being happy about it
[22:06:15] there were some spikes of a 60-minute p50 on 2024-11-02 and 03
[22:07:22] I am going to wander back to other things safe in the knowledge that nerds who actually know how to read these graphs will likely be looking in the next 24h :)
[22:07:59] that is a very good idea :)
[22:08:32] !bash I am going to wander back to other things safe in the knowledge that nerds who actually know how to read these graphs will likely be looking in the next 24h :)
[22:08:32] lucaswerkmeister: Stored quip at https://bash.toolforge.org/quip/8B3FUJMBFk7ipym_D37O
[23:36:48] ah, the good graph reader has arrived
[23:37:07] indeed 😌
[23:37:26] the person who knows what the magic number 3 means
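On the earlier question of whether a total job count is visible anywhere: the MediaWiki siteinfo statistics do report a "jobs" figure, but per T221224 (linked above) that number is misleading on production WMF wikis, so the Grafana backlog-time panels discussed above remain the better signal. A minimal sketch of how to peek at it anyway, assuming curl and jq are available:

```
# Ask Commons for its reported job count; this figure is approximate and,
# per T221224, not a reliable measure of the real backlog in production.
curl -s 'https://commons.wikimedia.org/w/api.php?action=query&meta=siteinfo&siprop=statistics&format=json' \
  | jq '.query.statistics.jobs'
```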