|
2024-11-21 10:16:04
|
<MolecularPilot>
|
Hi! I'm running `vagrant up` for the Mediawiki Vagrant but the curl install in the VM is failing (curl : Depends: libcurl4 (= 7.74.0-1.3+deb11u13) but 7.88.1-10+deb12u6~bpo11+1 is to be installed). I was wondering how to fix this?? Thanks so much!!!
|
|
2024-11-21 10:23:58
|
<MolecularPilot>
|
Okay so I think it's fixed?? I just told it to downgrade libcurl (sudo apt-get install libcurl4=7.74.0-1.3+deb11u13) and it then let me install curl? Just putting this here in case anyone has the same problem and needs to know what to do. Thanks!
|
|
2024-11-21 18:40:15
|
<JJMC89>
|
Seddon: we've got account vanish locks happening on the incorrect wiki: T380527
|
|
2024-11-21 18:40:16
|
<stashbot>
|
T380527: AccountVanishRequest locks should always be done on Meta-Wiki - https://phabricator.wikimedia.org/T380527
|
|
2024-11-21 18:50:13
|
<Reedy>
|
Why are you poking him?
|
|
2024-11-21 18:55:56
|
<JJMC89>
|
his team did the work for vanishing
|
|
2024-11-21 21:27:35
|
<lucaswerkmeister>
|
naive question (related to my previous question in here on 2024-11-11 if anyone wants to look at the logs)
|
|
2024-11-21 21:27:44
|
<lucaswerkmeister>
|
is it possible to temporarily increase the job run rate of certain jobs on a wiki?
|
|
2024-11-21 21:28:08
|
<lucaswerkmeister>
|
Commons currently has a relatively large backlog of (I believe) refreshLinks jobs, as a result of edits to several of the CC-* license templates (which are used on millions of files)
|
|
2024-11-21 21:28:29
|
<lucaswerkmeister>
|
(and even more jobs will come in once some additional edit requests are fulfilled by admins – I was only able to fulfill the template-protected ones)
|
|
2024-11-21 21:29:38
|
<Reedy>
|
I'm guessing there might be something in the job queue/runner config that SRE can tweak...
|
|
2024-11-21 21:29:39
|
<lucaswerkmeister>
|
Nikki estimated that, looking at how the number of results at https://commons.wikimedia.org/wiki/Special:Search/hastemplate:%22SDC_statement_has_value%22 is currently going down, it would take something like ten years at the current rate :/
|
|
2024-11-21 21:29:51
|
<lucaswerkmeister>
|
and I’m thinking of changes I’ve seen like https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/b3f06857b2a719995dc8ac51677bc307acc08669, but I’m not sure if that’s the same system
|
|
2024-11-21 21:30:29
|
<lucaswerkmeister>
|
(obligatory nitpick that any Special:Search-based count will additionally need some other job to run as well, cirrusSearchRefreshLinks or whatever it’s called)
|
|
2024-11-21 21:30:45
|
<lucaswerkmeister>
|
(but refreshLinks is the one that would drop millions of rows from the templatelinks table and make the DBAs happier ^^)
|
|
2024-11-21 21:32:33
|
<AntiComposite>
|
ah, that's why some category changes are being slow
|
|
2024-11-21 21:32:46
|
<bd808>
|
changeprop-jobqueue should be the thing, I'm pretty sure, lucaswerkmeister.
|
|
2024-11-21 21:32:48
|
<Reedy>
|
I guess file something under WMF-JobQueue, marked maybe high (not ubn), and see if you can get some SRE help
|
|
2024-11-21 21:32:50
|
<bd808>
|
I think that folks like cdanis might be able to explain more/help reason about tuning
|
|
2024-11-21 21:33:29
|
<lucaswerkmeister>
|
alright, I’ll make a task and CC them
|
|
2024-11-21 21:33:39
|
<lucaswerkmeister>
|
and AntiComposite I wasn’t aware that this had this impact :/ probably worth noting on the task as well then
|
|
2024-11-21 21:33:43
|
<bd808>
|
I picked cdanis to ping because they helped me with some job queue info not too long ago.
|
|
2024-11-21 21:34:38
|
<cdanis>
|
I'm out of work time today unfortunately but feel free to cc me and I'll try to get serviceops attn tomorrow. (also please tag the task serviceops)
|
|
2024-11-21 21:35:08
|
<lucaswerkmeister>
|
will do, thanks!
|
|
2024-11-21 21:36:14
|
<lucaswerkmeister>
|
accidentally closes the phabricator new task tab but is saved by firefox remembering form contents on ctrl+shift+t \o/
|
|
2024-11-21 21:36:57
|
<AntiComposite>
|
yeah that would explain why https://commons.wikimedia.org/wiki/Category:Johann_Baptist_Hops isn't appearing on https://commons.wikimedia.org/wiki/Category:Hops_(surname)
|
|
2024-11-21 21:39:26
|
<bd808>
|
job queue monitoring is at <https://grafana.wikimedia.org/d/LSeAShkGz/jobqueue?orgId=1&var-dc=codfw%20prometheus%2Fk8s> (linked from <https://wikitech.wikimedia.org/wiki/MediaWiki_JobQueue>)
|
|
2024-11-21 21:41:54
|
<lucaswerkmeister>
|
filed https://phabricator.wikimedia.org/T380544
|
|
2024-11-21 21:44:29
|
<lucaswerkmeister>
|
bd808: I expect there’s no total number of jobs to be seen anywhere? (I’m thinking also of T221224)
|
|
2024-11-21 21:44:29
|
<stashbot>
|
T221224: showJobs.php maintenance script useless and misleading in production - https://phabricator.wikimedia.org/T221224
|
|
2024-11-21 21:44:55
|
<lucaswerkmeister>
|
I think I remember reading that some job (might not have been refreshLinks – deletion/undeletion maybe?) enqueues its own successor jobs to continue work
|
|
2024-11-21 21:45:09
|
<lucaswerkmeister>
|
so for all I know, this backlog of jobs might even be mostly “virtual” and not fully “materialized” yet
|
|
2024-11-21 21:45:19
|
<lucaswerkmeister>
|
(which would mean you couldn’t meaningfully count it)
|
|
2024-11-21 21:46:24
|
<bd808>
|
yeah. there is some fancy magic in refresh links jobs where there is a "partitioner" job and a "partitioned" job and I'm pretty sure there is also some deduping that happens at runtime
|
|
2024-11-21 21:47:42
|
<lucaswerkmeister>
|
makes sense
|
|
2024-11-21 21:48:01
|
<lucaswerkmeister>
|
we can’t have the edit read millions of affected page rows and enqueue millions of jobs in one request
|
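Note: the "partitioner" idea discussed above can be sketched roughly as follows. This is a toy model, not MediaWiki's actual refreshLinks code: the job shapes, `BATCH_SIZE`, `run_partitioner` and `drain` are all hypothetical names, and the runtime deduplication bd808 mentions is left out. It only illustrates why counting queued jobs can badly undercount the remaining work.

```python
from collections import deque

BATCH_SIZE = 3  # hypothetical value, chosen only to keep the example small


def run_partitioner(job, queue):
    """Enqueue leaf jobs for one batch of pages and, if any pages remain,
    a single successor partitioner job covering the rest."""
    pages = job["pages"]
    batch, rest = pages[:BATCH_SIZE], pages[BATCH_SIZE:]
    for page in batch:
        queue.append({"type": "leaf", "page": page})
    if rest:
        queue.append({"type": "partitioner", "pages": rest})


def drain(queue):
    """Run jobs until the queue is empty, recording the queue length
    observed just before each job is taken."""
    refreshed, observed_lengths = [], []
    while queue:
        observed_lengths.append(len(queue))
        job = queue.popleft()
        if job["type"] == "partitioner":
            run_partitioner(job, queue)
        else:
            refreshed.append(job["page"])  # stand-in for the real per-page work
    return refreshed, observed_lengths


# One edit affecting 10 pages starts out as a single queued job.
queue = deque([{"type": "partitioner", "pages": list(range(10))}])
refreshed, observed_lengths = drain(queue)
print(len(refreshed), max(observed_lengths))
```

In this toy run all 10 pages get processed, yet the queue never holds more than a handful of jobs at once, which is the sense in which the backlog is "virtual".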
|
2024-11-21 21:48:06
|
<bd808>
|
I think it is probably more interesting to look at job backlog times and their trends. that's the time from a job submission to the job starting to run
|
|
2024-11-21 21:49:30
|
<lucaswerkmeister>
|
that makes sense
|
|
2024-11-21 21:49:46
|
<lucaswerkmeister>
|
https://grafana.wikimedia.org/d/LSeAShkGz/jobqueue?orgId=1&var-dc=codfw%20prometheus%2Fk8s&viewPanel=65 has 5.41 days for refreshLinks-partitioner
|
|
2024-11-21 21:49:54
|
<lucaswerkmeister>
|
(that’s the p99 panel… and across all wikis, I assume)
|
|
2024-11-21 21:50:35
|
<lucaswerkmeister>
|
also really smart of me to file that task on a thursday evening right
|
|
2024-11-21 21:50:38
|
<lucaswerkmeister>
|
but better than sunday I guess
|
|
2024-11-21 21:50:43
|
<bd808>
|
The trend line on that is pretty flat for the last 30 days
|
|
2024-11-21 21:51:14
|
<lucaswerkmeister>
|
hm ok
|
|
2024-11-21 21:51:18
|
<bd808>
|
oh.. heh log scale
|
|
2024-11-21 21:51:27
|
<bd808>
|
looks more closely
|
|
2024-11-21 21:54:29
|
<bd808>
|
linear scale shows it to be quite variable over the last 30 days, like 1.72 days - 6.97 days.
|
|
2024-11-21 21:55:02
|
<bd808>
|
log scale graphs can be deceiving :)
|
|
2024-11-21 21:56:36
|
<bd808>
|
In theory we are more able to scale hardware for things like this now too because of the flexibility of Kubernetes
|
|
2024-11-21 22:00:54
|
<lucaswerkmeister>
|
hm, that still sounds like the issue isn’t necessarily visible in the graph
|
|
2024-11-21 22:01:19
|
<bd808>
|
p99 is not always the best place to look for problems
|
|
2024-11-21 22:05:36
|
<bd808>
|
p50 was having a very bad time around the start of the month, but seems to be "normal" now (<3 minutes)
|
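Note: a small sketch of why p99 and p50 can tell different stories, as discussed above. The numbers are synthetic (not taken from the Grafana dashboard), and `percentile` is a hypothetical helper using the simple nearest-rank definition, which may differ from how the dashboard computes its quantiles.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


FIVE_DAYS = 5 * 24 * 60 * 60  # seconds

# "Normal" period: most jobs start after a minute, a few wait for days.
normal = [60] * 98 + [FIVE_DAYS] * 2
# "Bad" period: the typical job now waits an hour, the slow tail is unchanged.
bad = [3600] * 98 + [FIVE_DAYS] * 2

for name, samples in (("normal", normal), ("bad", bad)):
    print(name, "p50:", percentile(samples, 50), "p99:", percentile(samples, 99))
```

In this made-up data the p99 is identical in both periods because it only reflects the slowest few jobs, while the p50 moves from one minute to one hour. So a flat p99 line does not rule out a backlog problem for typical jobs.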
|
2024-11-21 22:06:10
|
<lucaswerkmeister>
|
I’m trying to look at some of the other graphs (normal job backlog time, p50) but grafana is not being happy about it
|
|
2024-11-21 22:06:15
|
<bd808>
|
there were some spikes on 2024-11-02 and 03 of 60 minute p50
|
|
2024-11-21 22:07:22
|
<bd808>
|
I am going to wander back to other things safe in the knowledge that nerds who actually know how to read these graphs will likely be looking in the next 24h :)
|
|
2024-11-21 22:07:59
|
<lucaswerkmeister>
|
that is a very good idea :)
|
|
2024-11-21 22:08:32
|
<lucaswerkmeister>
|
!bash <bd808> I am going to wander back to other things safe in the knowledge that nerds who actually know how to read these graphs will likely be looking in the next 24h :)
|
|
2024-11-21 22:08:32
|
<stashbot>
|
lucaswerkmeister: Stored quip at https://bash.toolforge.org/quip/8B3FUJMBFk7ipym_D37O
|
|
2024-11-21 23:36:48
|
<AntiComposite>
|
ah, the good graph reader has arrived
|
|
2024-11-21 23:37:07
|
<lucaswerkmeister>
|
indeed 😌
|
|
2024-11-21 23:37:26
|
<lucaswerkmeister>
|
the person who knows what the magic number 3 means
|