[11:10:04] <Rook>	 !log paws updating ingress from v1beta1 #134 T294342 5107562ccaf1160ba09a62ac7e2de9cd5e90f79f
[11:10:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[11:10:09] <stashbot>	 T294342: PAWS chart should use Ingress v1 API - https://phabricator.wikimedia.org/T294342
[12:55:48] <Rook>	 !log paws bump jupyterhub version #149 T308568 41f03a544041318f1fad479b32ae46ac9e816a55
[12:55:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[12:55:51] <stashbot>	 T308568: PAWS share link not sharing - https://phabricator.wikimedia.org/T308568
[15:42:09] <andrewbogott>	 !log admin updated the 'debian-11.0-bullseye' glance image with a fresh build
[15:42:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:22:59] <Joutbis>	 Hi, is there some problem with jobs scheduled through toolforge-jobs?
[16:23:47] <Joutbis>	 I have a job that should have started 80 minutes ago. It's not critical, but I was wondering whether I should start it manually or whether everything will be rescheduled eventually
[16:25:03] <Joutbis>	 <code>tools.jorobot@tools-sgebastion-10:~$ toolforge-jobs list
[16:25:03] <Joutbis>	 Job name:    Job type:               Status:
[16:25:04] <Joutbis>	 -----------  ----------------------  ----------------------------------------
[16:25:05] <Joutbis>	 fer-efem     schedule: 0 15 * * 3    Last schedule time: 2022-05-11T15:00:00Z
[16:25:05] <Joutbis>	 </code>
[16:26:45] <dcaro>	 Joutbis: let me have a look, we had some issues during the upgrade last week but they were fixed already
[16:35:04] <dcaro>	 Joutbis: I agree with you, it should have triggered, looking
[16:41:51] <dcaro>	 Joutbis: you can trigger it manually for now, I'm still debugging
[16:43:20] <taavi>	 dcaro: as I said in T308300 I believe the scheduler might be getting overloaded at the top of the hour
[16:43:21] <stashbot>	 T308300: toolforge-jobs: scheduled jobs stopped being scheduled - https://phabricator.wikimedia.org/T308300
[16:43:40] <dcaro>	 taavi: thanks! did not see that task
[16:44:31] <taavi>	 toolforge-jobs is setting a higher startingDeadlineSeconds (30) for new jobs than what you backfilled last week (15), so we might want to backfill the higher value to all existing jobs
[16:44:56] <dcaro>	 ack
[16:45:40] <dcaro>	 question though, shouldn't it try to trigger the jobs anyhow even if they are late? (I'm a bit fuzzy on that behavior)
[16:45:57] <taavi>	 that's exactly what startingDeadlineSeconds controls
[16:46:12] <taavi>	 see also T308381, which would give us much better o11y to k8s internals
[16:46:13] <stashbot>	 T308381: toolforge: Scrape Kubernetes controller-manager and apiserver metrics into Prometheus - https://phabricator.wikimedia.org/T308381
[16:46:16] <dcaro>	 hmm, okok
[16:46:26] <dcaro>	 +100 on that
[16:57:09] <dcaro>	 taavi: all patched to 30
[16:57:24] <taavi>	 cool
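
Note on the startingDeadlineSeconds exchange above: the field sets how late a scheduled run may still be started. A run that misses its slot by more than that many seconds is skipped, not started late. The sketch below is a simplified model of that rule, not the Kubernetes controller's code; the function name `should_start` and the 20-second delay are hypothetical, chosen only to contrast the two values mentioned in the discussion (15 and 30).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_start(
    scheduled: datetime,
    now: datetime,
    starting_deadline_seconds: Optional[int],
) -> bool:
    """Simplified model: may a run scheduled at `scheduled` still start at `now`?"""
    if now < scheduled:
        # The scheduled time has not arrived yet.
        return False
    if starting_deadline_seconds is None:
        # No deadline configured: lateness alone does not skip the run.
        return True
    # With a deadline, the run is only started while it is at most
    # `starting_deadline_seconds` late; otherwise it counts as missed.
    return now - scheduled <= timedelta(seconds=starting_deadline_seconds)


# Hypothetical case: a busy scheduler reaches the job 20 seconds late.
scheduled = datetime(2022, 1, 5, 15, 0, 0, tzinfo=timezone.utc)
late = scheduled + timedelta(seconds=20)

skipped_with_15 = not should_start(scheduled, late, 15)
started_with_30 = should_start(scheduled, late, 30)
```

Under this model, a job that the scheduler reaches 20 seconds late is skipped with a deadline of 15 but still started with a deadline of 30, which is why backfilling the higher value to existing jobs could help if the scheduler is slow at the top of the hour.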
[17:39:48] <AntiComposite>	 !log tools.stewardbots restart StewardBot, RC connection 429'd
[17:39:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[23:48:18] <Hydriz>	 Were there any updates to Trove recently? I have an instance (dumps-db1) that was active until about 8 hours ago, and I am unable to restart the instance through horizon