[11:10:04] !log paws updating ingress from v1beta1 #134 T294342 5107562ccaf1160ba09a62ac7e2de9cd5e90f79f
[11:10:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[11:10:09] T294342: PAWS chart should use Ingress v1 API - https://phabricator.wikimedia.org/T294342
[12:55:48] !log paws bump jupyterhub version #149 T308568 41f03a544041318f1fad479b32ae46ac9e816a55
[12:55:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[12:55:51] T308568: PAWS share link not sharing - https://phabricator.wikimedia.org/T308568
[15:42:09] !log admin updated the 'debian-11.0-bullseye' glance image with a fresh build
[15:42:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:22:59] Hi, is there some problem with jobs scheduled through toolforge-jobs?
[16:23:47] I have a job that should have started 80 minutes ago. It's not critical, but I was wondering whether I should start it manually, or everything will be rescheduled eventually
[16:25:03] tools.jorobot@tools-sgebastion-10:~$ toolforge-jobs list
[16:25:03] Job name:    Job type:               Status:
[16:25:04] -----------  ----------------------  ----------------------------------------
[16:25:05] fer-efem     schedule: 0 15 * * 3    Last schedule time: 2022-05-11T15:00:00Z
[16:25:05]
[16:26:45] Joutbis: let me have a look, we had some issues during the upgrade last week but they were fixed already
[16:35:04] Joutbis: I agree with you, it should have triggered, looking
[16:41:51] Joutbis: you can trigger it manually for now, I'm still debugging
[16:43:20] dcaro: as I said in T308300 I believe the scheduler might be getting overloaded at the top of the hour
[16:43:21] T308300: toolforge-jobs: scheduled jobs stopped being scheduled - https://phabricator.wikimedia.org/T308300
[16:43:40] taavi: thanks! did not see that task
[16:44:31] toolforge-jobs is setting a higher startingDeadlineSeconds (30) for new jobs than what you backfilled last week (15), so we might want to backfill the higher value to all existing jobs
[16:44:56] ack
[16:45:40] question though, shouldn't it try to trigger the jobs anyhow even if they are late? (I'm a bit fuzzy on that behavior)
[16:45:57] that's exactly what startingDeadlineSeconds controls
[16:46:12] see also T308381, which would give us much better o11y to k8s internals
[16:46:13] T308381: toolforge: Scrape Kubernetes controller-manager and apiserver metrics into Prometheus - https://phabricator.wikimedia.org/T308381
[16:46:16] hmm, okok
[16:46:26] +100 on that
[16:57:09] taavi: all patche to 30
[16:57:13] *patched
[16:57:24] cool
[17:39:48] !log tools.stewardbots restart StewardBot, RC connection 429'd
[17:39:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[23:48:18] Were there any updates to Trove recently? I have an instance (dumps-db1) that was active until about 8 hours ago, and I am unable to restart the instance through horizon
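
Note on the 16:44-16:46 exchange about startingDeadlineSeconds: a Kubernetes CronJob run that cannot be started within that many seconds of its scheduled time is skipped rather than started late. The sketch below only illustrates that rule and is not the controller's actual code. The function name `can_still_start`, the example date and the 20-second delay are all assumptions made for illustration; only the values 15 and 30 and the 15:00 schedule come from the log.

```python
from datetime import datetime, timedelta, timezone


def can_still_start(scheduled: datetime, now: datetime,
                    starting_deadline_seconds: int) -> bool:
    """Return True if a run scheduled at `scheduled` may still be started at `now`.

    Hypothetical helper: a run that is later than the deadline is treated
    as missed and is not started.
    """
    return now - scheduled <= timedelta(seconds=starting_deadline_seconds)


# Illustrative values: a 15:00 UTC run that the scheduler only reaches
# 20 seconds late (the date and the delay are assumed, not taken from the log).
scheduled = datetime(2022, 5, 18, 15, 0, 0, tzinfo=timezone.utc)
late_by_20s = scheduled + timedelta(seconds=20)

print(can_still_start(scheduled, late_by_20s, 15))  # deadline from the earlier backfill
print(can_still_start(scheduled, late_by_20s, 30))  # deadline used for new jobs
```

Under these assumed numbers the run is skipped with a 15-second deadline and still started with a 30-second one, which is consistent with backfilling the higher value onto existing jobs if the scheduler is slow at the top of the hour.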