[10:07:22] Would it be possible to redirect toolsadmin.toolforge.org to toolsadmin.wikimedia.org?
[10:10:27] jhosby: I think it would be, yes. On a quick thought it might even be easy (though I'm suspicious, as it's not done yet)
[11:21:17] It's not something I use very often, but when I do, my workflow is:
[11:21:18] * Visit admin.toolforge.org
[11:21:19] * Notice that that's not the page I want
[11:21:21] * Try toolsadmin.toolforge.org, tooladmin.toolforge.org, tools-admin.toolforge.org, toolsadmin.toolforge.org
[11:21:22] * Give up trying to remember the right URL
[11:21:24] * Remember that there are links to the thing I want at admin.toolforge.org
[11:21:26] * Click them, realize it's at wikimedia.org and not toolforge.org
[11:21:27] Wait a few months until next time, then rinse and repeat
[12:02:25] https://wikitech.wikimedia.org/wiki/User:BryanDavis/Kubernetes#Make_a_tool_redirect_to_another_tool_WITHOUT_running_a_webservice is the typical way to do that (https://phabricator.wikimedia.org/T344630 for adding it to webservice)
[14:47:37] Hi folks, I have a Toolforge monthly cron job that's not starting as expected. Question: is it correct to expect that the `@monthly` macro will let the job "run once a month at midnight of the first day of the month"? More context to follow
[14:49:09] This is the full command I've used to create the job: `toolforge jobs run datasets --command "$HOME/pwbvenv/bin/python3 $HOME/gogologo/generate_monthly_reports.py" --image python3.11 --mem 1Gi --cpu 2 --schedule "@monthly"`
[14:54:53] `toolforge jobs show datasets` seems fine, i.e., `Job type: | schedule: @monthly` is shown. `Status: | Last schedule time: 2024-12-15T16:44:00Z` correctly tells me I've set up the job last December. I was expecting it to run on January 1st, but as of today it hasn't.
`toolforge jobs restart datasets` effectively runs it, but I'm wondering why the cron hasn't triggered
[14:56:00] I'm following the docs at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Creating_scheduled_jobs_(cron_jobs). Any help appreciated!
[14:57:46] FYI the tool is https://toolsadmin.wikimedia.org/tools/id/gogologo
[15:04:01] I've also tried to launch a test hourly job, and it seems it's starting 10 minutes after the beginning of the hour, which makes me wonder about the actual start of the monthly one. Perhaps it will get delayed due to some queue or similar?
[15:09:48] the docs say @monthly should mean midnight on the first day https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
[15:09:54] but I could imagine that we’ve intentionally changed that somehow
[15:10:05] to avoid a bunch of jobs all starting at the same time and suddenly using up a ton of resources
[15:11:54] hm, or alternatively https://phabricator.wikimedia.org/T325027#8462661 / https://phabricator.wikimedia.org/T308300 sounds like it might have been scheduled for midnight, but if enough other things were also scheduled then, it might have been dropped?
[15:12:12] though that was over a year ago and reportedly shouldn’t be a problem anymore
[15:12:34] ah, but https://phabricator.wikimedia.org/T338006 is still open
[15:12:53] “the best fix at the moment is to not schedule jobs at exactly midnight or top of the hour when the scheduler is less busy”
[15:17:45] interesting
[15:18:42] on the other hand
[15:18:48] “Please use the @hourly, @daily, @weekly, @monthly, @yearly macros if possible. Those make it possible to spread the cluster load evenly through the time period which makes maintaining the cluster much easier.”
[15:18:57] that sounds like the opposite of the advice in that task? o_O
[15:19:20] so do these macros currently mean exactly midnight or is the load-spreading already supposed to happen?
[15:19:22] /me confused
[15:19:32] yeah I was following that advice
[15:21:43] no, it’s implemented and working as expected I think https://phabricator.wikimedia.org/T331684
[15:21:55] `kubectl get cronjobs` shows bioparco scheduled at `10 * * * *`
[15:22:01] sorry, wrong job
[15:22:13] datasets scheduled at `44 16 15 * *`
[15:22:14] which means…
[15:22:33] 15th day of the month
[15:22:41] so it is getting skewed, and it just hasn’t run yet
[15:22:57] it’ll run next Wednesday in the UTC afternoon
[15:24:04] so I think the “last schedule time” in `toolforge-jobs list` also reflects when it would’ve last been scheduled according to its schedule + random skew
[15:24:08] and not when you set it up
[15:24:17] it was just close enough to look like it ^^
[15:24:52] toolforge-jobs will internally randomize any @at-expressions
[15:27:17] tried to clarify the docs at https://wikitech.wikimedia.org/w/index.php?title=Help:Toolforge/Jobs_framework&diff=prev&oldid=2257314, feel free to edit further
[15:27:31] lucaswerkmeister: thanks!
[15:28:05] got it, thank you for shedding light! Just to confirm, is the macro randomized once?
[15:28:29] * dcaro in many meetings
[15:28:47] it's deterministic for a given tool and job name
[15:55:20] I haven't used @monthly but I have 3 @daily jobs. schedule: @daily, Last schedule time: 2025-01-07T01:06:00Z; schedule: @daily, Last schedule time: 2025-01-07T06:18:00Z; schedule: @daily, Last schedule time: 2025-01-07T08:18:00Z. The start time of the first run was randomly chosen by the system. Subsequent runs are all at the exact same time, 24 hours later. Use "toolforge jobs list" to check the status of your jobs, that shows the time of the last scheduled run.
[16:08:16] For @monthly there really ought to be a way to tell the system the day of the month to start and just let it choose a random time to start sometime that day of the month. You shouldn't have to wait for up to 30 days to pass before the first monthly run starts.
Unless that's how long you really want to wait for the first run.
[18:50:34] !log deployment-prep taavi@deployment-puppetserver-1:~$ sudo puppet node clean geoshapes.maps-experiments.eqiad1.wikimedia.cloud # T383153
[18:50:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[18:50:39] T383153: prometheus-openstack-stale-puppet-certs crashing on deployment-puppetserver-1.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T383153
[18:53:45] !log deployment-prep taavi@deployment-puppetserver-1:~$ sudo puppetserver ca clean --certname maps-master01.maps-experiments.eqiad1.wikimedia.cloud # T383153
[18:53:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[19:00:49] !log deployment-prep `/usr/local/sbin/clean-stale-puppet-certs --clean` (T383153)
[19:00:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[19:00:54] T383153: prometheus-openstack-stale-puppet-certs crashing on deployment-puppetserver-1.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T383153
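
Note on the `@monthly` thread earlier in this log: the following is a minimal Python sketch of the behaviour described there, not the toolforge-jobs implementation. `STANDARD_MACROS` lists the conventional cron meaning of each macro (what the Kubernetes docs describe). `skewed_schedule` is a hypothetical helper: the SHA-256 hashing is an assumption chosen only to illustrate "deterministic for a given tool and job name"; the log does not say how the real offsets are derived. `next_monthly_run` is likewise a hypothetical helper that shows when a schedule such as `44 16 15 * *` next fires.

```python
import hashlib
from datetime import datetime, timezone

# Conventional cron meaning of each macro (no load spreading).
STANDARD_MACROS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def skewed_schedule(macro: str, tool: str, job: str) -> str:
    """Hypothetical: map a macro to a stable, pseudo-random cron expression.

    The same (tool, job) pair always yields the same expression.
    This is an illustration, not the real toolforge-jobs algorithm.
    """
    digest = hashlib.sha256(f"{tool}/{job}".encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    minute = seed % 60
    hour = (seed // 60) % 24
    day = (seed // (60 * 24)) % 28 + 1  # 1..28 exists in every month
    weekday = (seed // (60 * 24 * 28)) % 7
    month = (seed // (60 * 24 * 28 * 7)) % 12 + 1
    if macro == "@hourly":
        return f"{minute} * * * *"
    if macro == "@daily":
        return f"{minute} {hour} * * *"
    if macro == "@weekly":
        return f"{minute} {hour} * * {weekday}"
    if macro == "@monthly":
        return f"{minute} {hour} {day} * *"
    if macro == "@yearly":
        return f"{minute} {hour} {day} {month} *"
    raise ValueError(f"unknown macro: {macro}")


def next_monthly_run(expr: str, after: datetime) -> datetime:
    """Next firing time strictly after `after` for an 'M H D * *' expression.

    Only handles plain numeric minute, hour and day fields with day 1..28.
    """
    minute, hour, day, month, weekday = expr.split()
    if month != "*" or weekday != "*":
        raise ValueError("only 'M H D * *' expressions are handled here")
    candidate = after.replace(
        day=int(day), hour=int(hour), minute=int(minute), second=0, microsecond=0
    )
    if candidate <= after:
        # Already passed this month: roll over to the same day next month.
        year = candidate.year + candidate.month // 12
        candidate = candidate.replace(year=year, month=candidate.month % 12 + 1)
    return candidate


if __name__ == "__main__":
    now = datetime(2025, 1, 7, 15, 22, tzinfo=timezone.utc)
    print(next_monthly_run("44 16 15 * *", now))  # 2025-01-15 16:44:00+00:00
    print(skewed_schedule("@monthly", "gogologo", "datasets"))
```

Read this way, a job created on 2024-12-15 with a skewed schedule of `44 16 15 * *` has its next run on 2025-01-15 at 16:44 UTC, a Wednesday, which matches what was said in the channel.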