[08:33:08] !log tools refresh machine-id on tools-k8s-worker-[102-103,105-112].tools.eqiad1.wikimedia.cloud,tools-k8s-worker-nfs-[1-3,5,7-14,16-17,19,21-24,26-27,32-48,50,53-55,57-58,61,65-82].tools.eqiad1.wikimedia.cloud
[08:33:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[08:51:13] !log tools bounce stashbot
[08:51:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:46:33] danilo: earlier today I bounced wmopbot though https://wikitech.wikimedia.org/wiki/Tool:Wmopbot to restart it seems outdated, would you mind updating the page? thank you
[10:15:02] !log tools.os-deprecation refresh to track upcoming bullseye deprecation
[10:15:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.os-deprecation/SAL
[11:18:03] !log tools delete tools-prometheus-6, shutdown for a while
[11:18:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:14:51] https://os-deprecation.toolforge.org/ -- "only" ~270 vms to rebuild across the Cloud VPS projects. :)
[15:21:01] \o/
[15:33:21] !log copypatrol launch copypatrol-backend-dev-02 debian-13.0-trixie
[15:33:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[15:59:02] `Specified command fails to run`
[16:00:13] Since yesterday, my tool got offline. Upon inspecting the logs, it is showing that the executable (that supposed to run my tool) is not found on the built image
[16:00:36] ```
[16:00:37] | Job name:                | Job type:  | Status:                        |
[16:00:38] +--------------------------+------------+--------------------------------+
[16:00:40] | campwiz-backend-readonly | continuous | Specified command fails to run |
[16:00:41] | campwiz-backend-thing    | continuous | Specified command fails to run |
[16:00:43] | campwiz-task-manager     | continuous | Not running                    |
[16:00:44] +--------------------------+------------+--------------------------------+```
[16:02:18] But I did not change any code as far as I am concerned. Does it have to do anything with the recent upgrade?
[16:04:00] It should not, but stuff happens, let me have a quick look
[16:04:28] nokibsarkar: campwiz tool right?
[16:05:05] maybe not xd (no jobs found there)
[16:06:32] yes (re @wmtelegram_bot: nokibsarkar: campwiz tool right?)
[16:06:55] campwiz-backend
[16:07:00] ack, looking
[16:08:25] should it run `campwiz` or `campwiz-backend`?
[16:10:33] campwiz-backend-thing
[16:10:46] the name of the command
[16:11:05] gitlab.wikimedia.org/nokibsarkar/campwiz-backend/-/blob/main/.gitlab-ci.yml?ref_type=heads
[16:11:09] nm. I think it's missing the `launcher` in front of it, might have been a change in the api
[16:11:11] looking
[16:12:49] as a workaround, you can prefix the job `--command` with `launcher `
[16:15:32] yep, found the issue, commit d9ce682db602a4b39576bf48f5eafc8c0e8dadce, looking
[16:15:38] Raymond_Ndibe: ^
[16:19:09] opened T401846
[16:19:11] T401846: [jobs-api] buildservice-based jobs stopped prefixing the command with launcher - https://phabricator.wikimedia.org/T401846
[16:23:31] Rolling out the revert
[16:23:54] !log toolsbeta reverting jobs-api release (T401846)
[16:23:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[16:36:31] !log tools reverting jobs-api release (T401846)
[16:36:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:36:36] T401846: [jobs-api] buildservice-based jobs stopped prefixing the command with launcher - https://phabricator.wikimedia.org/T401846
[16:36:51] nokibsarkar: the fix is out in production, should work now, can you try?
[16:56:25] I am not getting any response although the `toolforge jobs list` says it is running
[16:59:52] When attempting to connect from a new trixie instance to trove database, I get "ERROR 2026 (HY000): TLS/SSL error: SSL is required, but the server does not support it" unless I add `--skip-ssl`. It works fine on an existing bookworm instance. Is this something expected that I need to work with? Would creating a new db instance fix that?
[17:13:32] JJMC89: https://mariadb.com/docs/release-notes/community-server/mariadb-11-4-series/what-is-mariadb-114#ssl-tls suggests that that's an intended behaviour change from upstream mariadb. I don't know how complicated enabling tls on the trove server side is, cc andrewbogott
[17:16:52] nokibsarkar: can you elaborater?
[17:16:56] *elaborate
[17:24:32] thanks, taavi. Would you like a task for enabling on the server?
[17:29:05] nokibsarkar: I see logs from your jobs
[17:37:28] JJMC89: sure.
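Editor's sketch of the workaround suggested at 16:12:49 (prefixing a buildservice job's `--command` with `launcher `). The helper name `with_launcher` is hypothetical and not part of jobs-api or the `toolforge` CLI; it only illustrates adding the prefix idempotently, so that a command which already starts with `launcher` is not prefixed twice once the server-side fix from T401846 is back in place.

```python
def with_launcher(command: str) -> str:
    """Return `command` prefixed with 'launcher ' unless it already is.

    Hypothetical helper: it mirrors the manual workaround from the chat,
    not the actual jobs-api implementation.
    """
    stripped = command.strip()
    if not stripped:
        raise ValueError("command must not be empty")
    # Compare only the first whitespace-separated token, so that a binary
    # named e.g. 'launcherx' is still prefixed.
    first_token = stripped.split(maxsplit=1)[0]
    if first_token == "launcher":
        return stripped
    return f"launcher {stripped}"


if __name__ == "__main__":
    # 'campwiz-backend-thing' is the command name given at 16:10:33.
    print(with_launcher("campwiz-backend-thing"))
```

The resulting string is what would be passed as the job's `--command` value while the regression is in effect.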
[17:46:22] nokibsarkar: if you still have issues, please open a task (or comment in the one I mentioned if it's the same)
[17:49:25] I don't know how complicated it is either, but likely pretty complicated.
[18:53:21] !log damian-scripts@tools-bastion-13 tools.cluebotng-review reviewer deployed @ refs/tags/v0.1.9
[18:53:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL
[20:07:02] !log damian-scripts@tools-bastion-13 tools.cluebotng-review reviewer deployed @ refs/tags/v0.1.10
[20:07:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL
[20:12:05] Hi! I am trying to set up new XTools instances, and I'm having issues with the new VMs being able to talk to my Trove database.
[20:12:15] For starters, I get `ERROR 2026 (HY000): TLS/SSL error: SSL is required, but the server does not support it` which is not good. I don't know what I'm missing, but anyway using `--ssl=0` (for testing purposes) and the same username/password that works on the old VM gives me `ERROR 1045 (28000): Access denied for user 'xtools'@'xtools-dev07…`
[20:12:20] I suspect the Trove db is set up to only accept traffic from specific hosts, IPs, or something. If that is the case, I don't know where I can modify this configuration. I don't see anything to that effect in the Horizon interface. Any ideas?
[20:13:36] Trove db is `xtools-db01`, VM I'm testing on now is `xtools-dev07.xtools.eqiad1.wikimedia.cloud`
[20:13:57] fwiw, JJMC89 also encountered the SSL problem earlier in this channel (with the workaround --skip-ssl), but it sounded like the connection otherwise worked (though I’m not sure)
[20:15:01] okay, there’s also a task at T401861 which describes --disable-ssl as a workaround
[20:15:02] T401861: Enable SSL in Trove MariaDB - Trixie MariaDB client requires SSL but SSL is not enabled in the Trove server - https://phabricator.wikimedia.org/T401861
[20:16:36] ok thanks, so the SSL problem is Debian Trixie. I won't worry about that issue for now. But I still am not able to connect even with `--disable-ssl`, so something else is wrong too :(
[20:17:54] I had the same issue (`ERROR 1045 (28000)`) with an earlier attempt of making a new VM using Debian Bookworm
[20:19:45] I have the same security groups added to the new VMs, so I don't think it's that
[20:21:39] the new VMs do have a different port security groups (that horizon added automatically). Maybe the Trove DB doesn't accept IPv6 traffic …?
[20:23:44] error 1045 means "access denied", missing grants usually
[20:26:18] aha! that led me in the right direction. Indeed I appear to have set up grants for each VM. Thanks!
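Editor's sketch summarizing the two client errors diagnosed above. It parses a MariaDB client error line and returns the hint given in the chat: 2026 was traced to the Trixie client requiring TLS that the Trove server does not offer (T401861), and 1045 to missing grants for the new host. The names `HINTS` and `explain` are hypothetical, and the hint text paraphrases the conversation rather than any official MariaDB error reference.

```python
import re

# Hints paraphrased from the chat above; this is an illustrative table,
# not an official MariaDB error list.
HINTS = {
    2026: "TLS mismatch: the client requires SSL but the server does not offer it (see T401861).",
    1045: "Access denied: check that a grant exists for this user@host pair.",
}

# Matches lines such as "ERROR 1045 (28000): Access denied for user ..."
ERROR_RE = re.compile(r"ERROR (\d+) \((\w+)\): (.*)")


def explain(line: str) -> str:
    """Return the recorded hint for a client error line."""
    match = ERROR_RE.search(line)
    if match is None:
        return "unrecognized"
    code = int(match.group(1))
    return HINTS.get(code, f"no hint recorded for error {code}")


if __name__ == "__main__":
    print(explain("ERROR 1045 (28000): Access denied for user 'xtools'@'xtools-dev07…"))
```

The host part of a 1045 message names the `user@host` pair the server rejected, which is the pair a per-VM grant would need to cover.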
[20:26:41] :)
[23:11:08] !log copypatrol launch copypatrol-backend-prod-02 debian-13.0-trixie
[23:11:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[23:30:01] !log copypatrol copypatrol-backend-prod-01 disable/stop systemd services
[23:30:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[23:32:24] !log copypatrol update copypatrol-api proxy to copypatrol-backend-prod-02
[23:32:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[23:33:54] !log copypatrol copypatrol-backend-prod-02 deploy 6931255
[23:33:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[23:48:42] !log copypatrol delete copypatrol-backend-dev-01
[23:48:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL