[09:26:30] will restart orchestrator service on dborch1001
[09:30:45] ok!
[10:01:10] We (moritzm, btullis and I) checked a bit more into what could be the root cause of the Orchestrator x509 issue → it might come from the way Go handles reading/loading the CA chain in memory. We're not sure; this will require some further code testing/debugging. It may also be a symptom that we should consider a course of action regarding Orchestrator maintenance
[10:01:10] https://github.com/openark/orchestrator/pull/1478/commits/2f032d8af3e65de0c99806cb6e17071ff1f6f2a6
[10:05:23] Yeah, we are aware of that :)
[10:20:30] arnaudb: Why would that change with a puppet migration? If it is memory related, I wonder why that's not solved by a daemon restart and/or host reboot?
[10:21:45] I think it comes from a bug handling a CA bundle instead of a single cert, not directly from puppet itself
[10:23:16] Ah I see what you mean
[12:25:30] @Amir1 clouddb hosts should not have the production root password; if they do, that would be considered a breach
[12:26:11] it doesn't have the password from what I'm seeing
[12:26:23] it might be the password hash
[12:26:28] they have a different one
[12:26:33] ah, ok
[12:26:49] and that could be why db-mysql is failing then
[12:27:00] then, that is a good thing
[12:27:02] :-D
[12:27:11] no, .my.cnf is correct on cumin102
[12:27:11] better than the alternative
[12:27:25] My guess is it is a password in the old format, I will check when I am finished with something I am doing now
[12:28:07] either that or an issue with db-mysql indeed
[12:31:04] I just fixed clouddb1021
[12:31:52] so the new client stopped supporting it, likely?
[12:33:38] "mysql_old_password support was removed from the mysql client libraries in 5.7.5"
[12:34:16] but pymysql in theory still supports it
[12:36:46] All fixed
[12:48:54] marostegui: Thanks, shall we close it?
[12:49:04] not yet, I am checking the other two hosts which are not clouddb
[12:49:09] I will close it when fixed
[12:49:11] ah okay
[12:49:13] thanks
[15:23:50] I can't believe that in 2024 megacli options are still so terrible
[15:57:31] urandom, Emperor: I created a master ticket to kick off discussion on the task ahead of renumbering servers in codfw row A/B
[15:57:36] T346428 Re-IP hosts on codfw row A and B to new per-rack vlans/subnets
[15:57:36] T346428: Netbox: Add support for our complex host network setups in provision script - https://phabricator.wikimedia.org/T346428
[15:58:03] I also created 3 basic sub-tasks for the hosts we discussed here yesterday, where we can discuss the specific approach that might work
[15:58:38] feel free to add any detail/thoughts, and add subscribers who may also have input. thanks!
[16:02:46] topranks: thanks, will try and get some thoughts on that tomorrow
[16:07:31] marostegui: I goofed up a select that'll probably run way too long, what do?
[16:08:25] I killed it client side, can't remember if that kills it server side?
[16:16:20] claime: no, it will keep running on the server, assuming it was already running
[16:16:29] crap
[16:16:53] urandom, Emperor: seems I linked the wrong task, sorry - it's T354869
[16:16:54] T354869: Re-IP hosts on codfw row A and B to new per-rack vlans/subnets - https://phabricator.wikimedia.org/T354869
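For context on the Go CA-chain theory from the 10:01 discussion above: a minimal sketch, assuming the daemon builds its TLS config from a single PEM file, of how a CA bundle is normally loaded in Go. `x509.CertPool.AppendCertsFromPEM` walks every PEM block in the file, so the whole chain ends up in the pool; a loader that parses only the first block would silently drop the rest, which would be consistent with the symptom. The file path is illustrative, and this is not the code from the linked PR.

```go
// Minimal sketch (not Orchestrator's actual code): loading a CA bundle
// containing multiple certificates into a Go TLS configuration.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"os"
)

func main() {
	// Illustrative path, not Orchestrator's real config value.
	pemBytes, err := os.ReadFile("/etc/ssl/certs/ca-bundle.pem")
	if err != nil {
		log.Fatalf("reading CA bundle: %v", err)
	}

	// AppendCertsFromPEM iterates over every PEM block in the input,
	// so all CAs in the bundle are added, not just the first one.
	pool := x509.NewCertPool()
	if ok := pool.AppendCertsFromPEM(pemBytes); !ok {
		log.Fatal("no certificates could be parsed from the bundle")
	}

	cfg := &tls.Config{RootCAs: pool}
	_ = cfg // hand cfg to the MySQL/HTTP client as appropriate
}
```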
[16:17:24] marostegui: query killer server side engages at some point?
[16:19:03] Yep
[16:19:28] Unless it's not coming from the wikiuser
[16:19:36] Which I assume is not the case
[16:23:04] It's coming from the user our sql script on mwmaint uses
[16:29:00] If it's wikiadmin then we need to kill it on the server side
[16:33:09] it is
[16:37:11] I think it's ok, I checked the mysql processlist on the server I was querying and my queries aren't there anymore
[16:38:14] Right :)
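To round off the runaway-query thread above: killing the client does not stop a statement already executing on the server, so it has to be found in the processlist and killed there. A minimal Go sketch, assuming a go-sql-driver/mysql connection with PROCESS plus SUPER (or equivalent) privileges; the DSN, user name and time threshold are illustrative.

```go
// Minimal sketch: find long-running statements from a given account in
// information_schema.PROCESSLIST and kill them server side.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Illustrative DSN; needs privileges to see and kill other sessions.
	db, err := sql.Open("mysql", "admin:secret@tcp(db1001:3306)/")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Statements from one account that have been running too long.
	rows, err := db.Query(
		`SELECT ID, TIME, INFO FROM information_schema.PROCESSLIST
		 WHERE USER = ? AND COMMAND = 'Query' AND TIME > ?`,
		"wikiadmin", 300)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var id, secs int64
		var info sql.NullString // INFO is NULL for idle threads
		if err := rows.Scan(&id, &secs, &info); err != nil {
			log.Fatal(err)
		}
		log.Printf("killing thread %d (%ds): %s", id, secs, info.String)
		// KILL takes a literal thread id, not a bind parameter.
		if _, err := db.Exec(fmt.Sprintf("KILL QUERY %d", id)); err != nil {
			log.Printf("kill %d failed: %v", id, err)
		}
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```

Note that `KILL QUERY <id>` aborts only the running statement and leaves the connection alive, whereas a plain `KILL <id>` drops the whole connection.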