[13:08:36] anyone familiar with wikibugs online? it’s stopped posting to -dev and -operations AFAICT
[13:08:49] but I’m not sure which of its jobs I should restart, none of them seem obviously broken
[13:09:03] (irc, gerrit and znc all have recent messages in `kubectl logs`)
[13:11:41] if nobody’s around I’ll just try restarting some of them soonish and hope it helps
[13:13:46] (I think I would first try restarting znc, there’s some suspicious “cannot send to nick/channel” in its output)
[13:15:51] !log lucaswerkmeister@tools-bastion-13 tools.wikibugs toolforge jobs restart znc
[13:15:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[13:17:19] !log lucaswerkmeister@tools-bastion-13 tools.wikibugs toolforge jobs restart irc
[13:17:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[13:19:25] okay looks like the irc restart fixed wikibugs
[13:19:33] (the znc restart had no effect that I could see)
[13:45:24] Nemo_bis, can you respond to (or act on) https://phabricator.wikimedia.org/T367528#9945179 ?
[17:18:53] @lucaswerkmeister: thanks for tending to wikibugs. When you restarted ZNC the irc bot died and the new k8s spawned replacement was still initializing when you killed it too. Unfortunately I don't have any durable logging for the ZNC right now so I can't see what was up with it before the restart.
[17:20:23] ah, damn :(
[17:28:40] @lucaswerkmeister: no big deal. you were doing the needful I'm sure.
[17:28:48] I tried to ^^
[21:11:40] I would like to run the update facts script on puppetmaster.cloudinfra.wmflabs.org, but I hit a timeout sshing to puppetmaster.cloudinfra.wmflabs.org
[21:11:55] I can't recall if I have been able to access the host before
[21:23:34] jhathaway: I found that the IP that name points to is bound to what is now called cloudinfra-cloudvps-puppetserver-1
[21:24:17] jhathaway: try this name:
[21:24:18] cloudinfra-cloudvps-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud
[21:24:54] had to add it to config so it goes via the bastion but I was able to get on it as root
[21:25:31] I can confirm that the documented name just times out
[21:28:49] ah!, thanks mutante
[21:32:50] now my connection gets refused, who controls access to the cloudinfra project?
[21:34:15] jhathaway: I feel like "syncing puppet facts" did not ever need access to "cloudinfra"
[21:34:37] wasn't this a local puppetmaster in the actual "puppet-diffs" project? I could swear that was where we synced to
[21:34:53] or is this not about the puppet compilers?
[21:35:09] correct, this is about the puppet compilers
[21:35:24] but cloudinfra is a different project
[21:35:38] that's the central puppetmaster
[21:35:52] now I wonder if the local puppetmaster was removed
[21:37:14] good question, at present the hosts in the puppet-diffs projects are pointed to puppetmaster.cloudinfra.wmflabs.org
[21:37:56] whereas the hosts in the devtools project are pointed to puppetmaster-1003.devtools.eqiad1.wikimedia.cloud
[21:38:16] for devtools I can say that it depends on which instance it is
[21:38:28] but puppet-compiler is puppet-diffs
[21:39:18] stat says the puppet.conf file for pcc-db1001 was changed on 2024-04-09
[21:40:30] reads https://wikitech.wikimedia.org/wiki/Server_Admin_Log/Archive_78#2024-04-09
[21:41:39] well, wrong SAL
[21:42:04] the genesis of the problem was pcc failing here, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1054661?tab=checks
[21:42:19] complaining that facts did not exist for those hosts
[21:42:39] nothing in SAL of the project
[21:42:40] though they puppet clean, so puppetmaster.cloudinfra.wmflabs.org has their facts
[21:43:22] most definitely
[21:52:08] cloudinfra is the shared project for all Cloud VPS things. SAL for it is probably under "admin" typically -- https://sal.toolforge.org/admin