[02:54:06] wm-bb appears to have dropped offline
[06:46:46] !log tools taavi@toolserver-proxy-01:~$ sudo systemctl restart apache2.service # see if it helps with toolserver.org ssl alerts
[06:46:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:47:05] !log cloudinfra upgraded cloudmetrics to grafana 7.5 (T292614)
[09:47:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL
[09:47:08] T292614: Upgrade grafana in cloudmetrics - https://phabricator.wikimedia.org/T292614
[10:05:27] !log toolsbeta Adding a new grid webgrid generic node (T292465) - cookbook ran by dcaro@vulcanus
[10:05:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[10:07:58] !log toolsbeta Adding a new grid webgrid generic node (T292465) - cookbook ran by dcaro@vulcanus
[10:08:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[10:08:52] !log toolsbeta Adding a new grid webgrid generic node (T292465) - cookbook ran by dcaro@vulcanus
[10:08:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[10:09:50] hmpf, can't find the old sgeexec node flavor xd
[10:13:23] !log toolsbeta Adding a new grid webgrid generic node (T292465) - cookbook ran by dcaro@vulcanus
[10:13:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[10:36:39] !log toolsbeta Adding a new grid webgrid generic node (T292465) - cookbook ran by dcaro@vulcanus
[10:36:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[11:54:28] hello again cloud people
[11:55:05] I'd like to edit the hiera prefix config of deployment-cache-text but it seems that I'm not allowed: https://horizon.wikimedia.org/project/prefixpuppet/?tab=prefix_puppet__puppet-deployment-cache-text
[11:55:20] can anyone help?
[12:54:27] ema: hey, let me take a look, are you a member of the project?
[12:54:53] dcaro: hi, I am
[12:55:30] just a regular "user" though, nothing fancier
[12:56:35] it might be that only project admins have the right to change the hiera data, let me double check
[12:57:08] thanks
[12:59:07] I ended up making the change by using hieradata/cloud/eqiad1/deployment-prep/common.yaml in ops/puppet for all deployment-prep, but still it would be nice to be able to change things within horizon too
[13:03:47] ema: what project is this?
[13:04:00] dcaro: deployment-prep
[13:04:02] ack
[13:05:31] ema: can you try doing a change now?
[13:09:13] dcaro: I still can't but maybe I have to logout and login again?
[13:09:17] let's try
[13:09:47] maybe
[13:09:50] dcaro: yup, I see edit buttons now!
[13:10:02] \o/, yep, it was the rights in the project
[13:11:07] right, so you need to be projectadmin to edit the Puppet stuff
[13:11:14] good to know and thanks for fixing it!
[14:14:23] Who is responsible for the IRC/Telegram bridge?
[14:19:23] bd808: might know ^
[14:41:17] hare: probably me. I see that it is broken. I’ll restart it as soon as I’m at a laptop.
[14:41:25] thanks!
[14:47:32] !log tools.bridgebot Restarting to reconnect to irc
[14:47:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[14:51:19] ugh... ":copper.libera.chat 465 wm-bb :You are banned from this server- Your client is repeatedly reconnecting. Please email bans@libera.chat when fixed. (2021/10/6 02.24)"
[14:51:58] ruh roh
[14:53:07] bd808: a friend of mine is an oper, if you fix whatever the underlying problem is I can ask him to unban (once he wakes up)
[14:53:15] Are they still called "opers"
[14:53:19] That's what they were called in like 2003
[14:53:46] yeah or ircops, but that sounds too much like "irc cops" in my head
[14:54:39] it's the matterbridge software that flips out. and that's a whole bunch of golang that I'm not too excited to debug :/
[14:57:01] libera usually just says "staff", but they're all volunteers
[15:02:10] bd808: stupid hack: put a behaving bouncer between wm-bb and libera, that way the bridge bot is not the one controlling reconnects
[15:02:34] majavah: that might be the best "fix" honestly
[15:02:44] yup
[15:03:01] works pretty well for the wm-bots
[15:04:02] l.egoktm has some ideas about matrix and "double puppeted" connections between irc<->matrix<->telegram
[15:34:24] yep! just a bit behind in getting around to testing it...
[15:45:29] * bd808 makes some config changes and emails bans@libera.chat begging for another chance
[15:52:46] The bridgebot irc bridge is down -- https://phabricator.wikimedia.org/T292640
[15:54:56] !log tools.bridgebot Shutting down bot until we hear back from libera.chat (T292640)
[15:54:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[15:57:44] where is this channel bridged anyways? https://wikitech.wikimedia.org/wiki/Help:IRC doesn’t say afaict
[15:57:51] and it doesn’t seem to be one of the Telegram channels I’m in
[16:00:12] !log tools.bridgebot Restarting to reconnect to irc after libera.chat lifted the account ban (T292640)
[16:00:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[16:02:20] Lucas_WMDE: it's at https://t.me/wmcloudirc, and I'll update that page.
[16:02:30] cool, thanks!
[16:03:45] 👋 (re @lucaswerkmeister: )
[16:07:34] !lot toolhub Updated demo server to 91e33ad
[16:07:39] !log toolhub Updated demo server to 91e33ad
[16:07:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolhub/SAL
[17:34:12] are there any issues with Horizon? I'm getting a weird error on getting limits `totalVolumesUsed'`
[17:36:26] the limits are public in openstackbrowser, so it's not an immediate issue
[17:38:13] matanya, andrewbogott: I want to spin up a new instance in the video project to host video2commons frontend. I'm a user in the project currently, doesn't seem to give me access to do that. Should I ask for admin access in phab?
[17:38:40] yeah, you need projectadmin to launch instances and most other actions on horizon
[17:39:00] * chicocvenancio nods...
[17:44:55] chicocvenancio as majavah says, you need. But why do you need another instance?
[17:45:52] to host the frontend that is currently broken in toolforge
[17:46:30] matanya: T292355
[17:46:30] T292355: video2commons login is broken by the LE cert expiry (py2.7) - https://phabricator.wikimedia.org/T292355
[17:48:02] Wouldn't it be better to move it to python3 ?
[17:48:45] sure
[17:49:00] who is going to do that?
[17:50:17] * chicocvenancio is running for Wiki Movimento Brasil (WMB) vice-president and WMB is asking WMF for grant money for tech projects that include dedicated developer time for video2commons
[17:50:40] no idea if that grant will go through though.
[17:51:08] python 2 to 3 is usually pretty trivial, often a few minutes of work
[17:51:20] I can (maybe) help with that later in the day
[17:52:02] Or really 2to3 can help as much as I can :)
[18:01:38] https://phabricator.wikimedia.org/T292355#7406476 please mind the tool has been down for almost a week now.
[18:03:24] I don't see why chicocvenancio shouldn't be a project admin of v2c if they're going to fix it...
[18:03:31] related: T286067
[18:03:32] T286067: Maintainers needed for video2commons - https://phabricator.wikimedia.org/T286067
[18:24:22] legoktm: I think matanya can bestow adminship
[19:22:58] thanks for the comment legoktm. Commented on T286067 as well.
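Note on the 17:51 and 17:52 remarks about 2to3: a rough sketch of the kind of mechanical change a Python 2 to 3 port involves. The log does not say which constructs video2commons actually uses, so the function names and data here are hypothetical. The comments show the Python 2 spelling next to the Python 3 form; the bytes/text case is the sort of thing that tends to need a human decision rather than an automatic rewrite.

```python
def summarize(counts):
    """Format a dict of counters, written in Python 3 style."""
    lines = []
    # Python 2: for name, n in counts.iteritems():
    for name, n in sorted(counts.items()):
        # Python 2: print "%s: %d" % (name, n)
        lines.append("{}: {}".format(name, n))
    return lines


def to_text(data, encoding="utf-8"):
    """Return text for either bytes or str input.

    Python 2 mixed str and unicode implicitly; Python 3 requires an
    explicit decode, and choosing the encoding is a manual decision.
    """
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data
```

The first function covers the "few minutes of work" category (renamed methods, print as a function); the second is where a port can take longer than expected.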
[19:22:59] T286067: Maintainers needed for video2commons - https://phabricator.wikimedia.org/T286067
[19:28:23] bd808: fyi - it seems https://orphantalk.toolforge.org/ was unable to make any API requests from PHP (e.g. to ) but it was fine after a service restart. I suspect maybe it had to do with HTTPS. Which might be a more general issue / reason to restart other php7.2 webservices
[19:28:53] the last service start before just now was `2021-07-26 16:55:56: (log.c.217) server started `
[19:33:48] I believe any stretch container that hasn't been restarted since early September will need a restart
[19:34:47] is that something specific to our images, or a general issue with any stretch install?
[19:35:54] The general word around LE expiry is "oh you haven't updated your OS in 10 years" which doesn't seem to match with Debian Stretch which was released in 2017.
[19:36:51] I guess it's some kind of caching problem?
[19:37:15] and it affects even servers that were restarted just before Sep 30?
[19:41:01] Krinkle: generally with stretch, because it needed an openssl (and gnutls) update
[19:44:20] ah, so it's actually using a newer image that we published with that updated package.
[19:51:07] yep
[19:52:07] !log tools.ranker deployed 3540e2a083 (🌈 navbar)
[19:52:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.ranker/SAL
[20:12:52] I just noticed that the function-orchestrator deployment on notwikilambda is having issues
[20:13:00] “Error creating: Internal error occurred: add operation does not apply: doc is missing path: "/spec/volumes/-": missing value”
[20:13:33] could this be related to the new way that `toolforge: tool` pods get their Toolforge directories mounted? cc majavah
[20:13:50] there aren’t any custom mounts/volumes in this deployment as far as I can tell
[20:15:47] (ah, so there was indeed a change in mounting things on pods! thanks, I thought I had become crazy :-D )
[20:16:18] that would’ve been https://phabricator.wikimedia.org/T279106 if I’m not mistaken
[20:18:19] thanks! it all looks beyond my understanding but I am glad I got it to work somehow ^^
[20:18:30] I think I found a workaround too
[20:19:34] lucaswerkmeister: sounds indeed related, I was just about to go to bed so will fix tomorrow
[20:19:43] ok thanks :)
[20:19:48] should I leave a comment in the task or something?
[20:20:02] https://github.com/wikimedia/cloud-toolforge-volume-admission-controller/blob/0a2d76cf09541767df7d675cf7d7e8a88c885e6b/server/admission.go#L148 is likely a good pointer if you want to try to make a patch
[20:20:03] I’ll push my workaround to github once the startup probe succeeds
[20:24:02] pintoch: out of curiosity, what kind of difference did you see?
[20:24:06] !log tools.notwikilambda mount only function-orchestrator source into pod (8731f2ff6e), working around apparent “toolforge: tool” issue
[20:24:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.notwikilambda/SAL
[20:24:20] lucaswerkmeister: please file a subtask, thanks
[20:24:21] here’s my workaround fwiw https://github.com/lucaswerkmeister/notwikilambda-k8s/commit/8731f2ff6e4c2c3637785acb6089ead0bef40e6c
[20:24:26] ok will do
[20:24:52] majavah: I had to do this change to my deployment files: https://github.com/Wikidata/editgroups/commit/814daea4455f060b017a98d8f6386d0ea076774b
[20:25:32] I figured it out by comparing my files to the examples given on the Wikitech wiki: I just imitated the new version of the example
[20:25:35] pintoch: I think that was fixed already, but Lucas has a new issue
[20:26:09] okay! yes I made this change some time ago already. Do you mean the older versions of those files would work again now?
[20:27:21] yes, and it's a bug if they don't, but I'd still recommend not declaring them as it duplicates what toolforge: tool gives to you for free
[20:27:30] yeah, that looks like it was redundant anyways iiuc
[20:28:13] created https://phabricator.wikimedia.org/T292672
[20:28:33] If I understand the error correctly simply setting `volume: {}` should at least change the error
[20:28:59] it's an array, but otherwise yeah
[20:31:20] looks like my test suite does not try to actually apply the patch: https://github.com/wikimedia/cloud-toolforge-volume-admission-controller/blob/0a2d76cf09541767df7d675cf7d7e8a88c885e6b/server/server_test.go#L113
[20:33:08] s/{}/[]/
[20:33:55] * AntiComposite starts writing up a phab task about a NoneType that's not iterable
[20:34:03] * AntiComposite tests it a few times
[20:34:13] and now it's magically fixed
[20:45:45] Are NoneTypes normally iterable?
[20:46:28] no, `webservice` was complaining about it every time I exited a python3.9 webservice shell
[21:03:18] hm, I think I had the same error when restarting a webservice earlier
[21:03:21] but it worked in the end
[21:03:30] (and by now I’ve closed the terminal where the error message was)
[23:49:19] !log tools.bridgebot Restarting to place bnc between bot and libera.chat IRC network (T292640)
[23:49:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
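Note on the 20:13 error and the 20:28 to 20:33 exchange: the message reads like a JSON Patch (RFC 6902) "add" with path "/spec/volumes/-" being applied to a pod that has no volumes array at all. The "-" token appends to an existing array, so the operation cannot apply when the array is missing. The sketch below is a minimal illustration in Python, not the admission controller's actual Go code; `apply_add`, `volume_patch_ops`, and the sample pod and volume are hypothetical, and the error text only imitates the one quoted in the log.

```python
import copy


def apply_add(doc, path, value):
    """Apply one RFC 6902 "add" operation and return the patched copy."""
    patched = copy.deepcopy(doc)
    # Split the JSON Pointer into tokens and undo its two escape sequences.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    parent = patched
    # Every container on the way to the target must already exist.
    for token in tokens[:-1]:
        if isinstance(parent, list):
            parent = parent[int(token)]
        elif isinstance(parent, dict) and token in parent:
            parent = parent[token]
        else:
            raise KeyError("doc is missing path: " + path)
    last = tokens[-1]
    if isinstance(parent, list):
        if last == "-":
            parent.append(value)  # "-" means "append to this array"
        else:
            parent.insert(int(last), value)
    else:
        parent[last] = value  # creates or replaces an object member
    return patched


def volume_patch_ops(pod, volume):
    """Build patch operations that apply whether or not spec.volumes exists."""
    if pod.get("spec", {}).get("volumes"):
        # The array exists and is non-empty: appending is safe.
        return [{"op": "add", "path": "/spec/volumes/-", "value": volume}]
    # Missing or empty: add the whole array in one operation instead.
    return [{"op": "add", "path": "/spec/volumes", "value": [volume]}]
```

Under these assumptions, a pod that declares `volumes: []` (an array, per the 20:28:59 correction) accepts the append, which fits the suggestion that setting it would at least change the error. A test that actually applies the generated operations to a pod without volumes would presumably have caught the failure mentioned at 20:31.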