[10:35:19] !log bsadowski1@tools-bastion-13 tools.stewardbots Restarted StewardBot/SULWatcher because of a connection loss
[10:35:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[14:52:49] another user got bit by the old bastion: T371505 (re @lucaswerkmeister: tools-sgebastion-10 could maybe also use a hint in the login banner at this point?)
[14:58:00] lucas: +1 to adding a hint in the login banner
[14:58:08] it's defined in modules/profile/manifests/toolforge/bastion.pp
[14:58:18] we would need to customize it only for the old bastion
[15:00:15] if you want to create a patch, otherwise I can try to do it myself but I can't promise I will find the time this week :)
[15:01:30] I have never seen any evidence that people read login banners, but sure
[15:02:53] also, the buster bastion being broken to run Toolforge workflows is the problem. That bastion is still there because the newer bastions do not support former workflows.
[15:03:33] T360488
[15:03:34] T360488: Missing Perl packages on dev.toolforge.org for anomiebot workflows - https://phabricator.wikimedia.org/T360488
[15:03:37] yes, I'm not sure what the plan is for things like Perl
[15:03:52] but I think we should still encourage people who don't have special workflows to migrate to the new one
[15:05:14] for T360488 maybe we can add it to the agenda for the next Toolforge monthly meeting?
[15:11:27] bd808: added here https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Monthly_meeting#Next_meeting_agenda
[15:15:44] dhinus: ok. I'd be happy to hear ideas for next steps. So far I feel that the things I have suggested have been rejected as being long term undesirable.
[16:08:02] does the buster bastion literally only serve one user at this point?
[16:09:01] If so we could give it its own weird name and point the commonly-used aliases to the modern bastions
[16:10:02] andrewbogott: the folks who are ending up there accidentally are using a hostname that was deprecated 4 years ago so... yeah.
[16:12:20] Yep, seems like we should just delete login.tools.wmflabs.org and create 'bradland.toolforge.org' instead
[16:12:29] any reason for me to not do that right now?
[16:13:20] anything other than `login-buster.toolforge.org` that is pointed at tools-sgebastion-10.tools.eqiad1.wikimedia.cloud (172.16.6.95) could be pointed elsewhere.
[16:17:47] !log tools changing login.tools.wmlabs.org to point to a newer bastion, tools-bastion-12, in response to T371505
[16:17:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:17:53] T371505: `toolforge jobs` is broken - https://phabricator.wikimedia.org/T371505
[16:18:14] dhinus, ^
[16:18:33] thanks, makes sense
[16:19:03] for "bradland", we would still need to get rid of buster, so that would mean creating a bookworm bastion with different settings
[16:19:25] I think we can discuss that or other alternatives during https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Monthly_meeting#Next_meeting_agenda
[16:19:55] yeah, I wasn't suggesting a longterm bradland suggestion, just making it explicit that login-buster is for brad and only brad :)
[16:20:24] But it would be weird for anyone but him to be specifying -buster anyway at this point
[16:20:24] I would keep the name login-buster until we remove it or upgrade it, to avoid further confusion :)
[16:20:29] yep, agree
[16:20:40] I didn't know it already had an explicit name
[16:21:09] for other dns names, +1 to what bd808 said, they can all point to the new bastion
[16:21:45] maybe we could even sunset some old names?
with some advance notice via mailing lists
[16:22:19] I wouldn't mind killing off the .wmflabs names but on the other hand it doesn't cost us anything to leave them
[16:22:50] Anyway, I've fixed the exact trap that caught magnus so now I'm going to eat lunch
[16:22:53] when we import them to tofu-infra it's a good chance to sort them and evaluate
[16:23:07] we have some .wmflabs for other things not just bastions
[16:23:13] I'm kind of surprised that nobody else has actually shown up to tell us about complex workflows they have that are broken because of the lack of language runtimes on the new bastions. I guess the folks who used to make fancy remote code and state management workflows left or stopped at some point.
[16:23:57] bd808: I was also a bit surprised and that's why I added "how do we make sure to capture all the tools/workflows" to the agenda... I guess we could check who is logging in to the old bastion?
[16:25:53] dhinus: `last|awk '{print $1}'|sort|uniq|wc -l` there says 71 accounts have used it since the lastdb rotated on july 1
[16:33:08] I’m late to the party – is it still worth putting a notice in the banner now that only an explicit -buster hostname is left pointing at the buster bastion?
[16:33:28] (I don’t disagree with bd808’s skepticism about people not reading banners, it just felt like a more lightweight solution if we didn’t want to move the host name ^^)
[16:33:56] (eh, s/solution/some other word, idk, potential improvement/)
[16:34:19] lucaswerkmeister, I think a banner is still useful if you feel like making it
[16:36:38] alright, I might put something together after dinner
[17:57:02] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1058654
[17:57:40] also, apparently git show doesn’t escape control characters?
o_O : https://tools-static.wmflabs.org/bridgebot/d005049b/file_63123.jpg
[19:15:52] !log lucaswerkmeister@tools-bastion-13 tools.lexeme-forms deployed 4775170045 (upgrade toolforge_i18n to 0.0.6; also upgrade pip to 24.2)
[19:15:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[19:19:31] !log lucaswerkmeister@tools-bastion-13 tools.wd-image-positions deployed 5540ef17c9 (upgrade toolforge_i18n to 0.0.6; also upgrade pip to 24.2)
[19:19:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wd-image-positions/SAL
[19:44:00] Hi! I am working on tools.osm4wiki@tools-bastion-12 and suddenly my editor "joe" does not work anymore: "-bash: joe: command not found" I cannot re-install it with apt-get. What is wrong?
[19:56:44] !help
[19:56:44] If you don't get a response in 15-30 minutes, please email the cloud@ mailing list -- https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication
[20:05:30] Guest98: do you know when it worked for you most recently?
[20:06:12] also, are you by chance using 'login.tools.wmflabs.org' as the hostname for your bastion?
[20:06:40] Today in the morning. Last changes 10:54 in Germany.
[20:08:16] If you're using the old .wmflabs.org hostname when connecting, that hostname is now directed to a different bastion as the other one is going to be replaced soon.
[20:08:23] So that would explain the change.
[20:09:49] I have used a login script for many months. Was there a change today?
[20:09:59] yes
[20:10:46] I don't understand. I can login and I access my directory as usual.
[20:11:06] what hostname is specified in your login script?
[20:11:13] I login on "tools-login.wmflabs.org".
[20:11:54] what else should I use?
[20:13:07] login.toolforge.org is the new name, the 'wmflabs.org' domain was deprecated several years ago.
[20:13:22] But that just means you've accidentally been using an obsolete bastion for a while, and only just now got upgraded.
[20:13:49] If you want joe installed on the new bastions we can open a ticket; I don't know if there are any downsides to adding it
[20:14:12] I will try this, thanks! I only hope that the authentication did not change
[20:14:49] authentication is standardized across all the hosts
[20:21:29] I am sorry. Login worked fine, but joe does not.
[20:21:30] -bash: joe: command not found
[20:21:30] tools.osm4wiki@tools-bastion-13:~/public_html/cgi-bin$ apt-get install joe
[20:21:31] E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
[20:21:31] E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?
[20:23:59] You can't use apt-get install yourself
[20:24:08] As Andrew said, you need to open a ticket
[20:24:33] OK. Where can I open a ticket?
[20:26:08] Guest98: https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?tags=toolforge
[20:29:03] Thank you very much!
[20:31:43] * bd808 gets the good old 1990 TurboPascal feels any time he starts `joe`
[20:35:36] Hey! I have an instance in horizon under the "traffic" project: traffic-puppetserver-bookworm. I accidentally "shelved" it and it's just stuck in perpetual "shelving" status. Can I just get the instance killed/deleted?
[20:50:12] jobs is now fixed on the buster bastion
[20:58:30] I found a bug in the "xtools.wmcloud.org" website. Can I report it here?
[20:59:41] I'm the one to talk to, but the venues are https://mediawiki.org/wiki/Talk:XTools or file a bug on phab. In this case though, since I'm here, you can just message me :)
[21:00:43] dcaro: thanks!
[21:02:31] musikanimal Thanks. In the global contributions tool, such as this: https://xtools.wmcloud.org/globalcontribs/Pure_Evil , there is an error in the "page title" column. Instead of giving the namespace as "Project", which redirects to the project namespace of the respective project, it gives the namespace as "Meta". In the example I linked to, it
In the example I linked to, it [21:02:31] says "Meta:Simple talk" on simple.wikipedia.org. However, this redirects to Meta-Wiki instead of staying within simple.wikipedia.org. It should be "Project:Simple talk". Likewise for all project namespace pages across all wikis. [21:03:11] that's https://phabricator.wikimedia.org/T337104 and I have it fixed. I hope to do another deploy this week [21:04:06] thanks for reporting, anyway! :) [21:04:08] oh sweet, the shelving seems to have timed out and I'm able to continue [21:04:09] Ok, if that is fixed that will be good. Thank you. [21:25:21] brett: I'm pretty sure I've seen shelving work in the past :/ are you now able to do what you need to do? [21:25:34] I am, thanks! [21:25:51] andrewbogott: I do have a question with setting up a new puppetserver in horizon: It's complaining that it can't find /etc/ssl/certs/WMF_TEST_CA.pem [21:26:34] brett: that's new to me! What's the fqdn of the server? [21:26:46] traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [21:27:42] are you migrating things from the old puppetmaster or starting fresh? [21:27:42] The description of the attached cinder volume is " Certs and git repos for project-local p…" but I only see the git repos. Maybe something's missing there? [21:27:56] starting fresh since the CA was using the old buster cert. [21:28:13] I ran into that same issue and figured starting over was easier after mucking around [21:28:44] hm, ok [21:31:28] puppetservers store certs in /srv/puppet/server/ssl so building a new VM may not fix what you're trying to fix [21:31:55] But, for starters I'm going to temporarily remove the puppetserver role from that host to try to get one good initial puppet run [21:32:41] ah, I see. So then we'd delete parts of /etc/ssl/ and symlink to /srv? 
[21:33:27] The thing about WMF_TEST_CA seems to be a new, general issue, I'm going to build a test VM and see if I can reproduce
[21:33:40] <3
[21:33:41] I don't think I follow what you mean about symlinking but we'll cross that when we get there
[21:34:19] I guess I just figured that we needed to somehow get /etc/ssl/certs/WMF_TEST_CA.pem to exist since run-puppet-agent wanted it to
[21:36:35] I think this could be a chicken-egg problem.
[21:36:43] the first successful puppet run should install that file, afaict
[21:36:56] yeah, seems likely although I just now failed to reproduce it...
[21:37:02] where it comes from is based on $root_ca_source
[21:37:20] and by default that would be taken from puppet repo itself
[21:37:38] from modules/profile/files/pki/ROOT/
[21:38:14] oh I bet it's because that whole project has
[21:38:16] https://www.irccloud.com/pastebin/TgntbMi6/
[21:39:38] you could manually copy that file from the puppet repo to /etc/ssl/certs/ and see what happens next?
[21:39:40] yep, brett, if I remove ^ from your project-wide puppet config that cert issue goes away.
[21:39:44] ah :)
[21:40:05] * andrewbogott wonders if every VM in traffic has that problem
[21:40:58] seems like they would since pki::client is included in base
[21:41:15] heh, I tried a different VM and it has failed puppet for yet another reason
[21:41:30] maybe there is a "prefix puppet" setting in that project though
[21:41:50] so... brett that bug is self-inflicted and I think I will leave it to you and your ilk to sort out. It definitely helps to remove that trusted_certs entry but of course I don't know why it's set in the first place
[21:42:13] If you are incurious you can override that setting in the puppetserver prefix hiera
[21:42:22] ...lmk if you don't know how to do that :)
[21:50:23] ugh. Thanks for looking into it
[22:05:33] !log tools.cronout rm -rf tb-dev/cronout to clear out space.
No response on T324172
[22:05:34] andrewbogott: Unknown project "tools.cronout"
[22:05:34] T324172: 'cronout' dir in tb-dev - https://phabricator.wikimedia.org/T324172
[22:05:46] !log tools.tb-dev rm -rf tb-dev/cronout to clear out space. No response on T324172
[22:05:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.tb-dev/SAL
[22:42:02] o/ I'm getting some 429s from toolforge while requesting a bunch of CSVs at once for use in a JS app, are there some docs / details on what the rate limits are etc?
[22:47:53] I could dig in the code but I don't know that we have specifically documented limits.
[23:27:14] My TLDR is that I have a JS app, that loads CSVs of data to display some dashboards, so loading a bunch of static files at the same time, 9*7 files = 63 requests :D
[23:28:13] each file is only around 8-20 bytes wide, and 1440 lines long :P, but I think those 63 requests put people just past the rate limit (whatever it is), maybe it is 50
[23:33:17] are you getting the files from tools-static or somewhere else?
[23:40:40] just from my tools webservice
[23:41:12] specifically an example, https://addshore-wikibase-cloud-status.toolforge.org/data/2024/07/31/wb_item_create_time.csv
[23:41:37] *reads up on tools-static again*
[23:42:31] oh, I should potentially totally be putting my files in $HOME/www/static rather than serving them via webservice... But I wonder if that has a different rate limit (I'd guess maybe)
[23:49:30] addshore: the rate limit is 100/ip/minute across all of Toolforge
[23:50:18] and yes tools-static is different and currently does not have any ip based throttling
[23:53:48] andrewbogott: See T313131 for t.aavi's introduction of the proxy level rate limit for Toolforge. I'm not sure that we have done much to document that throttle. I think the general hope was it would be high enough that most humans would never notice.
[23:55:00] Ugh. That ticket is private because it has IPs in it.
Sorry folks who are trying to follow along without advanced Phabricator user rights.
[23:55:45] T282732 is the public task the private one is most closely associated with.
[23:55:45] T282732: Occasional HTTP 502 Bad Gateway errors for several Toolforge tools - https://phabricator.wikimedia.org/T282732
[23:56:03] addshore: this may be a silly question, but can you tell me about your impulse to search for documentation rather than just think 'whoops, too fast!' and try again slower?
[23:58:45] "100/ip/minute across all of Toolforge", that sounds perfect, in which case I was only hitting it due to refreshing too much while developing the graphs and having silly JS load it each time the browser refreshed
[23:59:39] andrewbogott: Basically to check if I needed to worry about slowing down / retrying / if any of that made sense / was needed. In reality, no user should hit the rate limits while loading the page :)
[23:59:49] only me while doing silly development
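
For anyone who does need to stay under the "100/ip/minute" limit quoted above, here is a minimal client-side sketch in Python. Only the limit figure and the 429 status come from the discussion; the names `SlidingWindowLimiter` and `fetch_all`, the retry count, and the backoff value are hypothetical, and `fetch` stands in for whatever HTTP call the client already makes. It is one possible approach, not how the Toolforge proxy itself works.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window` seconds (client side).

    The defaults mirror the 100 requests per IP per minute figure from the
    chat; everything else here is an illustrative assumption.
    """

    def __init__(self, limit=100, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.stamps = deque()  # start times of calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if free)."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            return 0.0
        return self.window - (now - self.stamps[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        while delay > 0:
            self.sleep(delay)
            delay = self.wait_time()
        self.stamps.append(self.clock())


def fetch_all(urls, fetch, limiter, max_retries=3, backoff=5.0):
    """Fetch each URL via `fetch(url) -> (status, body)`, retrying on 429."""
    results = {}
    for url in urls:
        for attempt in range(max_retries + 1):
            limiter.acquire()
            status, body = fetch(url)
            if status != 429:
                break
            # Linear backoff before retrying a throttled request.
            limiter.sleep(backoff * (attempt + 1))
        results[url] = (status, body)
    return results
```

With 63 CSV requests per page load the limiter never has to wait on a single load; it only starts pausing once repeated reloads push the count past 100 within a minute, which matches the development scenario described above.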