[13:26:09] is anyone available to help with the beta cluster? WMDE-Fisch is having trouble with geo coordinates (see #wikimedia-search scrollback). He was having issues on Friday and I ran saneitizer ( https://wikitech.wikimedia.org/wiki/Search#Saneitizer_%28background_repair_process%29 ).
[13:27:17] now he's saying that the coordinates aren't being indexed by Elastic as expected ( https://www.mediawiki.org/wiki/Extension:GeoData#Parser_function )
[13:37:14] I can help on the WMCS side if you have any problems there, but I'm not familiar with the beta cluster (deployment-prep) itself
[13:39:30] dcaro no worries, is there a better room to ask beta cluster questions?
[13:43:36] It's not helped by being one of those weird things... Like, this might be more of an ES question
[13:59:40] hi
[13:59:45] hey
[14:00:22] hi
[14:00:26] hello
[14:01:01] probably #wikimedia-search is the best place yep :/
[14:01:32] oi
[14:02:48] OK, we are working thru it in #wikimedia-search
[14:02:58] oi
[14:03:27] 👍 just joined, in case anything rings a bell
[14:03:32] i59: Saying "oi" repeatedly is considered rude
[14:03:40] ok
[14:04:55] ok then
[14:13:46] why am i banned from wikitech
[14:14:46] why
[14:14:56] What is your username?
[14:15:09] Otherwise there's no way for anyone to even begin to answer the question
[14:15:56] butter
[14:16:11] thats the name
[14:17:44] hey
[14:18:05] Based on the block reason, it seems fairly obvious
[14:18:24] well when i get blocked
[14:18:29] i cant log in
[14:18:43] That's kinda the purpose of blocking people...
[14:19:35] look i didn't abuse accounts until some dude blocked me
[14:19:40] i tried to help
[14:20:02] i stopped until i got blocked again
[14:20:04] like
[14:20:11] i tried to help
[14:20:53] what do i do now
[14:21:08] i keep getting marked as a troll
[14:22:18] what do i do
[14:22:20] someone help
[14:22:25] I don't see any evidence of you helping
[14:22:42] i tried
[14:22:43] And your own edits to your various accounts seem to be you admitting you're doing what you're saying you're not doing
[14:22:47] on another account
[14:22:47] https://wikitech.wikimedia.org/wiki/User_talk:N00b for example
[14:22:56] had to m8
[14:22:59] No you didn't
[14:23:08] what was i supposed to do
[14:23:16] i cant log into my main
[14:23:22] like when i put the pass
[14:23:26] it fails
[14:24:03] what do i do
[14:24:17] !kb i9
[14:24:29] lol
[14:24:54] had to m8
[14:26:19] hi hmm, i hope this is not something I already know the answer to, but I have created some new VMs in deployment-prep, and I can't yet log into them. It's been 45 mins since I created one of them.
[14:26:25] puppet seems to have run fine on the node
[14:26:31] v
[14:26:32] https://horizon.wikimedia.org/project/instances/731f55fd-78f9-4ac9-b10b-d01230434681/
[14:27:00] oh
[14:27:00] wait
[14:27:01] or maybe not
[14:27:02] 2022-04-25T14:02:25.976464+00:00 deployment-kafka-jumbo-4 puppet-agent[14026]: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Operator '[]' is not applicable to an Undef Value. (file: /etc/puppet/modules/profile/manifests/kafka/broker/monitoring.pp, line: 89, column: 27) on node
[14:27:02] deployment-kafka-jumbo-4.deployment-prep.eqiad1.wikimedia.cloud
[14:27:11] wut?
[14:27:15] okay lemme try to fix that, errr
[14:27:46] oh because of prefix puppet.
[14:32:38] is there any way to trigger a puppet run on a node I have ssh access to?
[14:32:43] or do I have to just wait...
[14:33:09] should be the same as prod ottomata..
[14:33:17] oh? cumin?
[14:33:20] of course, the puppetmaster needs to have pulled the update etc first
[14:33:21] oops
[14:33:29] i mean to say 'i don't have ssh access to (yet)'
[14:33:33] heh
[14:33:44] well, that's a different question indeed
[14:33:45] * Reedy grins
[14:33:48] indeed! :p
[14:33:49] ottomata: I can try to force a run for you. It's deployment-kafka-jumbo-4.deployment-prep.eqiad1.wikimedia.cloud?
[14:33:52] yes please
[14:34:11] tricky chicken/egg issue with prefix puppet!
[14:34:26] i'm doing a migration for the OS upgrades!
[14:34:41] now it is hitting the need to switch certs to the deployment-prep local puppetmaster
[14:34:46] k
[14:34:49] i can sign
[14:35:05] hm i don't see it in puppet cert list on puppetmaster04
[14:35:43] yeah, hang on. I'm finding the instructions on what I have to do on the client. Out of practice at this
[14:35:48] oh ight
[14:35:57] https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster#Step_2:_Setup_a_puppet_client
[14:36:03] sudo rm -rf /var/lib/puppet/ssl
[14:36:35] ottomata: ok, new cert request should be there now
[14:36:50] bd808: signed.
[14:37:26] ottomata: puppet ran. Hopefully you can access the instance now
[14:37:43] yes can access, thank you!
[14:37:51] i think i know how to avoid this for the other nodes I have to do too
[14:38:38] prefix puppet is magic when it helps and the worst thing ever when it does not :)
[14:38:56] yeah, and also combined with ops puppet hiera...oof
[14:46:19] !log tools Building toolforge-webservice v0.82
[14:46:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:47:33] bd808: looks like you forgot to create + push a git tag for that release
[14:47:54] taavi: you are correct. I can fix that
[14:56:20] !log tools Rebuilding all docker images to pick up toolforge-webservice v0.82 (T214343)
[14:56:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:56:23] T214343: Create a Perl Docker image for use on the Toolforge Kubernetes cluster - https://phabricator.wikimedia.org/T214343
[14:56:56] * bd808 would love it if this was all in a CI/CD pipeline somewhere...
[14:58:54] ergh, bullseye won't work, no java-8 backport available. deleted that node and doing buster
[14:59:18] but i just pre-created all the nodes...which i think will not work and will leave me in broken state again....grrrr
[14:59:40] i have to do one at a time
[15:02:40] nothing like being blocked by an 8 year old language interpreter...
[15:03:07] shhh well i'm not TRYING to spend time upgrading java
[15:03:14] it might work buuuut i just need OSes upgraded!
[15:03:21] so the VMs aren't blasted away in a week :p
[15:09:11] I don't think we have ever randomly deleted deployment-prep instances even months after the OS upgrade deadlines, but I do appreciate folks working to get them updated. :)
[15:09:45] https://phabricator.wikimedia.org/T306068
[15:11:08] oh yes. we make that scary sounding task each time :)
[15:13:54] haha :)
[15:19:14] chaosmonkey on deployment-prep?
[15:34:48] hm bd808 i have more problems. deployment-kafka-main-3 and deployment-kafka-jumbo-6
[15:35:28] i can log in
[15:35:28] but i can't run puppet
[15:35:28] asks for my sudo password
[15:35:28] puppet should run successfully on those two nodes
[15:35:40] i can't rm ssl certs because i can't sudo
[15:40:28] hmm maybe i can just remove the puppet prefix configs for now. puppet shouldn't mess with the running kafka stuff on the old nodes
[15:40:57] done, guess i'll just wait for puppet to try and run again
[15:43:49] ottomata: neither deployment-kafka-main-3.deployment-prep.eqiad1.wikimedia.cloud nor deployment-kafka-jumbo-6.deployment-prep.eqiad1.wikimedia.cloud is letting me in even with my cloud-wide root key.
[15:45:17] but Y!?
[15:45:18] ha
[15:54:41] grr, maybe i'll just delete all these nodes and recreate now that i've removed prefix puppet class include
[15:54:49] bd808: unless you have something?
[16:06:42] bd808: just made deployment-kafka-main-4
[16:06:44] no extra puppet
[16:06:46] same problem
[16:16:01] OH yes extra puppet. i did not remove the main prefix
[16:24:29] ok yeah, basically having prefix puppet with a class applied before puppetmaster can be changed is not great
[16:41:01] ok, so TIL: prefix puppet is great for hiera
[16:41:04] not great for including classes
[16:41:14] better to just do that on the node puppet
[16:51:52] ottomata: prefix puppet might be less frustrating if you were working in a project that 1) does not have its own puppetmaster (no hidden secrets that the first puppet run can't find) and 2) does not have a bunch of hiera in ops/puppet (fewer places to look for config). I do empathize with how frustratingly different it is to try and jump into deployment-prep work as someone who has spent much more time in production.
[16:52:56] yeah
[16:53:15] i mean, i at least know where to look! hehh someone else more new to it would be baffled
[17:08:06] ottomata I feel your pain; puppet runs on my search servers and breaks LDAP with a not-helpful error, so I can't get in. I've been using cloud-init to put my pubkey into the root key as a workaround. You can also get in via virsh console if you have access to the cloud hosts
[17:30:47] ohhh never tried that
[17:30:51] i should have that access
[18:24:05] !log k8splay - adding mbaoma user user and admin (technical writer, redoing k8s workshop)
[18:24:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:K8splay/SAL
[18:54:50] not sure if this helps anyone else, but I noticed that in deployment-prep, when puppet master signs the cert manually, the cert appears in the public as $HOSTNAME.deployment-prep.eqiad.wmflabs.crt as opposed to the newer ".cloud" TLD. (Or at least it did for my server)
[18:59:39] inflatador: same here. also see https://gerrit.wikimedia.org/r/c/operations/puppet/+/784753
[19:00:10] stuff is supposed to move into Horizon/Hiera though
[19:00:43] inflatador: that is probably mostly relevant for the small number of folk like you who are trying to use puppet certs to do additional things. It also sounds a bit buggy. Maybe just needs some alt names on the signing request reordered or something though?
[19:08:16] okay i am so close
[19:08:21] nodes all gooood and stuff
[19:08:40] however, external proxy https seems to not respond
[19:08:53] curl -v https://intake-analytics-beta.wmflabs.org/_info
[19:08:59] does that finish for you all?
[19:09:55] it works on the node port
[19:10:00] just webproxy doesn't work
[19:10:41] do you have a security group that allows incoming traffic to that port?
[19:13:21] oh ho...
[19:13:22] checking
[19:13:23] probably not
[19:15:05] hm, i can I edit security groups after the node is created?
[19:15:08] can I*?
[19:15:15] https://horizon.wikimedia.org/project/instances/2d94e48b-4e1b-4cf9-8b11-9d0d4bbc8efb/
[19:15:23] should look like https://horizon.wikimedia.org/project/instances/21480702-050f-4324-8bbe-4716ef3a29ca/
[19:15:26] i didn't add the 'eventgate' security group
[19:15:43] Ah i found it in drop down menu in upper right
[19:15:50] you can, top right corner on the instance details page
[19:16:03] yay and now it works
[19:16:05] thank you taavi
[19:17:43] great! and thank you for giving deployment-prep some love
[19:19:26] * taavi doesn't like the numbers on https://os-deprecation.toolforge.org/
[19:19:39] !log tools.lexeme-forms deployed 7b5d0d7298 (l10n updates)
[19:19:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[19:25:20] hello, I am trying to help "guest"^ get on cloud VPS the first time
[19:25:26] they are getting this:
[19:25:27] 19:21 < guest> mbaoma@bastion.wmcloud.org: Permission denied (publickey).
[19:25:30] 19:21 < guest> kex_exchange_identification: Connection closed by remote host
[19:25:33] 19:21 < guest> Connection closed by UNKNOWN port 65535
[19:25:40] since I am not using that bastion host as an SRE
[19:25:49] I was thinking I should ask here for support
[19:26:16] also since the point is to write technical docs about that for others to follow afterwards
[19:26:39] the docs followed were from https://wikitech.wikimedia.org/wiki/Help:Accessing_Cloud_VPS_instances#Accessing_Cloud_VPS_instances
[19:27:06] the Wikitech user has been created today, a little while ago
[19:27:10] but also not just 5 min ago
[19:28:02] I added them to the cloud VPS project "k8splay" as user and admin
[19:28:21] was there another checkbox needed for shell access for a new user maybe?
[19:28:22] mutante: I don't see any configured SSH keys for that username
[19:29:01] guest: did you paste a key in your user profile at https://wikitech.wikimedia.org/wiki/Special:Preferences#mw-prefsection-openstack ?
[19:29:08] taavi: thanks
[19:29:14] see "Set up and upload SSH keys" above on that page
[19:30:13] i will work on that now
[19:30:15] thank you
[19:32:38] what email do I associate my ssh key with
[19:32:39] WMF email or mbaoma@bastion.wmcloud.org
[19:32:51] if you have WMF email, use that
[19:32:58] technically should not really matter though for login
[19:33:07] should be just a comment
[19:33:30] or to be more precise, use the email address you used when signing up at wikitech wiki as well
[19:34:35] thanks, I have added an ssh key
[19:34:35] my mistake was adding my email to the ssh key when i was asked to paste it on wikitech
[19:35:07] ah, I see
[19:35:58] i am able to login to the VM now
[19:36:01] thank you so much
[19:36:20] I would stop here for tonight, it is really late.
[19:36:34] I will reach out here when I run into errors moving forward
[19:37:43] guest: great! that is a success. yes, that's a good point to take a break and continue tomorrow
[19:37:52] please do, i'll be here
[19:38:06] and then we can live debug the bot
[19:38:15] when we know we have the exact same setup
[19:38:30] I like that last part
[20:11:24] yeah, our elastic puppet config conflates puppet agent cert/elasticsearch cert/hostname/cluster name and probably more. Working on it ;)
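
On the SSH key question from 19:32 to 19:34 above: an OpenSSH public key line has the form "<type> <base64 key> [comment]", and as noted in the channel the trailing address is only a comment. The log does not show exactly what was pasted into Wikitech, so the following is only a minimal sketch, assuming you want to sanity-check a key line locally before pasting it. The names parse_public_key_line and make_dummy_key_line are hypothetical helpers written for this illustration, the key is a dummy, and this is not how Wikitech itself validates keys.

```python
import base64
import struct


def parse_public_key_line(line: str) -> dict:
    """Split an OpenSSH public key line into its type and optional comment.

    Raises ValueError when the line is not '<type> <base64-key> [comment]'
    or when the type named in the line differs from the one in the key blob.
    """
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-key> [comment]'")
    key_type, blob_b64 = parts[0], parts[1]
    comment = parts[2] if len(parts) == 3 else ""

    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("second field is not valid base64") from exc

    # The decoded blob begins with a 4-byte big-endian length,
    # followed by the key type name as ASCII text.
    if len(blob) < 4:
        raise ValueError("key blob is too short")
    (name_len,) = struct.unpack(">I", blob[:4])
    embedded_type = blob[4:4 + name_len].decode("ascii", errors="replace")
    if embedded_type != key_type:
        raise ValueError("key type in the line does not match the key blob")

    return {"type": key_type, "comment": comment}


def make_dummy_key_line(comment: str = "user@example.org") -> str:
    """Build a structurally valid but useless ed25519 line (all-zero key)."""
    name = b"ssh-ed25519"
    blob = struct.pack(">I", len(name)) + name + struct.pack(">I", 32) + bytes(32)
    return "ssh-ed25519 " + base64.b64encode(blob).decode("ascii") + " " + comment


if __name__ == "__main__":
    print(parse_public_key_line(make_dummy_key_line()))
```

A line that starts with an email address, or one with anything other than the base64 key in the second field, is rejected with a ValueError, while changing the comment at the end makes no difference to the parsed key type.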