[08:11:11] hnowlan: thanks a lot for the kartotherian fix! So IIUC it was a pending envoy config update and now everything is green?
[08:33:13] tappof: I think I have a puppet change of yours pending merge, should I merge it?
[08:33:32] wait no, from luca xd
[08:33:36] dcaro: that is probably me right? If so go ahead :D
[08:33:39] ack dcaro :)
[08:33:44] yep +1
[08:33:56] those italians are all the same tappof
[08:34:03] :D
[08:34:10] I saw Toscano and though Tiziano
[08:34:35] lol okok I see the brain shortcut, makes sense
[08:40:47] vgutierrez, marostegui - hola, I am deploying a little change that triggers the restart of the cfssl/pki daemons, so far all good from the logs but if anything happens it may be my fault
[08:44:29] done, all good afaics
[08:45:03] thanks for the heads up
[09:47:41] elukey: yeah, just envoy config and the wdqs mesh change
[09:49:57] super thanks
[09:50:26] speaking of maps, I am going to start some invasive maintenance on the eqiad config (currently depooled) to warm up the cache etc..
[09:50:41] it will take some days, and a repool of eqiad will not be super quick if needed
[10:24:21] <_joe_> I get a 500 when connecting to trace.wikimedia.org
[10:24:35] <_joe_> looks like oauth2 is broken
[10:24:44] <_joe_> slyngs: can this be related to the CAS upgrade?
[10:25:29] Yes, there are new IPs for the CAS hosts. Moritz did do a patch and Chris approved, but I don
[10:25:36] 't know if it got deployed
[10:25:52] <_joe_> ok
[10:25:55] <_joe_> let me fix that
[10:26:06] I'll see if I can find the patch
[10:27:14] <_joe_> yes the patch is merged, I think we lack a deployment of jaeger
[10:27:54] Patch, https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1196461 but yes, it's merged
[10:27:58] <_joe_> Error: UPGRADE FAILED: release main failed, and has been rolled back due to atomic being set: cannot patch "main-jaeger-query-egress" with kind NetworkPolicy: NetworkPolicy.extensions "main-jaeger-query-egress" is invalid: spec.egress[1].to[5].ipBlock.cidr: Invalid value: "2620:0:861:4:208:80:155:104": not a valid CIDR
[10:28:00] <_joe_> sigh
[10:28:30] <_joe_> ok fixing
[10:29:49] thanks _joe_
[10:30:00] There's a similar and working patch for zarcillo: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1196437
[10:30:40] <_joe_> elukey: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1197612
[10:31:01] <_joe_> sigh what did I do
[10:31:09] <_joe_> lol nevermind, PS2 incoming
[10:31:39] I checked the IP and it seemed correct
[10:31:42] what did I miss?
[10:31:46] We really should add IDP to the external_services chart
[10:32:15] <_joe_> ^^
[10:32:33] claime: I think there is but jaeger doesn't use mesh or similar, IIRC it is a custom helm chart
[10:32:55] elukey: It's the upstream chart yeah
[10:33:32] for zarcillo IIRC I filed https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1169162 but probably it is now old
[10:34:09] <_joe_> elukey: now looks better https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1197612
[10:34:11] claime: yeah my point is that we should customize that chart, IIRC IDP should already be in external services
[10:34:59] _joe_ I am confused, what did you change?
[10:35:13] I don't see a diff between 1 and 2
[10:35:21] <_joe_> elukey: uhhh on my computer I have the new version of that file
[10:35:29] <_joe_> but somehow gerrit is not showing it in the interface
[10:35:36] <_joe_> even worse, it's rejecting my push
[10:35:44] the /128 ?
[10:35:46] I see it
[10:35:47] <_joe_> yes
[10:35:50] <_joe_> akosiaris: I don't
[10:35:54] what?
[10:35:56] <_joe_> not in gerrit's interface
[10:36:11] Are you on patchset 2?
[10:36:12] the /128 was in the first as well, the one that I +1ed
[10:36:16] <_joe_> yes
[10:36:17] https://lounge.uname.gr/uploads/03d04e34ec048ed7/image.png
[10:36:25] <_joe_> yeah I had removed the additional reformatting
[10:36:31] PS2 is just a change in the date of the commit message
[10:36:35] <_joe_> but I would still see the old verison of the patch
[10:36:39] <_joe_> what?
[10:36:55] this is why I was asking
[10:37:01] https://lounge.uname.gr/uploads/13a70c363a0e39d8/image.png
[10:37:02] <_joe_> what the hell did gerrit show me then ?
[10:37:17] <_joe_> yeah I mean, that's what I *thought* I had done the first time
[10:37:30] <_joe_> but gerrit showed me some reformatting too... I don't understand.
[10:37:31] <_joe_> anyways
[10:37:34] anyway, I +1ed :D
[10:37:34] <_joe_> lemme merge
[10:39:09] elukey: you're right about external_services, I was looking for idp as name but it's called cas
[10:45:34] claime: I don't think it is impossible to add support to jaeger, maybe we can open a task if nobody did before
[10:47:53] It's not, it's just we usually refrain from changing upstream charts too much
[11:09:00] cdanis: OK to puppet-merge your ja4 patch ?
[11:09:05] btullis: please do
[11:09:14] Ack, thanks.
[11:34:49] effie: Am I ok to merge your proxoid change?
[11:35:06] yes please
[11:35:17] Great, thanks. Done.
[11:35:24] ben is our deployer for the day
[11:37:26] :-) I think I'll give it a rest for a bit, now.
[13:27:17] who is working on spicerack these days? I presume it's not Riccardo so elukey?
[13:27:27] would that be true elukey?
[13:33:54] sukhe: yep, plus other members of I/F
[13:34:18] ok thank you, I will add I/F to the task then
[13:34:59] rather, spicerack itself and then I will let you all figure that out <3
[13:43:12] * Lucas_WMDE wonders if you can season a hiddenparma with a spicerack
[13:43:34] we have cumin too :)
[13:43:35] as long as you use cumin Lucas_WMDE
[13:43:41] * vgutierrez runs away
[13:44:02] ^^
[13:44:43] not as cool as the other names, but we have https://wikitech.wikimedia.org/wiki/Durum as well. some of us love food. not all of us.
[13:44:44] I might need a cookbook for instructions though
[14:21:29] I am going to do reboots of the sessionstore cluster (T407110) shortly (if no one objects, of course). No impact is expected.
[14:47:13] started the maps eqiad cache warm-up, it will take some days (new stack), so pooling in eqiad atm for maps is not possible (better, not only via dnsdisc)
[17:17:58] when using yubikeys/fido, anyone is hitting an 'agent refused operation'? I can workaround not using an agent, but it's really tedious to enter twice the password and touch twice the yubikey for each ssh command I make
[17:19:39] it's trying to use `Oct 21 19:13:42 acme ssh-agent[3659434]: error: notify_start: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory`
[17:21:41] hmpf... now it asks for the password and touch, just using ssh-askpass instead of the terminal xd
[17:21:48] still have to enter it twice per ssh command
[17:22:57] maybe I should not have set a password :/ tapping is way nicer than password + tapping
[17:23:04] dcaro: you can ssh-add the keyhandle to your agent
[17:23:07] which will save you the typing
[17:23:24] (but also, keep the useless-w/o-hardware-token keyhandle encrypted at rest)
[17:24:56] the 'official' SSH config also sets ControlMaster+ControlPersist for the bastions, which saves you one press
[17:25:51] brennen: I've staged the redirect for testwiki in beta. kinda boring but seems to work: https://test.m.wikipedia.beta.wmcloud.org/wiki/Main_Page -> https://test.wikipedia.beta.wmcloud.org/wiki/Main_Page
[17:25:59] oops, sorry, meant to ping brett *
[17:26:13] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1197351/
[17:26:35] boring is good :)
[17:26:36] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/debs/wmf-laptop/+/refs/heads/master/configs/ssh-client-config#14
[17:29:18] saw that one yep
[17:29:27] ssh-add is failing saying it can't connect to the agent... looking
[17:29:40] https://www.irccloud.com/pastebin/u6Z0k2N2/
[17:29:47] it's up and running though
[17:30:29] 'agent refused operation' sometimes means that the yubikey is not plugged in
[17:30:41] oh, it's SSH_AUTH_SOCK
[17:30:44] not SOCKET
[17:30:47] xd
[17:33:18] ssh-add now worked, but it did not ask for a password, and I still have to enter it every time
[17:33:41] https://www.irccloud.com/pastebin/3M6KRpsv/
[17:33:48] it's getting it from the agent
[17:36:09] oh, is this the FIDO PIN on the hardware device?
[17:39:41] I'm not sure :/, the prompt says "enter pin and confirm user presence for ED255... key..."
[17:39:55] so I'm guessing FIDO (as it seems tied to the key, not the device)
[17:41:56] yeah, I guess you set that with ykman or something?
[17:42:17] I did a few months ago
[17:42:26] (so my memory on the exact commands is a bit blurry)
[17:43:13] this asks for password when adding the key, but still asks again on every use
[17:43:15] https://www.irccloud.com/pastebin/28uJrzl0/
[17:46:16] hm https://wikitech.wikimedia.org/wiki/Yubikey-SSH-FIDO says resident keys aren't supported
[17:47:39] hmm, so I messed up when setting it up? Can I verify if it's a resident key or not?
[17:48:31] Resident identity added: ED25519-SK SHA256:Uov4L+21Cf1j0GlmgjxRArzbfilhdYasOejKXczszyQ
[17:51:20] and yeah that matches the pubkey in admin data.yaml
[17:51:28] 💙cdanis@wmftop ~/work/gits/puppet 🕑☕ ssh-keygen -lf =(echo "sk-ssh-ed25519@openssh.com AAAAGnNrLXNzaC1lZDI1NTE5QG9wZW5zc2guY29tAAAAIDbrjOCsU1VbyZZjk8kFmDfL51LWnfUG6KH6n9gmM69IAAAABHNzaDo= dcaro@wikimedia.org yk1")
[17:51:30] okok, so I guess I have to recreate it then :/, it's working though, just with the annoyance of the double password
[17:51:30] 256 SHA256:Uov4L+21Cf1j0GlmgjxRArzbfilhdYasOejKXczszyQ dcaro@wikimedia.org yk1 (ED25519-SK)
[17:51:53] yeah, so I think the key *not* being resident might fix the PIN issue
[17:52:23] I think enumerating resident keys requires a user verification, or something
[17:52:56] (there are ways to have sshd require user verification [PIN entry, on a yubikey], but, we don't enable them, on purpose)
[17:56:52] any reviewers? https://gerrit.wikimedia.org/r/c/operations/puppet/+/1197692
[17:57:51] 👍
[17:58:05] thanks :)
[18:01:40] I think I'll leave the testing for tomorrow, thanks!
[18:02:10] feel free to ping me if you need to troubleshoot more
[19:25:09] dcaro: IIRC you're on macos, when I first configured it, it was working fine, then at some point after a macos upgrade it started to ask for the password each time because unable to store it in the agent, after a brew upgrade and a laptop reboot I was able to get it back to work normally. In case that helps ;)
[21:34:04] mutante apologies, I accidentally cancelled your PCC run
[21:35:16] no worries! easy to restart