[00:16:43] * ebernhardson finally realizes the oddity in commonswiki descriptions: "We don't need that for MediaInfo files, so swap labels data into descriptions instead so as not to overburden the search index"
[00:18:44] doh, wrong room :)
[00:22:20] ebernhardson: cloning from diffusion works again, btw
[00:26:06] mutante: awesome, thanks!
[08:49:33] heads-up: going to reboot cumin2002 in ten minutes; if you start any new cookbooks, please run them from cumin1001
[09:15:46] reboot is done
[13:29:48] vgutierrez, apergos: can you do a quick review? https://gerrit.wikimedia.org/r/c/operations/puppet/+/859498 ; it's related to the default reply for xmldumps http. also, apergos: do you want me to change the contacts for those alerts? (currently wmcs is getting them)
[13:30:16] jenkins failed it
[13:30:43] well, that is my bad for +1'ing it instead of insisting that someone at wmcs sign off
[13:31:17] * vgutierrez looking
[13:33:11] dcaro: it looks good.. please make the linter happy by moving the Bug: line in the commit msg
[13:33:23] 👍
[13:58:35] jbond: heyo :), you added some extra config to the idp profile that is breaking puppet runs on the cloud one (including idp01.sso). let me know if/what/when I should put anything there. are you around later today to test the move from the sso project to cloudinfra as well? (requires removing the web proxy from the sso project and creating it in cloudinfra instead)
[13:58:49] (btw, sorry for the bad timing for this ping, got a meeting in 2 min...)
[14:41:38] dcaro: ack, looking
[15:15:25] hi all, we are moving the idp instance in cloud to the cloudinfra project. I'll create a new one in the sso project called idp-dev.wmcloud.org, which will be used for development and can break at any time. the current one, idp.wmcloud.org, can be used to protect services in cloud projects with more assurance that it won't randomly break
[15:16:01] godog: i think o11y is the main user, other than wmcs, of this endpoint. hopefully you won't notice the change, but depending on what you are doing you may want to switch to idp-dev
[15:31:08] jbond: neat, thanks for the heads up! but yeah, using the main idp.wmcloud.org should be enough for o11y use cases, I think
[15:32:45] ack, thanks
[15:56:52] who owns / who should I ask about apertium?
[15:57:33] <_joe_> hnowlan: what do you need?
[15:57:50] <_joe_> I redeployed it this morning, so if it's misbehaving it might be my fault?
[15:57:52] Depending on your question... It's something for CX and the language team...
[15:58:01] (it being the software)
[15:58:09] <_joe_> but it's a component called by cxserver
[15:58:24] <_joe_> so I guess kartik
[16:00:00] sessionstore's logs have a lot of TLS handshake errors from it
[16:00:33] but since at least yesterday, so probably not related to the redeploy; will check how long this has been going on
[16:00:34] <_joe_> hnowlan: starting when?
[16:00:37] <_joe_> oh ok
[16:00:45] <_joe_> also, why is apertium trying to call sessionstore
[16:00:53] I have no idea, this is all new to me
[16:01:01] <_joe_> ok let me look for a sec
[16:02:21] seems like it's been happening for a looong time
[16:02:25] <_joe_> hnowlan: can you point me to where you saw these errors?
[16:02:39] <_joe_> because apertium AFAICS isn't calling sessionstore
[16:03:28] https://logstash.wikimedia.org/goto/1894b6f1cbb7f3b124b373ac2bb82293
[16:04:31] that's not apertium
[16:04:35] <_joe_> indeed
[16:04:44] that's the kubelet tcp liveness probe
[16:04:54] sigh
[16:04:56] it originates from 1 of the IPs on the server
[16:05:03] right
[16:05:08] <_joe_> it shows up with the apertium IP because it's on the loopback interface of the kube node
[16:05:09] and since all the k8s hosts are LVS realservers, they have the LVS IPs
[16:05:39] <_joe_> akosiaris: it's interesting that it picks the same IP consistently, it seems
[16:05:51] I think I had looked into it at some point and it wasn't possible to specify which IP to connect from
[16:06:05] akosiaris: it's the source address selection algorithm. it is designed to be consistent
[16:06:26] <_joe_> ah, TIL
[16:06:40] http://linux-ip.net/html/routing-saddr-selection.html
[16:06:58] the kernel will choose the first address configured on the interface which falls in the same network as the destination address or the nexthop router
[16:07:36] I wonder if we could set the src parameter though and avoid this
[16:07:52] unless the kubelet has been changed to allow an override
[16:08:01] an override of the IP address that the probes are sent from
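(For reference, the selection behaviour described above can be inspected, and the "src" hint set, with iproute2. A minimal sketch; the addresses and interface name below are made up for illustration.)

    # ask the kernel which source address it would pick for a given destination
    $ ip route get 10.2.2.42
    10.2.2.42 via 10.64.0.1 dev eno1 src 10.64.0.55
        cache

    # pin an explicit preferred source address on the route with the "src" hint
    $ sudo ip route change 10.2.2.0/24 via 10.64.0.1 dev eno1 src 10.64.0.55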
[16:13:32] akosiaris: I like the self-answer, nice
[16:13:34] very meta
[16:13:51] elukey: 😉
[16:14:32] We'll be updating sessionstore in a few minutes, currently on staging and looking okay. Starting with codfw
[16:15:16] akosiaris: there is a (sort-of) famous presenter/journalist in Italy that always starts his interviews with "Please ask yourself a question and answer it"
[16:15:34] it reminded me of that
[16:15:36] :D
[16:23:29] elukey: will it also remind you of him if I answer it?
[16:23:33] cause the answer is...
[16:23:35] deviceRouteSourceAddress: IPv4 address to set as the source hint for routes programmed by Felix. When not set, the source address for local traffic from host to workload will be determined by the kernel.
[16:23:42] https://projectcalico.docs.tigera.io/reference/resources/felixconfig
[16:24:04] the k8s probe API hasn't been changed and I don't see anything in the kubelet that could help
[16:24:16] but this will supposedly set the src parameter
[16:24:49] and the env var appears to be FELIX_DEVICEROUTESOURCEADDRESS
[16:25:21] merged in https://github.com/projectcalico/calico/pull/2779/commits/6bdd302a8dd20506a1c85035dccc3cf252ccaffc
[16:25:26] so.. Aug 7, 2019
[16:25:36] since 3.0.12
[16:27:23] the big issue being ... how do we set it in admin_ng.d to the current IP of the node...
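(One possible shape for that: calico-node reads FELIX_* variables as config overrides, and the Kubernetes downward API can inject each node's own IP into the pod, which would avoid having to template a per-node value in admin_ng. A sketch only, not necessarily what the eventual patch does.)

    # in the calico-node daemonset spec, as an environment override for Felix:
    env:
      - name: FELIX_DEVICEROUTESOURCEADDRESS
        valueFrom:
          fieldRef:
            # the node's primary IP, injected per-pod by the kubelet
            fieldPath: status.hostIP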
[16:50:47] <_joe_> akosiaris: uh, there was a trick IIRC
[17:01:50] <_joe_> not sure it can work in this way though; I remember I figured out a way to pass the node IP to containers to connect to a local daemonset
[17:04:55] niah, that's a different level, that wouldn't work here
[17:08:55] just a heads up: kask v1.0.10 has been rolled out to sessionstore. Everything looks good to me, but if anyone notices anything out of sorts, please let me know.
[17:09:04] <_joe_> uhm, isn't felix part of the daemonset?
[17:09:19] <_joe_> urandom: are you going to update echostore too?
[17:09:44] this should fix the recent incident (https://wikitech.wikimedia.org/wiki/Incidents/2022-09-15_sessionstore_quorum_issues), and the past one (https://phabricator.wikimedia.org/T253244)
[17:09:52] _joe_: yup!
[17:15:28] _joe_: sigh, you are correct, I had the wrong mental model. I think it will work, let me try a patch real quick
[17:16:39] do you guys ever use this form to search using "PQL"? https://puppetboard.wikimedia.org/query I am trying to follow some PQL examples but seem to just get "Bad Request - The browser (or proxy) sent a request that this server could not understand." on submit
[17:19:02] what I want is "list of nodes using class foo::bar", just like the compiler picks them.. but the full list, not just one of each group / regex in site
[17:19:50] mutante: Can cumin do that?
[17:20:07] mutante: i think we explicitly disable that endpoint and the puppetboard error handling is just bad
[17:20:09] btullis: oh, right, yes, it totally can :) thanks
[17:20:16] mutante: 1) it's quicker with a cumin query, 2) IIRC on the UI we disable that because it could expose secrets
[17:20:24] jbond: I was almost guessing it might be on purpose
[17:20:28] you have to use one of the available endpoints from the dropdown
[17:20:28] i generally just create snippets in my home dir on puppetdb1002
[17:20:35] volans: ACK, makes sense. I will use cumin
[17:20:37] and use the usual puppetdb query syntax
[17:20:49] thanks all, alright
[17:20:55] but yes, in this case cumin C:foo::bar is better
[17:21:15] yeah, the puppetboard query UX is ... meh
[17:21:21] and if you need the expanded list
[17:21:32] mutante: if you do want to have a play though: https://wikitech.wikimedia.org/wiki/Puppet/PQL
[17:21:38] now my _actual_ issue is that the thing is not a class but a defined type, but that's separate :)
[17:22:01] jbond: thanks, ack
[17:22:16] mutante: sudo cumin --no-colors 'A:cumin' 2> /dev/null | nodeset -e -S '\n'
[17:22:22] if you want one per line
[17:22:38] mutante: for defines it's just R:my::define
[17:23:08] see https://wikitech.wikimedia.org/wiki/Cumin#PuppetDB_host_selection
[17:23:24] volans: yay, it's perfect like that :) works, thank you
[17:25:02] reading "man nodeset". interesting
[17:27:00] volans: nice, i had not used nodeset like that https://wikitech.wikimedia.org/w/index.php?title=Cumin&type=revision&diff=2033262&oldid=2011801
[17:27:25] <3 thanks
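(If anyone does end up playing with PQL for this later: a rough PQL equivalent of cumin's C:foo::bar and R:my::define, runnable against the PuppetDB query API, e.g. from a snippet on puppetdb1002. Host, port, and the class/define names below are illustrative.)

    # nodes whose catalog includes class foo::bar (PuppetDB capitalizes class titles)
    $ curl -G http://localhost:8080/pdb/query/v4 \
        --data-urlencode 'query=nodes[certname] { resources { type = "Class" and title = "Foo::Bar" } }'

    # same idea for a defined type: match on the resource type instead
    $ curl -G http://localhost:8080/pdb/query/v4 \
        --data-urlencode 'query=nodes[certname] { resources { type = "My::Define" } }'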
[17:29:09] _joe_: jayme: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/859586
[17:29:22] seems like I might not have fully lost it yet...
[17:29:57] <_joe_> akosiaris: noice
[17:30:31] sigh, ofc I lost it... I forgot to add Chart.yaml (which I bumped)...
[17:34:20] <_joe_> lol
[17:34:29] <_joe_> well, you're not merging this now, are you
[17:42:07] yolo :P
[17:42:12] * akosiaris just joking
[17:43:15] <_joe_> akosiaris: btw, you said since 3.0.12
[17:43:29] <_joe_> but I had found https://github.com/kubernetes-sigs/kubespray/pull/6508/files#diff-196c324811f70dc51f059c7ecbb861689931338227571a1e1f40df179431b749R262
[17:44:03] <_joe_> they say 3.9.0, but probably it's just one of the fixed versions kubespray uses?
[17:47:22] that's weird, where did they come up with that
[17:48:47] ah
[17:48:49] hmmm
[17:48:51] so
[17:49:07] https://github.com/projectcalico/calico/pull/2779/commits/6bdd302a8dd20506a1c85035dccc3cf252ccaffc says 3.0.12
[17:49:21] but https://github.com/projectcalico/felix/pull/2037/commits/2d1a0c3283a13615ff1e070f2f1947fb00e8dd4e says 3.9.0
[17:49:41] and https://github.com/projectcalico/libcalico-go/pull/1097/commits/b0fbf124506413c05481e1bd286a6db7400ae0d9 says 3.10.0
[17:49:47] aha
[17:52:39] <_joe_> loll
[17:53:17] <_joe_> 3.rand()
[18:43:27] urandom: herron: I am moving high-traffic2 LVS to lvs4009 from lvs4006
[18:43:35] if something breaks, that's on me, but I thought you should know. thanks
[18:43:44] sukhe: ack, thx for the heads up
[19:28:17] herron: all done :)
[19:32:33] (⌐■_■)