[07:35:36] <elukey>	 hi folks
[07:36:01] <elukey>	 just sent another email to Singtel, the ulsfo - eqsin transport is still flapping afaics
[07:37:16] <XioNoX>	 elukey: thanks, they followed up with me over the weekend, I'll check what the latest is
[07:37:57] <elukey>	 XioNoX: ah snap ok, I quickly checked on cr4-ulsfo and the bfd session was flapping so I thought to just ping them again
[07:38:21] <XioNoX>	 elukey: it's fine, don't worry! thanks for keeping an eye on it
[07:58:51] <XioNoX>	 elukey: replied and CCed noc@ on the ticket
[08:23:53] <XioNoX>	 I updated the Singtel NOC contacts on Netbox as well
[08:28:13] <elukey>	 super thanks
[12:19:16] <_joe_>	 if you had to detect from within some code if you're running inside a docker container or not, what would you do?
[12:22:00] <btullis>	 _joe_: I would probably start by looking at virt-what: https://packages.debian.org/bullseye/virt-what - This detects a docker environment as well as hypervisors.
[12:22:24] <_joe_>	 btullis: yeah I should look at what they do
[12:29:25] <klausman>	 Best guess: looking closely at DMI info
[12:30:52] <btullis>	 virt-what just seems to be testing for the presence of a `/root/.dockerinit` file: https://salsa.debian.org/libvirt-team/virt-what/-/blob/debian/sid/virt-what.in#L340
[12:31:27] <moritzm>	 and just had a look at what systemd-detect-virt does; it tests for the presence of /.dockerenv
[12:31:29] <btullis>	 Not sure if that's still reliable these days though. https://superuser.com/a/1021925/38404
[12:35:20] <_joe_>	 it's not
[12:48:52] <volans>	 yeah it seems that people in addition look for the names in /proc/1/cgroup like https://stackoverflow.com/a/69860299
[12:53:55] <_joe_>	 volans: that's implementation-dependent
[12:54:25] <volans>	 great
[12:54:30] <_joe_>	 but yeah, this is mostly an academic question, I'll just detect that the user we're running as is "wikimedia" and that we're running in "/wikimedia" as cwd
[13:14:39] <paravoid>	 fyi on podman+crun, virt-what returns "lxc", while systemd-detect-virt returns "podman"
[13:15:16] <paravoid>	 (I was toying with that a few months ago)
[13:17:31] <paravoid>	 (that's with bullseye)
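A rough sketch of the heuristics discussed above, assuming Python is the language the check would live in. It looks for the `/.dockerenv` marker and for runtime names in `/proc/1/cgroup`, and separately implements the user/cwd check _joe_ settled on. The function names, the `root` parameter and the list of cgroup substrings are illustrative assumptions; as noted in the discussion, none of these signals is reliable across runtimes.

```python
import getpass
import os
from pathlib import Path


def looks_like_container(root: Path = Path("/")) -> bool:
    """Best-effort guess; a False result does not prove we are on bare metal."""
    # Marker file that systemd-detect-virt was said to test for.
    if (root / ".dockerenv").exists():
        return True
    # Fallback: runtime names in PID 1's cgroup listing (implementation-dependent).
    try:
        text = (root / "proc" / "1" / "cgroup").read_text()
    except OSError:
        return False
    # Hypothetical substring list, chosen only for illustration.
    return any(name in text for name in ("docker", "kubepods", "lxc"))


def is_wikimedia_runtime(user=None, cwd=None) -> bool:
    """Application-specific check: user is "wikimedia" and cwd is "/wikimedia"."""
    user = getpass.getuser() if user is None else user
    cwd = os.getcwd() if cwd is None else cwd
    return user == "wikimedia" and cwd == "/wikimedia"
```

The second function trades generality for predictability: it only answers "are we in our own image?", which is the question that actually needed answering here.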
[14:45:14] <Emperor>	 Am I right that if I want to make /path/to/somewhere/file.ext in puppet I have to make the directory /path/to/somewhere myself (and likewise recursively up to / if the dirs won't be there otherwise)?
[14:46:03] <volans>	 Emperor: see mkdir_p ;)
[14:46:09] <volans>	 does what it sunds
[14:46:11] <volans>	 *sounds
[14:46:18] <taavi>	 correct, although there's a helper wmflib::dir::mkdir_p that might be helpful
[14:48:09] <Emperor>	 thanks :)
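To illustrate the point about parent directories: every ancestor of the file has to be managed somewhere before the file itself can be created. A minimal Python sketch (not Puppet, and not the actual `wmflib::dir::mkdir_p` implementation) that lists the directories which would need to exist for a given path; `parent_dirs` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def parent_dirs(path: str) -> list[str]:
    """Return every ancestor directory of path, shallowest first, excluding '/'."""
    # .parents yields the deepest directory first and ends with '/'.
    parents = PurePosixPath(path).parents
    return [str(p) for p in reversed(parents) if str(p) != "/"]
```

For `/path/to/somewhere/file.ext` this yields `/path`, `/path/to` and `/path/to/somewhere`, in the order they would have to be created.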
[16:26:52] <inflatador>	 miss you razzi ;)
[16:27:20] <razzi>	 :) good to meet up in the sre meeting
[16:29:17] <razzi>	 Does anybody know why this haproxy command would be giving permission denied: `echo 'set server clouddb1018.eqiad.wmnet state drain' | sudo socat /run/haproxy/haproxy.sock stdio` gives `Permission denied`
[16:30:11] <volans>	 razzi: if that's related to the cookbook, it does run as root currently, so no need to add sudo if that helps
[16:30:42] <razzi>	 volans: ah yeah I was just testing the command manually, the cookbook has no sudo
[16:31:36] <elukey>	 on what node is the command being executed ?
[16:31:40] <elukey>	 I mean the cumin target
[16:32:06] <razzi>	 elukey: dbproxy1018.eqiad.wmnet
[16:33:00] <razzi>	 I already ok'd the updating of the views from data-persistence so now's a fine time to try to depool it manually, then update the cookbook
[16:35:06] <elukey>	 was the command used before? I mean, is it in any guide used by the cloud team?
[16:36:59] <razzi>	 not yet, it is supposed to be replacing the manual process of editing hieradata/hosts/dbproxy1018.yaml and reloading haproxy
[16:37:49] <elukey>	 but was it tested somewhere with haproxy? Or is this the first time that you are running it? (trying to get the context)
[16:38:19] <elukey>	 also I have never used dbproxy1018, what is the blast radius if it goes down or haproxy gets into a weird state?
[16:39:16] <elukey>	 (for example, netcat or similar tools were tested instead of socat? etc..)
[16:40:25] <razzi>	 This is the first run, was perhaps overly optimistic to run the whole cookbook
[16:40:47] <elukey>	 I am very ignorant with socat, but I am puzzled by the stdio at the end.. this is why I asked if it ran somewhere before prod :)
[16:40:57] <volans>	 the command was never tested on a test instance of haproxy?!?
[16:41:44] <razzi>	 it was not, this was my oversight. Fortunately, nothing happened, the host that is supposed to be depooled is still pooled
[16:42:43] <volans>	 ack, but please make sure to test all the untested commands in some test/sandbox environment in isolation, so that you're sure they are correct
[16:43:26] <razzi>	 If haproxy on dbproxy1018 gets into a weird state, queries to the public wikireplicas would fail, so toolforge bots and quarry (https://quarry.wmcloud.org/) would not work
[16:43:27] <volans>	 sorry gotta go afk now for a bit
[16:44:02] <razzi>	 sg, nothing urgent here
[16:45:53] <elukey>	 razzi: I'd suggest to go with the puppet version of the procedure, that is safer (and IIUC already battle tested) and then experiment with haproxy on a local set up or similar
[16:45:59] <ottomata>	 anybody have any experience making grafana dashboards for ephemeral jobs? 
[16:46:23] <elukey>	 (I mean for depooling clouddb1018)
[16:46:25] <ottomata>	 i finally have metrics in prometheus via push gateway, but making dashboards is going to be a little weird because the value in pushgateway only changes after the next job run
[16:46:36] <razzi>	 sounds good elukey, I'll do the manual way for this round of updates
[16:46:53] <elukey>	 razzi: ack let us know if you need help
[16:47:53] <elukey>	 razzi: lemme try one thing
[16:49:02] <_joe_>	 sorry I just read the backlog
[16:49:33] <_joe_>	 if we want to make the pooled/depooled state of haproxy backends programmable, I'd try to think if there is a way to do it using conftool
[16:52:34] <elukey>	 +1 --^
[16:52:50] <elukey>	 razzi: the command is wrong, I see on various guides that things like `echo "show stat" | socat unix-connect:/run/haproxy/haproxy.sock stdio` work
[16:54:01] <razzi>	 elukey: `echo 'show stat' | sudo socat /run/haproxy/haproxy.sock stdio` works as well; it's not the socat part
[16:54:49] <elukey>	 good signs that testing is needed then :)
[16:54:53] <elukey>	 let's go with the puppet way
[16:56:04] <_joe_>	 in general, we can't keep the pooled/depooled state of a backend solely in the proxy; that easily introduces inconsistencies and more importantly the state might not be persisted across restarts
[16:57:36] <elukey>	 it is a very good point that needs some follow up (the state is already in puppet and can be changed, so conftool seems a very nice follow up)
[16:57:39] <elukey>	 razzi: --^
[16:58:07] <razzi>	 indeed, it sounds like conftool is the way to go
[16:58:08] <razzi>	 Is there some environment that already has a test haproxy, or should I go about setting it up in cloud services?
[16:59:12] <_joe_>	 razzi: ofc going with conftool means we have to see how to integrate it with haproxy, that might take some elbow grease I fear :)
[17:01:23] <elukey>	 razzi: for the current issue - haproxy's unix socket doesn't accept unauthenticated admin commands unless explicitly told to in the config
[17:02:21] <razzi>	 got it elukey 
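For testing commands like the ones above against a sandbox haproxy, here is a minimal Python sketch of what the `echo ... | socat ... stdio` pipeline does: connect to the Unix socket, send one newline-terminated command, and read until the peer closes the connection. The function name `haproxy_command` is hypothetical and the default path is the one quoted in the chat. It assumes haproxy's non-interactive mode (one command per connection) and that any refusal, such as `Permission denied`, comes back as response text rather than as an exception. In haproxy this kind of refusal is typically governed by the access level configured on the `stats socket` line (`set server` needs `level admin`).

```python
import socket


def haproxy_command(command: str, sock_path: str = "/run/haproxy/haproxy.sock") -> str:
    """Send a single command to a haproxy-style Unix socket and return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        # Commands are newline-terminated, like the output of `echo`.
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

Running `haproxy_command("show stat", "/path/to/test.sock")` against a throwaway instance first, and only then trying a state-changing command, keeps the experiment away from the production proxy.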
[17:03:20] <razzi>	 Thanks _joe_ volans elukey for the input, back to the drawing board I go, this time with more information
[17:03:20] <razzi>	 But first, I'll finish off this round of updates the manual way
[17:13:10] <arturo>	 razzi, elukey: thanks for loving the wiki-replicas <3
[18:30:03] <ottomata>	 answering my q earlier about ephemeral job dashboards in grafana: TIL about State timeline visualizations!
[18:30:18] <ottomata>	 they only change the viz when the value itself changes!
[18:58:34] <inflatador>	 I have one last task on my onboarding: " Add to Exim mail aliases (root via private.git:modules/privateexim/files/wikimedia.org)" - any idea how I can do this?
[19:05:40] <rzl>	 inflatador: have you made changes in the private puppet repo before?
[19:07:07] <inflatador>	 rzl Indeed! 
[19:07:17] <rzl>	 oh good! this will be easy then :)
[19:07:47] <rzl>	 "modules/privateexim/files/wikimedia.org" is the path of a file in private puppet -- in that file, the line starting "root:" is the config for the alias you're looking for
[19:08:04] <rzl>	 as long as you're comfy editing private puppet, all you have to do is add yourself there
[19:08:19] <inflatador>	 Ah, thanks for the direction! Will give it a shot
[19:30:35] <rzl>	 inflatador: lgtm :)
[19:31:53] <inflatador>	 rzl awesome, I can now close my onboarding ticket (let's not think too much about why it's taken 3 months ;P )