[09:02:23] morning
[09:06:37] o/
[09:23:04] taavi: I'm adding a new k8s control node
[09:23:07] soon
[09:25:39] ack
[09:53:55] I'm removing the old control node now
[09:58:32] ouch, I just pushed directly to the wmcs-cookbooks repo
[09:58:58] https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/wmcs-cookbooks/+/065ba61a1cfb714925471475eaea18501906b24c%5E%21/#F0
[10:01:05] hmm I wonder how that is even possible
[10:01:08] I guess it's this setting that allows it
[10:01:10] https://usercontent.irccloud-cdn.com/file/X99MT9eU/image.png
[10:01:33] from https://gerrit.wikimedia.org/r/admin/repos/cloud/wmcs-cookbooks,access
[10:02:40] yeah. I removed that, let's see if anything breaks
[10:03:02] let me see if I can push again
[10:03:38] correctly fails now https://www.irccloud.com/pastebin/LXXHNyui/
[10:45:23] taavi: this time I also had to do the static pod dancing by hand. Do you think it is worth automating?
[10:45:45] at least writing a separate cookbook might be useful
[10:46:00] ok, let me create a phab ticket
[10:51:55] T358476
[10:51:56] T358476: toolforge k8s: some static pods needs manual restart - https://phabricator.wikimedia.org/T358476
[12:15:57] taavi: if I recall correctly, the cookbooks/wmcs/toolforge/k8s/kubeadm_certs_renew.py cookbook is now obsolete because there is auto renewal?
[12:16:21] arturo: that cookbook is needed in case we take too long between kubernetes upgrades
[12:16:30] (so, hopefully never, but useful to keep around just in case)
[12:16:46] ack
[12:16:58] then I will refactor the static pod restart functions
[14:56:01] oooh you can now select your own color in etherpad
[15:02:09] hmm I think I could change my color in the past too? has something changed?
[15:06:20] I certainly had not noticed that before
[15:39:26] I just wrote a badly-edited brain dump of my thoughts about toolforge + s3/swift on T358496. If anyone (looks at bd808 and taavi) has already written something about this topic, please let me know and we can merge.
[15:39:27] T358496: Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496
[15:40:05] I'm also hoping someone will jump in and respond to my "Is it possible/practical to make per-container credentials?" with "yes, and here's how"
[15:40:26] now... breakfast
[15:54:44] andrewbogott: did you consider the option of having a second radosgw instance where authentication is not tied to openstack?
[16:00:37] I didn't, but I also don't think that would be very hard.
[16:01:01] and a dedicated ceph pool
[16:01:05] sounds interesting
[16:01:24] could they share the same ingress port?
[16:01:30] yeah, I think a separate radosgw instance would imply a different pool (as far as I know)
[16:01:54] arturo: we can certainly do host-based http routing with haproxy, that's not a problem
[16:02:08] what would be the new fqdn?
[16:02:09] wouldn't we want it to be a different endpoint anyway?
[16:02:19] Oh, I see what you mean
[16:02:49] yeah, we would want it either on some subpath of object.eqiad1.wikimediacloud.org or we could invent a new subdomain
[16:02:56] anyhow, that seems a relatively minor detail to me
[16:03:13] or that service domain you were thinking of for toolforge
[16:07:15] highlight color has been user-selectable on etherpad as long as I've been using it, but the UX for discovering that was maybe not great?
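(Editor's note: the "static pod dancing" at 10:45 and the cookbook idea filed as T358476 refer to restarting kubeadm static pods, which kubelet manages directly from manifest files on disk rather than through the API server. Below is a minimal sketch of what such a cookbook helper might do, assuming the standard kubeadm manifest path; the ssh-based run() transport, host names, and settle time are hypothetical illustrations, not the actual wmcs-cookbooks API.)

```python
"""Hypothetical sketch of automating a static pod restart (T358476).

Assumes the standard kubeadm layout where kubelet watches
/etc/kubernetes/manifests; all names here are illustrative.
"""
import shlex
import subprocess
import time

MANIFEST_DIR = "/etc/kubernetes/manifests"


def run(host: str, command: str) -> None:
    """Run a command on a control node over ssh (placeholder transport)."""
    subprocess.run(["ssh", host, command], check=True)


def restart_static_pod(host: str, name: str, settle_seconds: int = 20) -> None:
    """Restart a static pod by moving its manifest out and back in.

    kubelet notices the manifest disappearing and stops the pod;
    moving the file back makes it recreate the pod from scratch.
    """
    src = f"{MANIFEST_DIR}/{name}.yaml"
    tmp = f"/tmp/{name}.yaml"
    run(host, f"mv {shlex.quote(src)} {shlex.quote(tmp)}")
    time.sleep(settle_seconds)  # give kubelet time to tear the pod down
    run(host, f"mv {shlex.quote(tmp)} {shlex.quote(src)}")


# e.g. restart_static_pod("tools-k8s-control-7", "kube-apiserver")
```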
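(Editor's note: on taavi's 15:54 suggestion of a second radosgw instance whose authentication is not tied to OpenStack, radosgw can mint native S3 users with the radosgw-admin CLI, which would also be one answer to the 15:40 question about per-tool/per-container credentials. A hedged sketch follows; it assumes a host with Ceph admin access, and the tool-<name> uid scheme is made up for illustration.)

```python
"""Hypothetical sketch of minting per-tool S3 credentials on a radosgw
instance decoupled from OpenStack auth. The uid naming is an assumption."""
import json
import subprocess


def create_tool_s3_user(tool: str) -> dict:
    """Create a radosgw user for a tool and return its S3 key pair."""
    out = subprocess.run(
        [
            "radosgw-admin", "user", "create",
            f"--uid=tool-{tool}",
            f"--display-name=Toolforge tool {tool}",
        ],
        check=True, capture_output=True, text=True,
    ).stdout
    user = json.loads(out)
    # radosgw-admin generates one S3 key pair for a new user by default
    key = user["keys"][0]
    return {"access_key": key["access_key"], "secret_key": key["secret_key"]}
```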
[16:09:20] T306039
[16:09:23] T306039: Decision request - Toolforge external infrastructure domain usage - https://phabricator.wikimedia.org/T306039
[16:12:16] andrewbogott: I haven't really thought about the "how" of it, but per-tool access to storage buckets with the option of making a bucket read-only to the public seems ideal. There is probably a good argument to be made for >1 bucket per tool as well, to allow separating internal and user-facing usage.
[17:01:02] * arturo offline
[17:07:38] * andrewbogott adds 'toolforge-specific rados server' option to that ticket
[17:11:13] * dcaro off
[18:03:54] fyi all, I'm going to do the designate -> cloudcontrol move tomorrow around 17:00 UTC. I don't expect it to affect any running services, but it might cause some unexpected alerts during the transition.
[18:50:31] * bd808 lunch
[18:59:02] Rook: I see that paws-prometheus-1.paws.eqiad1.wikimedia.cloud has been removed, is there a different host that I should replace it with? (Seeing this on metricsinfra-alertmanager-1)
[18:59:27] It's inside the paws k8s cluster now. So no?
[19:06:42] hmmm
[19:06:54] does that mean we had alerting before and now we don't? Or was that vestigial anyway?
[19:08:39] oh actually that's under 'profile::wmcs::metricsinfra::alertmanager::project_proxy::trusted_hosts:' so probably if it's working then we don't need that anymore...
[19:08:43] * andrewbogott reads a bit more puppet code
[19:18:47] rook, at your leisure: T358519
[19:18:47] T358519: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519
[22:08:08] * bd808 walk
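(Editor's note: a minimal sketch of the per-tool public-read bucket bd808 describes at 16:12, using boto3 against an S3-compatible radosgw endpoint. The endpoint URL, bucket naming scheme, and credential placeholders are assumptions, not an agreed design; radosgw does support AWS-style bucket policies of this shape.)

```python
"""Hypothetical sketch: a per-tool bucket that is world-readable but
writable only by the tool's own credentials. All names are assumptions."""
import json

import boto3


def make_public_read_bucket(tool: str, suffix: str = "public") -> str:
    """Create a tool bucket and attach a public-read bucket policy."""
    s3 = boto3.client(
        "s3",
        endpoint_url="https://object.eqiad1.wikimediacloud.org",  # assumption
        aws_access_key_id="...",      # per-tool credentials (placeholder)
        aws_secret_access_key="...",  # placeholder
    )
    bucket = f"tool-{tool}-{suffix}"  # hypothetical naming scheme
    s3.create_bucket(Bucket=bucket)
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{bucket}/*"],
        }],
    }
    s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
    return bucket
```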