[08:55:17] with the network mystery solved I'm going to replace some more k8s nodes
[08:59:17] taavi: yes please :)
[08:59:33] morning
[10:14:57] why do I get gitlab notifications for random repositories?
[10:15:58] I guess I had notifications activated for the whole toolforge-repos group
[10:40:17] https://thenewstack.io/end-of-an-era-weaveworks-closes-shop-amid-cloud-native-turbulence/
[10:40:30] I wonder what will happen with fluxcd
[11:43:00] * topranks waves at arturo :)
[11:43:12] topranks: hey! :-)
[11:43:32] how are things? I hope life post-wmf is treating you well :)
[11:43:58] taavi: I opened T356986 in error it seems, forgot we had T350132
[11:43:59] T356986: Improve cloudgw filter between VM instances and cloud-private - https://phabricator.wikimedia.org/T356986
[11:43:59] T350132: Restrict traffic from instances to private IPs on cloudgw level - https://phabricator.wikimedia.org/T350132
[11:44:14] unsure which to keep, maybe my new one as there is more detail
[11:44:19] topranks: I'm happy to be back here in the team :-)
[11:45:37] oh wow
[11:45:40] arturo: you spoiled my plan to confuse topranks by //not// telling him that you're back at WMF :D
[11:45:44] I was off last week, I missed the email :P
[11:46:12] Are we confident Arturo is a good fit to replace Arturo??
[11:46:16] heh
[11:46:29] well in that case welcome back!
[11:46:44] thank you! :-P
[11:46:53] I'm looking forward to continuing some of the projects we left open
[11:47:02] apparently balloons had to do some paperwork to prove that we are indeed confident :)
[11:47:15] I assumed you were just freelancing, j.bond raised his head to help with something the other day too :P
[11:47:21] haha
[11:47:25] hehe
[11:48:06] well I guess my other comment is of interest to both you and taavi
[11:48:21] in that I still think doing the "no nat" rule on the cloud-private supernet makes sense
[11:48:43] reason being - I'm not convinced that two sets, cloud-vips and cloud-hosts, are granular enough for the filtering requirement
[11:57:35] ok
[11:57:44] let's rethink it, then
[11:58:02] I'm open to discussion, it's not black and white
[12:04:45] hmmm
[12:05:07] let's do the supernet now, and figure out what we need for firewalling once we actually get to that part?
[12:06:17] that works! we can discuss the filtering stuff on the task
[12:08:36] ok
[12:08:41] I will update my patch
[12:12:41] topranks: https://phabricator.wikimedia.org/T356986#9524488 what traffic to 10.x do we have remaining? I thought we had fixed all of that
[12:13:21] I actually don't know, it was mentioned on T350132 which prompted me to add that comment
[12:13:22] T350132: Restrict traffic from instances to private IPs on cloudgw level - https://phabricator.wikimedia.org/T350132
[12:13:35] But I think you're right, we probably have none
[12:27:00] * dcaro lunch
[12:59:27] would anyone mind if labtesthorizon was unreachable for some time? I want to try reimaging cloudweb2002-dev to bookworm, T356966
[12:59:27] T356966: Upgrade cloudweb hosts to Bullseye - https://phabricator.wikimedia.org/T356966
[13:03:28] taavi: not me
[13:32:32] arturo: topranks: let's try to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/998412? I guess we should disable puppet on eqiad and roll it out to codfw first
[13:33:42] taavi: yep, that sounds like a good plan to me
[13:34:55] ok, disabled and merging
[13:35:55] running puppet on codfw
[13:38:30] cool
[13:40:36] seems to work, do you want to test anything?
[13:41:07] yeah, looks ok to me
[13:41:11] ip daddr @cloud_public_v4_set counter packets 75 bytes 4500 accept comment "cloud_public_v4"
[13:41:11] ip daddr @cloud_private_v4_set counter packets 943 bytes 74386 accept comment "cloud_private_v4"
[13:42:34] continuing on 1001
[13:43:15] https://www.irccloud.com/pastebin/rMbQdaRk/
[13:43:33] I guess the only fear is something in cloud-private land that doesn't like the VM's original IP
[13:43:40] but in terms of the cloudgw everything is working like it should
[13:44:10] arturo: question, is it possible to reset the counters for nft rules?
[13:45:23] actually nevermind, "nft reset counters" worked for me
[13:45:42] "nft reset counters table inet cloudgw" threw no error but didn't reset anything
[13:45:52] deployed to all nodes
[13:45:55] everything looks good to me
[13:50:51] topranks: `sudo systemctl restart nftables.service` should reset the counters
[13:51:12] that sounds like a scary command to run on a live router
[13:51:22] I didn't want to be so disruptive
[13:51:39] heh, it just reloads the config file
[13:51:49] similar to what puppet does on every change
[13:52:03] yep, with nft I think it's fairly safe
[13:52:19] fair
[13:52:37] but, yeah, better to be safe if `nft reset counters` works
[13:52:53] I found myself sometimes fighting a ruleset that won't load because of a typo or whatever
[13:53:09] yeah, even if it's safe that command feels "safer" to my head :P
[13:53:51] I've had the same, yeah
[14:00:08] * arturo food break, back later
[14:01:01] moritzm: hey, seems like our current wikitech puppetization installs both php-ldap and php7.4-ldap which are different packages. which of these is the correct one?
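The cloudgw counter rules quoted above, together with the reset behaviour discussed, can be sketched as a minimal nftables config fragment. Table, set, and chain details beyond what the chat shows (the set definitions, the hook, the policy) are illustrative assumptions, not the actual cloudgw ruleset:

```
# Minimal sketch of the cloudgw counter rules quoted in the chat.
# Set contents and chain details are hypothetical.
table inet cloudgw {
    set cloud_public_v4_set {
        type ipv4_addr
        flags interval
    }
    set cloud_private_v4_set {
        type ipv4_addr
        flags interval
    }
    chain forward {
        type filter hook forward priority 0; policy accept;
        ip daddr @cloud_public_v4_set counter accept comment "cloud_public_v4"
        ip daddr @cloud_private_v4_set counter accept comment "cloud_private_v4"
    }
}

# Zero all counters in place, without touching the ruleset:
#   nft reset counters
# The table-scoped variant discussed above ("threw no error but
# didn't reset anything" on the version in use):
#   nft reset counters table inet cloudgw
# Reloading the whole config also zeroes them, but replaces every rule:
#   systemctl restart nftables.service
```

The appeal of `nft reset counters` over a service restart is that it only touches counter state; a restart re-parses the config file and atomically replaces the live ruleset, which is safe when the file is valid but briefly couples counter resets to a full reload.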
[14:06:37] php-ldap pulls in the default phpX.Y-ldap package, it's built from php-defaults, so the php-ldap package will install the php7.4-ldap package built from the php7.4 source package
[14:06:48] which class is that? I can have a closer look
[14:08:14] so yeah, I think https://gerrit.wikimedia.org/r/c/operations/puppet/+/998921 will do the right thing
[14:08:44] we already have `php::extension { 'ldap': }` in profile::openstack::base::wikitech::web
[14:08:55] in addition to that patch you just found
[14:15:02] then we can just as well turn 998921 into a patch that removes it there
[14:15:11] yep, just did
[14:16:27] +1d
[14:40:10] andrewbogott: seems like the current containerized horizon relies on /etc/openstack-dashboard/default_policies (which is a leftover from the scap setup) existing
[14:44:34] ah, just needs https://gerrit.wikimedia.org/r/c/operations/puppet/+/998938 I believe
[15:16:23] do we know if we can have a k8s network policy via calico that allows us to introduce network usage quotas for tools?
[15:16:50] for example: limit the number of open connections
[15:41:47] * taavi looking for reviews for https://gerrit.wikimedia.org/r/c/operations/puppet/+/998926 https://gerrit.wikimedia.org/r/c/operations/puppet/+/998938
[15:42:04] arturo: I have not looked, but if there is something it's definitely worth considering
[15:45:49] taavi: I can barely review the patches without a deep dive. I'll let andrew review them later
[15:46:29] filtering on the number of open connections is definitely something netfilter/nftables allows
[15:49:04] I think the question is more how you get calico/k8s/whatever to apply that filter
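At the netfilter level, the connection-count limiting mentioned above maps to nftables' `ct count` expression. As far as I know, plain Kubernetes NetworkPolicy (and Calico's policy CRDs) expose selectors, ports, and actions but no conntrack-count primitive, so a quota like this would have to be generated outside the k8s policy layer. Table name, address, and threshold here are purely illustrative:

```
# Hypothetical per-tool connection quota expressed directly in
# nftables; nothing here is generated by calico.
table inet toolquota {
    chain forward {
        type filter hook forward priority 0; policy accept;
        # reject new connections from this (example) pod IP once it
        # holds more than 100 concurrent conntrack entries
        ip saddr 192.0.2.10 ct count over 100 counter reject
    }
}
```

`ct count` piggybacks on the kernel's connlimit support, so it counts live conntrack entries rather than historical connections; closed connections free up quota as they expire.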
[17:12:08] * arturo offline
[17:20:50] taavi: I don't know who else may scatter things across $HOMEs, but I certainly do. I would actually appreciate it if either notice or automatic backup and restore of $HOMEs was part of our general playbook for reimaging. I very much appreciate it when mutante just takes care of this sort of thing when he reimages a prod host like mwmaint*.
[19:24:58] * bd808 late lunch
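The backup-before-reimage step bd808 describes can be sketched as a small shell helper. This is not part of any actual playbook; the function name and paths are illustrative:

```shell
# backup_home SRC DEST: archive the SRC home directory into DEST and
# print the path of the resulting tarball. A minimal sketch of "save
# $HOME before the host is reimaged", to be run from (or copied to) a
# machine that survives the reimage.
backup_home() {
    src="$1"
    dest="$2"
    stamp="$(date +%Y%m%d)"
    mkdir -p "$dest"
    # tar preserves permissions and dotfiles; skip caches to keep it small
    tar -czf "$dest/home-$stamp.tar.gz" \
        --exclude='.cache' \
        -C "$(dirname "$src")" "$(basename "$src")"
    printf '%s\n' "$dest/home-$stamp.tar.gz"
}
```

Restoring after the reimage is the reverse: `tar -xzf` the archive in the parent of the new home directory. For copying straight to another host, `rsync -a` of the home directory would do the same job without the intermediate tarball.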