[09:40:44] dhinus: hello, ping me if you have a few moments to talk about tofu-infra pending MRs
[09:54:32] arturo: hello! I'm currently reviewing raymond's patch for jobs-api, I'll get to your one after that
[09:54:41] I had a quick look yesterday and it seems ok
[09:54:46] but I want to double check it
[10:28:20] I'll review https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/148 meanwhile
[10:48:53] Raymond_Ndibe: please merge that one ^^^
[11:32:43] arturo: I completed the review of the other jobs-api patch (https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/91)
[11:33:00] I'll have lunch then I'll look at your tofu-infra patch!
[11:33:05] ok
[12:45:32] eqiad1 DNS just flapped?
[12:52:54] I just popped in to ask about those dns emails too, was hoping someone restarted something
[12:55:44] nothing obvious on the logs
[12:57:53] arturo: reviewed and approved the tofu-infra MR
[12:59:51] I am heading back to the eye doctor shortly so taking a sick day. I have an hour or so before I need to go so let me know if there's anything I can look at in the meantime.
[13:00:30] I got Trove working with tofu-infra but magnum is still misbehaving and I don't have a good theory yet.
[13:00:50] ack, thanks! I think it's fine if Magnum stays broken until next week
[13:00:53] andrewbogott: you may be interested in the quick update I made to T374129
[13:00:54] T374129: openstack: consider removing labs-ip-aliaser - https://phabricator.wikimedia.org/T374129
[13:00:56] dhinus: thanks, merging now
[13:03:17] arturo: interesting! Maybe we should add our public IP subnet to the default security group?
[13:03:59] do we have something similar for the internal IPv4 CIDR?
[13:04:19] lol @ "network weirdness" :D
[13:04:56] speaking of which, I've a working theory about what went wrong
[13:05:14] which is the c8/d5 ("spines") were reflecting routes and setting the next-hop to themselves
[13:05:38] arturo: I think we must, in order to allow bastion access? Although I suppose that's limited to port 22.
[13:05:56] I suspect this may have created a routing loop for certain src->dst, depending on if they hit c8 or d5 first (due to load sharing)
[13:06:46] I've modified the policy now, and if possible I'd like to remove the statics again to verify
[13:09:24] topranks: today is probably not the best day to do that since I'll be out and arturo and dhinus usually work short days on Fridays.
[13:09:25] topranks: I'm about to head for lunch, so I won't be able to assist if you do it now
[13:09:45] is monday an option?
[13:09:48] ok np yeah let's do it when people are around
[13:10:13] yep we can go for Monday morning I think
[13:12:01] thanks!
[13:12:20] monday sounds fine thanks!
[13:13:22] dhinus: the flavor refactor and state transition is now completed, including the cleanups, thanks for the assistance!
[13:13:40] * arturo food
[13:20:09] * andrewbogott https://frinkiac.com/meme/S12E05/519227.jpg?b64lines=IE93ISBNeSBleWUhIEknbQogbm90IHN1cHBvc2VkIHRvIGdldAogcHVkZGluZyBpbiBpdCE=
[14:17:13] Hi! I hope everyone is fine. We and QuanStack got hired to help with supporting PAWS and I wanted to ask if someone has some time at hand on Monday or Tuesday to help me running my first PAWS deployment from pawsdev-bastion on codfw1dev.
[14:18:07] hello halloy1441 ! I assume you are also known as atrawog
[14:18:55] Yes I'm. I just need to beef up my nick reclaiming powers :)
[14:20:25] no problem :-)
[14:20:43] I think I will be online and available next monday, on EU office time, to help you with codfw1dev
[14:20:50] That's better :)
[14:21:19] on the other hand, we had already scheduled an operation window next monday during the EU office time
[14:21:48] Excellent, what time will suit you?
[14:22:23] I think we can book some time Tuesday 10:00 UTC ?
[14:24:35] Tuesday 10:00 UTC is fine with me.
[14:24:40] cool
[14:25:20] atrawog: I will be available here in this IRC channel for sync, and we can set up a videochat if required
[14:29:33] I'm out today but nice to see you here atrawog
[14:36:48] Thanks and I'm hopefully getting to the point soon where I can be of help :)
[14:38:18] andrewbogott: so I have checked the default security group rules, and indeed it is only for tcp/22 (ssh). So I'd suggest if additional rules are required, we add them on a project-by-project basis, otherwise any VM with a floating IP in Cloud VPS would bypass all security groups rules across all projects basically
[14:39:39] atrawog: arturo: Tuesday at 10 UTC also works for me, I have some basic knowledge of the PAWS setup
[14:39:47] ack
[14:40:13] one small issue is that we're having some problems with magnum, and I'm not sure we can test the full workflow where a new cluster is deployed
[14:40:32] I know andrewbogott is looking at that so maybe by Tuesday everything will be working fine
[14:40:47] worst case we can still try to redeploy to the existing cluster
[15:12:07] FYI network tests in codfw1dev show some IPv6-related failures that are expected, because I'm adding them new
[15:12:17] the tests I'm referring to are the ones you get when runnin
[15:12:20] running*
[15:12:21] sudo cookbook wmcs.openstack.network.tests --cluster-name codfw1dev
[15:13:05] there are 4 tests out of 24 failing, and I will fix them monday
[15:13:14] so we can disable them meanwhile if you prefer
[15:14:04] (disable is done via puppet)
[15:15:33] well, I just fixed 2 tests with https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/commit/63feeec596d486a235192aa415918580e97e5695 so only 2 broken left