[08:16:46] Puppet, DBA, Data-Engineering, Infrastructure-Foundations, Patch-For-Review: Split mariadb::dbstore_multiinstance into 2 separate roles (backup sources and analytics) - https://phabricator.wikimedia.org/T296285 (jcrespo) a:jcrespo→BTullis Reassigning to btullis, as he was the person t...
[09:26:58] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-ulsfo: ulsfo: (2) mx80s to become temp cr[34]-drmrs - https://phabricator.wikimedia.org/T295819 (ayounsi) Open→Stalled Thanks I had a quick look and they both are healthy, all 8 interfaces show up as well. I'll wait that we make pr...
[11:01:30] XioNoX, topranks: we have a small issue wrt netbox data and reimage for the ganeti hosts, and I'd like your input
[11:01:38] if you look at https://netbox.wikimedia.org/dcim/devices/2135/interfaces/
[11:01:58] eno1 is the one connected to the switch, but the IPs are assigned to the private/public ifaces
[11:02:02] and hence also the primary IP
[11:02:25] the reimage cookbook does get the switch from the primary_ip.assigned_object.connected_endpoint
[11:02:28] and in this case it is None
[11:02:33] thoughts?
[11:03:34] interesting
[11:03:35] hmmm
[11:03:49] I guess we would have this with all "sub-interfaces"
[11:04:21] AFAIK there is no way, in the Netbox data model, to have a virtual interface and define its "parent" physical
[11:04:37] it is in 2.11
[11:04:40] On the CRs we use ae1.1234, and can derive the parent from the name I guess.
[11:04:41] oooh
[11:04:41] moritzm: did you already reimage other ganeti hosts recently that worked?
[11:04:50] but it's also not really a parent
[11:04:52] it's an IRB
[11:05:12] ah yeah, it's a bridge
[11:05:21] volans: ganeti-test2001/2002/2003 worked fine, but it's been a few weeks
[11:05:44] similar concept I suppose, and this is coming up I think
[11:05:45] https://github.com/netbox-community/netbox/issues/6346
[11:06:10] moritzm: those don't have private/public though
[11:06:39] it also depends on what we need the cookbook to do
[11:07:01] we could cycle through the (non-mgmt) physical interfaces to get its switch
[11:07:04] doesn't really help us right now for this though. I guess if we had it you could follow from the virtual device (bridge) and find the real interface that belonged to it, and from there the switch.
[11:07:17] XioNoX: for the opt82, so I need the switch interface name and the switch hostname
[11:07:22] and vlan name
[11:07:40] I can get all from switch_iface that is the connected endpoint
[11:07:45] so yes I could cycle them I guess
[11:08:23] that's an extra complication, there are two Vlans here...
[11:08:32] also need to figure out how multi-homed devices would behave
[11:08:44] although looks like we are mixing 802.1q (public vlan) and standard Ethernet (private vlan) on the wire?
[11:08:59] ideally we should have a quick workaround to unblock mor.itz and then be free to discuss more long term solutions
[11:08:59] XioNoX: yeah, assuming 1 interface is fine until there are 2.
[11:09:10] volans: yep
[11:09:46] first look I'd say if primary_ip.assigned_object.connected_endpoint returns None, cycle through the connected interfaces
[11:10:22] even create but yeah it's an odd case
[11:10:42] "even create"?
[11:10:47] and get the vlan from the one associated with the prefix the primary IP belongs to.
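The fallback proposed above (use the primary IP's interface when it is cabled, otherwise cycle through the non-mgmt connected interfaces) could look roughly like the sketch below. It is not the patch that was later sent: the dataclasses and `find_switch_iface` are hypothetical stand-ins for Netbox objects, and only the attribute names `connected_endpoint`, `mgmt_only` and `untagged_vlan.name` come from the discussion. Device, port and VLAN names are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vlan:
    name: str


@dataclass
class SwitchInterface:
    """Hypothetical stand-in for the switch side of a cable in Netbox."""
    name: str
    device_name: str
    untagged_vlan: Optional[Vlan] = None


@dataclass
class HostInterface:
    """Hypothetical stand-in for a host interface in Netbox."""
    name: str
    mgmt_only: bool = False
    connected_endpoint: Optional[SwitchInterface] = None


def find_switch_iface(primary_iface: HostInterface,
                      host_ifaces: List[HostInterface]) -> SwitchInterface:
    """Return the switch interface needed for the option 82 DHCP snippet."""
    # Normal case: the interface holding the primary IP is cabled.
    if primary_iface.connected_endpoint is not None:
        return primary_iface.connected_endpoint
    # Fallback: the primary IP sits on a bridge, so look at the
    # non-mgmt interfaces that do have a cable attached.
    candidates = [
        iface.connected_endpoint
        for iface in host_ifaces
        if not iface.mgmt_only and iface.connected_endpoint is not None
    ]
    # Assumes a single uplink; multi-homed hosts need a real decision.
    if len(candidates) != 1:
        raise RuntimeError(
            f"expected exactly 1 connected non-mgmt interface, found {len(candidates)}"
        )
    return candidates[0]


# Ganeti-like layout: eno1 is cabled, the IP lives on the "private" bridge.
switch_port = SwitchInterface("xe-0/0/1", "example-switch", Vlan("example-private-vlan"))
eno1 = HostInterface("eno1", connected_endpoint=switch_port)
private = HostInterface("private")
mgmt = HostInterface("mgmt", mgmt_only=True,
                     connected_endpoint=SwitchInterface("ge-0/0/9", "example-mgmt-switch"))

switch_iface = find_switch_iface(private, [eno1, private, mgmt])
print(switch_iface.device_name, switch_iface.name, switch_iface.untagged_vlan.name)
```

Raising when there is more than one candidate keeps the "1 interface is fine until there are 2" assumption explicit instead of silently picking a link.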
[11:12:24] volans: ignore, dunno what I wanted to say
[11:13:10] rotfl
[11:14:05] https://time.com/3858309/attention-spans-goldfish/
[11:14:35] volans: ah, I meant, even create a DHCP snippet for both links if needed, but not sure that would work
[11:14:38] hahaha
[11:16:47] I'm pretty sure that will be a good enough workaround for our use cases
[11:17:07] until we have router Ganeti hosts with a real interface :)
[11:17:54] topranks: was a way of saying we've already lost you? :D
[11:18:23] ack, I'll code something to overcome this temporarily, may I ask you to open a task or comment on an existing task to not forget this detail please?
[11:18:28] yeah there is no harm I guess. But considering possibility of multi-homed devices I think it may be better to take the primary IP and only create DHCP block for a single Vlan, based on the IP/vlan association.
[11:18:31] for the netbox representation, that is
[11:18:47] sorry what are we talking about again?
[11:19:20] I can create a task unless XioNoX knows of an existing one that mentions it?
[11:19:20] lunch, bbiab
[11:20:11] volans: not sure I understand either
[11:20:27] to switch to parent/child interfaces?
[11:21:20] yeah that was my interpretation of the ask, how to model the relationship between physical ports and bridge devices in Netbox?
[11:23:04] basically to have somewhere in phab a mention that we need to solve the problem of how to represent this data in netbox
[11:23:25] and as a consequence how to modify the puppetdb import script and the reimage cookbook accordingly
[11:23:37] basically the current issue is created because:
[11:23:58] - the provisioning script creates ##PRIMARY## interface with the primary IP connected to the switch
[11:24:07] - reimage runs
[11:25:02] - reimage calls the puppetdb import script that renames ##PRIMARY## to the real name (say eno1), adds private/public interfaces, moves the primary IP to the private interface and so we end up with the eno1 connected to the switch but without IPs
[11:25:10] and the private/public with IPs but disconnected
[11:25:46] thanks volans I'll create a task now
[11:27:26] thanks!
[11:32:34] volans: you need the MAC address of the server's interface for the DHCP bit right?
[11:32:41] where does that come from?
[11:33:05] no, no MAC address involved
[11:33:10] we're using option 82
[11:33:16] ah of course
[11:33:33] I was recently doing a VM, but that's different.
[11:33:36] nice!
[11:33:40] yep, sorry
[11:33:42] https://wikitech.wikimedia.org/wiki/Server_Lifecycle/Reimage#DHCP_Automation
[11:35:18] The puppetdb import script knows that Ganeti hosts need private/public interfaces specifically, is it?
[11:35:44] Like that runs for other devices too but doesn't create those bridge interfaces
[11:36:26] it creates anything that is on the host
[11:36:31] doesn't have any specific logic
[11:36:40] it imports what's in puppetdb, that is what's in the host itself
[11:36:43] after puppet has run
[11:36:49] ok yeah makes sense.
[11:37:01] but that script is working fine, no?
[11:37:04] I need to check in puppetdb I guess, is there meta-data there to tell us what type each interface is?
[11:37:06] puppetdb importer
[11:37:45] XioNoX: for some definition of working fine
[11:38:03] it reflects reality but is the one that makes the eno1 interface connected but without IP
[11:38:17] yeah I don't think it's not working, but when it comes to importing into Netbox need to see if it can tell us "this is a sub-interface", "this is a bridge device", "this bridge device has these members"
[11:38:45] so it might need tweaking too, depending on how we decide to represent the data
[11:38:49] yeah but eno1 connected with no IP is fine
[11:39:13] that's the reality yeah, so 100% how it should be represented.
[11:39:51] I think the upcoming additions to Netbox would allow us to model the setup exactly in netbox, with proper associations between things.
[11:40:07] yep exactly
[11:40:17] leverage that new feature
[11:40:18] the problem is how to extract that info from the host
[11:41:25] for ganeti I don't think there is a need
[11:41:25] topranks: just run sudo facter -p networking
[11:41:28] on a ganeti host
[11:41:35] and see if there is anything you could use
[11:42:24] thanks for that.. big help
[11:42:36] the bad news is the relationship between devices is not represented :(
[11:56:29] topranks: fwiw the reimage cookbook is always getting switch_iface.untagged_vlan.name for the VLAN
[11:56:57] Puppet, DBA, Data-Engineering, Infrastructure-Foundations, Patch-For-Review: Split mariadb::dbstore_multiinstance into 2 separate roles (backup sources and analytics) - https://phabricator.wikimedia.org/T296285 (Kormat) Open→Resolved Cumin alias change is merged, so i'm going to optim...
[11:57:04] that's the private one, I guess for ganeti we pass also the public vlan as tagged on the same wire
[11:58:07] yeah
[11:58:45] Traditionally I've always fought against mixing untagged and tagged traffic on a single link, but actually we need it here and it makes sense.
[11:58:59] for the initial boot etc. it's needed
[11:59:19] so I think that setup is fine, and the logic of using switch_iface.untagged_vlan.name can remain.
[11:59:26] ack, thx
[12:07:12] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): cloud: decide on general idea for having cloud-dedicated hardware provide service in the cloud realm & the internet - https://phabricator.wikimedia.org/T296411 (aborrero) create and shared a spreadsheet trying to capture/compa...
[12:07:16] patch sent: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/742948
[12:27:11] netops, Infrastructure-Foundations: Represent sub-interface and bridge device assocations in Netbox - https://phabricator.wikimedia.org/T296832 (cmooney) p:Triage→Low
[12:31:40] volans: I created the task https://phabricator.wikimedia.org/T296832
[12:31:49] let me know if any errors or comment back of course :)
[13:01:35] netops, Infrastructure-Foundations, SRE: Represent sub-interface and bridge device assocations in Netbox - https://phabricator.wikimedia.org/T296832 (cmooney)
[13:07:44] netops, Infrastructure-Foundations, SRE: Represent sub-interface and bridge device assocations in Netbox - https://phabricator.wikimedia.org/T296832 (cmooney)
[13:20:20] thanks!
[13:21:50] volans: I added a few comments
[13:22:33] XioNoX: you meant "mgmt_only": false?
[13:22:41] volans: yeah :)
[13:23:26] I still had the "name__nie" in mind..
[13:29:30] :D
[13:29:31] topranks: thanks for the detailed task! it would be indeed nice to document those. I also agree with the Low priority, as I'm not sure it would bring much value
[13:45:31] volans: I hope https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/742951 wasn't too painful :)
[13:45:49] what do you mean? :D
[13:51:14] sometimes one missing semi-colon causing bugs can be hard to find
[13:52:54] this one was easy, the automation tests the dhcp config and that failed with a message
[13:53:08] I checked the snipped and found the issue right away
[13:53:12] *snippet
[13:53:36] https://pics.me.me/the-missing-semicolon-you-have-been-looking-for-for-5-63716971.png
[13:54:05] lol
[13:54:53] hahhaa
[13:55:22] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Represent sub-interface and bridge device assocations in Netbox - https://phabricator.wikimedia.org/T296832 (Volans) @cmooney we have the possibility to add custom facts to puppetdb, we already have a bunch of them, or modify existing one...
[13:59:11] volans: reimage cookbook works fine now, thanks!
[13:59:35] great!
[14:32:10] nice :)
[14:38:09] Puppet, Infrastructure-Foundations, Patch-For-Review, User-jbond: Decommission puppetboard[12]001 - https://phabricator.wikimedia.org/T296744 (jbond)
[17:40:47] netops, Infrastructure-Foundations, SRE, cloud-services-team (Kanban): cloud: decide on general idea for having cloud-dedicated hardware provide service in the cloud realm & the internet - https://phabricator.wikimedia.org/T296411 (aborrero) We had a meeting today, rough summary: The idea is rou...
[17:58:10] netops, Infrastructure-Foundations, SRE: Represent sub-interface and bridge device assocations in Netbox - https://phabricator.wikimedia.org/T296832 (cmooney) @volans thanks for the info. Sounds like we have a way forward if we want to do this. And certainly if we expand our use of bridges, sub-int...
[19:33:19] just curious, is gitlab slated to be used across the board, i.e. replace gerrit for our use?
[19:38:42] jhathaway: short answer: yes, long one: https://www.mediawiki.org/wiki/GitLab_consultation
[19:39:05] volans: thanks
[19:41:58] damn it, I'll probably just get comfortable with gerrit before it goes away ;P
[19:43:29] yeah the timing seems right!