[01:34:15] Packaging, Infrastructure-Foundations, Thumbor, Wikimedia-SVG-rendering: Update librsvg to > 2.44.10 - https://phabricator.wikimedia.org/T265549 (Glrx) To fix T344564 (font fallbacks), we need to upgrade to at least `librsvg 2.48.5`.
[01:45:51] Packaging, Infrastructure-Foundations, Thumbor, Wikimedia-SVG-rendering: Update librsvg to > 2.44.10 - https://phabricator.wikimedia.org/T265549 (Glrx)
[05:04:29] Packaging, Infrastructure-Foundations, Thumbor, Wikimedia-SVG-rendering: Update librsvg to > 2.44.10 - https://phabricator.wikimedia.org/T265549 (Arthur2e5) So uh @tstarling, what about the promise of trying to deploy a build script? I've got all the steps laid out...
[06:43:36] the column titles at https://netbox.wikimedia.org/virtualization/clusters/ are very confusingly named, what Netbox refers to as the "Name" is in fact the Ganeti group and what Netbox refers to as the "Group" is in fact the cluster name
[06:43:53] not sure if it has always been like that or whether this regressed during some Netbox update?
[07:11:34] moritzm: no, unfortunately that has always been the case, that's why I've specified it in the help message of the makevm cookbook, see https://wikitech.wikimedia.org/wiki/Ganeti#Create_the_VM
[07:12:01] specifically the --cluster and --group helps
[07:13:28] ah, ok
[07:14:38] not much we can do I guess, but if you have a suggestion for a better naming let me know
[07:15:04] maybe we can go with the netbox naming and hide the ganeti names from the workflow to avoid confusing others?
[07:18:34] no, I guess that would just mean further confusion down the road, I was mostly wondering whether the Netbox column titles are adjustable, then we could simply rename them as e.g. "Name (Ganeti group)" and "Group (Ganeti cluster name)"
[07:18:59] but if that's not a simple config item, let's just leave it as-is and stick with the existing docs
[07:19:48] nope, that's the hierarchy in netbox, there are cluster groups ( https://netbox.wikimedia.org/virtualization/cluster-groups/ ) that group different clusters ( https://netbox.wikimedia.org/virtualization/clusters/ ) together
[07:21:21] maybe in the future, once we've finished the migration to the rack-based network setup and we have 1 cluster per cluster group, we could just stop using the cluster groups and just have clusters... dunno, just an idea
[07:22:49] ack, we'll see
[07:23:55] We have at least 4 free tickets for https://ripe87.ripe.net/ in Rome (/cc volans, topranks)
[07:26:09] moritzm, topranks: talking about ganeti... can I clear the esams cluster group and related cluster in netbox? we should have migrated everything to the new esams01/02 by now, correct?
[07:26:32] XioNoX: interesting, when is it?
[07:27:12] 27 November – 1 December 2023
[07:27:24] volans: yes
[07:28:53] ok proceeding
[07:30:56] {done}
[07:36:44] thanks
[07:37:03] I've added an item to the cleanup task
[07:37:07] already marked
[08:42:32] netops, Infrastructure-Foundations, SRE: Maintain ROAs for currently unannounced BGP assignments - https://phabricator.wikimedia.org/T345601 (cmooney) p:Triage→Low
[08:43:22] netops, Infrastructure-Foundations, SRE: Maintain ROAs for currently unannounced BGP assignments - https://phabricator.wikimedia.org/T345601 (ayounsi) Sounds good to me!
[08:47:34] XioNoX: thanks for the feedback on that task
[08:47:41] one question strikes me while I'm in there.
[08:47:54] For 185.15.56.0/22 we have 4 separate ROAs, one per /24
[08:48:08] Maybe it's better to have a single ROA with length up to /24 on it?
[08:48:42] Shouldn't make a practical difference, but I guess it's good practice for the overall RPKI system to not have more objects than needed?
[08:49:21] until now it has reduced the risk of breaking multiple prefixes at once, as not all were managed the same way with the same description/etc
[08:49:36] like misconfig, etc
[08:52:20] I'm not sure I understand
[08:52:41] I know we used to use different ASNs, right? Is that what you mean?
[08:53:38] Related - on the v6 side there is none for 2a02:ec80::/29, but separate ones for the esams/drmrs /48s
[08:53:46] I think it makes sense to keep those /48s
[08:53:55] But also probably add a "covering" one for the whole allocation?
[08:59:48] I added them to match the BGP advertisements iirc, and having multiple ones reduces the risk of a mistake taking down multiple prefixes (eg. typo in the ASN, etc)
[09:00:39] I would argue having multiple ones, so say 4 for the /22, creates 4 times more places you can make such a typo, increasing that risk
[09:01:28] haha yeah, both approaches are valid
[09:01:33] We can't really make a typo in the ASN anyway, we're only authorized for our own ASN
[09:02:15] Actually there is a good argument for keeping the /24s
[09:02:25] Similar to my question on the v6 ranges
[09:02:34] Say for the /22 we should create a ROA for it
[09:02:44] And keep our 4x/24s separate
[09:02:59] Now if we stop using a /24 at a site we can remove the ROA for that specific /24
[09:03:15] Which means if someone announces it, *even spoofing our ASN*, it's still invalid
[09:03:34] as the only covering ROA is the /22, and the /24 doesn't match the prefix length?
[09:05:42] yeah, quite a niche risk though
[09:05:45] but I agree
[09:05:56] ease of mgmt should be important too
[09:06:09] even if they don't change often
[09:06:32] true enough, I think with the small total number and the fact they very rarely ever change it's not a big deal
[09:09:56] so the options are 1/ covering only, 2/ covering + specific, 3/ specific only?
[09:10:30] of course only for the ones where we have more than a /24
[09:10:41] (2) seems fine to me
[09:13:14] not having only the covering one also helps keep a clean output on https://irrexplorer.nlnog.net/asn/AS14907 :)
[09:15:02] (SystemdUnitFailed) firing: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:16:07] Yeah 2 is what I think is probably best
[09:16:19] Agree it's an edge-case, but I think no harm to block it off
[09:17:39] In terms of the nlnog display I think it lists all the prefixes seen in the routing table
[09:17:46] And then just has a green tick if it's covered
[09:17:55] So probably the number or combination of ROAs won't affect the display
[09:19:17] if there is a ROA for a prefix not in the routing table it shows it
[09:19:23] in blue so it's ok :)
[09:20:02] (SystemdUnitFailed) resolved: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:21:54] Oh ok. It's not caught up with the one I added for the old esams range then :)
[09:23:14] yeah it takes a bit of time
[09:39:38] netops, Infrastructure-Foundations, SRE: Maintain ROAs for currently unannounced BGP assignments - https://phabricator.wikimedia.org/T345601 (cmooney) Open→Resolved a:cmooney I've added ROAs for our newer RIPE /24 range and the old esams one now to help protect against hi-jack / misuse....
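
Editor's sketch of the ROA trade-off discussed above, not part of the channel log. It is a minimal Python model of route origin validation in the style of RFC 6811, comparing option 1 (a single covering ROA with max length /24) against option 2 (a covering /22 ROA plus per-/24 ROAs). The `ROA` tuple, the `validate` function, the ROA max-length values and the choice of 185.15.59.0/24 as the "retired" prefix are all assumptions made for illustration; only the 185.15.56.0/22 range and AS14907 come from the conversation.

```python
from ipaddress import ip_network
from typing import NamedTuple


class ROA(NamedTuple):
    """Hypothetical, simplified ROA record (one prefix per ROA)."""
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix: str, origin_asn: int, roas: list[ROA]) -> str:
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    route = ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Only ROAs of the same address family that contain the route count.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= roa.max_length and origin_asn == roa.asn:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"


OUR_ASN = 14907

# Option 1: one covering ROA that also authorizes every /24 inside it.
option1 = [ROA("185.15.56.0/22", 24, OUR_ASN)]

# Option 2: covering ROA limited to /22, plus one ROA per /24 still in use.
# Assumption: 185.15.59.0/24 has been retired and its ROA removed.
option2 = [
    ROA("185.15.56.0/22", 22, OUR_ASN),
    ROA("185.15.56.0/24", 24, OUR_ASN),
    ROA("185.15.57.0/24", 24, OUR_ASN),
    ROA("185.15.58.0/24", 24, OUR_ASN),
]

# A third party announces the retired /24 while spoofing our ASN.
spoofed_opt1 = validate("185.15.59.0/24", OUR_ASN, option1)
spoofed_opt2 = validate("185.15.59.0/24", OUR_ASN, option2)
print("option 1:", spoofed_opt1)
print("option 2:", spoofed_opt2)
```

Under these assumptions the spoofed announcement passes validation with option 1, because the covering ROA authorizes any /24 in the range, and fails with option 2, because the only ROA covering the retired /24 stops at /22. That is the edge case raised at 09:03:15.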
[11:49:47] moritzm, jbond: FYI https://phabricator.wikimedia.org/T268369#9142016 feel free to comment if you have different ideas ;)
[11:50:52] that sounds good
[11:51:08] the durum servers are in fact awaiting a reinstall after the knams setup
[11:59:08] volans: lgtm thanks
[15:20:57] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Papaul) netbox cable id update for ssw1-a8 to lsw1-a1 and lsw-a8
[17:48:58] (SystemdUnitFailed) firing: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:53:58] (SystemdUnitFailed) resolved: check_netbox_uncommitted_dns_changes.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed