[06:37:29] netbox, Infrastructure-Foundations: Move AS allocations to Netbox - https://phabricator.wikimedia.org/T310744 (ayounsi)
[10:12:26] netbox, Infrastructure-Foundations: Upgrade pynetbox - https://phabricator.wikimedia.org/T310745 (jbond) SGTM
[10:13:13] XioNoX, jbond: luca's last reimage had an issue with the netbox script: https://netbox.wikimedia.org/api/extras/job-results/3342612/
[10:13:25] "message": "An exception occurred: `DoesNotExist: CablePath matching query does not exist.`\n```\nTraceback ...
[10:13:37] could you please have a look?
[10:13:43] of course
[10:15:06] also it has a weird thing
[10:15:06] "message": "10.64.134.8/32: created, vip, no interface and active"
[10:15:13] why is it trying to create that address?
[10:15:24] it's already there, as a /24 as it should be
[10:15:24] https://netbox.wikimedia.org/ipam/ip-addresses/10991/
[10:26:39] looking at the cable error from the re-image script first
[10:28:45] the script works in dry-run
[10:29:25] even though it wants to delete then re-create the cable while nothing changed
[10:30:03] and I can reproduce the error in "commit mode"
[10:31:15] great
[10:32:10] FYI, I'm rebooting the netflow* VMs now
[10:34:05] moritzm: ok!
[10:34:52] on ml-cache, the interface name changed, so it's deleting the old cable and re-creating it on the good interface
[10:56:50] XioNoX: if I may suggest, if we can easily clean up netbox-next we could test it there
[10:57:01] (along with the ganeti group one :-P )
[10:57:23] volans: I can reproduce it there, trying to figure out what's going on, looks like a netbox cache bug
[10:57:35] oh nooo, not those again
[11:03:44] sorry missed this ping let me know if you need help
[11:05:37] trying a few more things then will escalate to level 2 python
[11:06:36] lol
[11:06:44] who's level 2?
[11:06:56] hehe
[11:12:56] yeah there is something weird going on
[11:14:24] cable.save() fails I think because it can't generate the end to end path from one interface to the other. If before the save I do self.log_warning(f"{cable.termination_a} - {cable.termination_b}") it shows the proper interfaces on both sides, and if I do self.log_warning(f"{nbiface.connected_endpoint}") it returns None, like they're not connected
[11:14:52] XioNoX: I suggest you move to an nbshell
[11:15:02] and try the lines one by one on netbox-next
[11:16:03] the issue is that it does a lot of actions before that one, so I can't do it line by line
[11:19:14] some relevant upstream issues, but in theory fixed in 3 https://github.com/netbox-community/netbox/issues/6945
[11:19:50] the script does exactly the same steps
[11:21:00] maybe they fixed it for changes done through the UI, but the script is too fast?
[11:22:04] * volans lunch
[12:56:58] volans, jbond, I'll need a 2nd pair of eyes for that one
[12:57:08] XioNoX: ack give me a sec
[13:04:55] I also tried to add a 10s wait before the save with no luck
[13:08:40] ok I'm here
[13:11:31] XioNoX: so how can I help you?
[13:12:27] volans: with some python expertise :)
[13:13:24] volans: the current symptom is that, with a script, deleting a cable, then re-creating one that terminates on one of the old interfaces, returns that error
[13:13:49] sorry, re-can't right now (see private)
[13:14:44] yep, no pb!
[14:33:46] netbox, Infrastructure-Foundations: Agree how to document intra-DC patch panels in Netbox - https://phabricator.wikimedia.org/T293221 (ayounsi) Open→Resolved a:ayounsi I went ahead and created circuits instead of the existing cables, moved the patch cables over (and fixed some miss-cabling) a...
[14:35:04] netbox, Infrastructure-Foundations: Agree how to document intra-DC patch panels in Netbox - https://phabricator.wikimedia.org/T293221 (ayounsi)
[14:35:15] netbox, Infrastructure-Foundations: Agree how to document intra-DC patch panels in Netbox - https://phabricator.wikimedia.org/T293221 (ayounsi)
[16:47:36] SRE-tools, Infrastructure-Foundations: Q3 2018/19 Goal: TEC6: Build automated workflows for server provisioning (Tracking Task) - https://phabricator.wikimedia.org/T213114 (ayounsi)
[16:47:42] netbox, Infrastructure-Foundations, SRE: Netbox: fill network topology - https://phabricator.wikimedia.org/T205897 (ayounsi) Open→Resolved a:ayounsi Finally time to close this task. We've added more things to Netbox since, but no need for a tracking task anymore. Tracking core sites pow...
[19:11:46] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Dzahn)
[19:23:24] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Dzahn) After this I ran only the DNS cookbook directly and this time it finished without such an error. I am not sure if it tried though because it said...
[19:38:39] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Dzahn) The run of the decom book was at: `2022-06-16 18:54:09,812 dzahn 2165070 [INFO] START - Cookbook sre.hosts.decommission for hosts gitlab-runner10...
[19:53:49] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Arnoldokoth) fatal: unable to access 'https://netbox1002.eqiad.wmnet/dns.git/': The requested URL returned error: 403 0.0% (0/1) success ratio (< 100.0%...
[20:43:33] Mail, Infrastructure-Foundations, SRE, Wikimedia-Incident: 2022-05-09 Exim BDAT Errors incident - https://phabricator.wikimedia.org/T309238 (jhathaway)
[20:44:18] Mail, Infrastructure-Foundations, SRE, Patch-For-Review: [mitigated] Google returning 503 error when delivering to mx1001 and mx2001 - https://phabricator.wikimedia.org/T307873 (jhathaway) Open→Resolved This has now been fixed upstream, https://git.exim.org/exim.git/commit/462e2cd30. We w...
[20:45:28] Mail, Infrastructure-Foundations, SRE: Upgrade Exim to 4.96 - https://phabricator.wikimedia.org/T310836 (jhathaway)
[20:46:04] Mail, Infrastructure-Foundations, SRE, Wikimedia-Incident: 2022-05-09 Exim BDAT Errors incident - https://phabricator.wikimedia.org/T309238 (jhathaway)
[20:46:14] Mail, Infrastructure-Foundations, SRE: Upgrade Exim to 4.96 - https://phabricator.wikimedia.org/T310836 (jhathaway) Open→Stalled This is stalled until 4.96 is available in Debian.
[22:09:41] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Volans) p:Triage→High It seems the vhost has changed: ` root@netbox2002:~# runuser -u netbox -- git -C "/srv/netbox-exports/dns.git" fetch -v ne...
[22:31:30] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Volans) Run puppet on both netbox hosts (1002/2002)
[22:37:55] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Volans) Run dns cookbook to force sync the data everywhere (the last couple of commits were not deployed to the authdns hosts). The procedure is describ...
[22:39:02] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Volans) Open→Resolved a:Volans This should have fixed the issue. I'm resolving it, but feel free to re-open in case it's not fully solved.
[22:50:56] netbox, Infrastructure-Foundations, SRE: DNS cookbook failed syncing with netbox - 403 from netbox1002 - https://phabricator.wikimedia.org/T310831 (Dzahn) Thank you for the very quick response!