[06:50:30] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-drmrs: Q3:(Need By: ASAP) rack/setup/install cr[12]-drmrs - https://phabricator.wikimedia.org/T300277 (wiki_willy) Hi @ayounsi - I'm not sure if you're copied on the Interxion ticket, so just forwarding the info along that they completed th...
[08:13:04] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-drmrs: Q3:(Need By: ASAP) rack/setup/install cr[12]-drmrs - https://phabricator.wikimedia.org/T300277 (ayounsi) I can confirm that (1), (2) and (4) are done. However cr2-drmrs is currently fully down (console is dead as well). My guess is...
[09:16:49] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-drmrs: Q3:(Need By: ASAP) rack/setup/install cr[12]-drmrs - https://phabricator.wikimedia.org/T300277 (ayounsi) I gave a call to Tarek: the power cord on cr2 was faulty, but he was able to find 2 spare ones which he will bill on the ticket....
[09:26:59] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[09:40:48] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[10:28:51] SRE-tools, Infrastructure-Foundations: Long timeout on debmonitor client with server unreachable/unpingable - https://phabricator.wikimedia.org/T302205 (fgiunchedi)
[10:34:13] SRE-tools, Infrastructure-Foundations: Long timeout on debmonitor client with server unreachable/unpingable - https://phabricator.wikimedia.org/T302205 (Volans) That was by design, the parameters used are defined in https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/debmonitor/+/refs/head...
[10:47:26] SRE-tools, Infrastructure-Foundations: Long timeout on debmonitor client with server unreachable/unpingable - https://phabricator.wikimedia.org/T302205 (fgiunchedi) IMHO the client should fail faster since while running it will block dpkg/apt in such cases
[13:49:39] I've run spicerack's tox -e py3-style (for example) and pip is taking a long time with 100%, the logs from .tox/py3-tests/log/ look like this https://phabricator.wikimedia.org/P21155 have you run into this before? I'm on Bullseye
[13:50:34] godog: did you rebase on top of master's head?
[13:51:29] volans: I have no local commits yet to rebase, this is from d941554d56
[13:51:39] i.e. current master
[13:51:55] did you have a pre-existing local .tox/ dir?
[13:52:15] mmhh that's possible, I'll nuke it and try again
[13:53:04] thanks, try that because d941554d56 was done exactly on purpose to fix that, but that was last week and maybe something else has changed in the meantime
[13:55:26] ack, yeah looks like it is doing the same, I'll paste the log
[13:55:32] volans: fyi d941554d56 did not fix the issue for me
[13:56:10] https://phabricator.wikimedia.org/P21157
[13:58:21] which version of pip do you have?
[13:58:54] pip 20.3.4 from /usr/lib/python3/dist-packages/pip (python 3.9)
[13:58:54] the interesting part is that on my local env (on mac) I don't have any 'zipp' in the venv
[13:59:25] * jbond also has 20.3.4
[14:01:36] ack, yeah that is interesting
[14:01:45] I don't think I've run into this before in general
[14:04:46] pip backtracking is an issue in general since they made pip's dependency resolution stricter a year or two ago
[14:05:19] I'll take a look this afternoon to see if I can find a quick fix, it is becoming a bit too unstable for my taste
[14:05:54] and spicerack, including multiple other libraries, will become worse over time
[14:08:40] easy to imagine yeah, thanks for taking a look
[16:00:28] FWIW "arping" does not allow one to generate random ARPs from an IP not bound to a local interface.
[16:01:03] So I don't think it'd be of use to us to verify the public / tagged vlan was reachable from the Ganeti hosts.
[16:02:01] topranks: arping -0
[16:04:02] doesn't seem to be an available flag on the version on ganeti1012
[16:07:54] topranks: the version installed on ganeti1012 is not the version that comes with buster. I'm not sure where it's from
[16:08:04] but the version in buster does support it
[16:08:08] oh ok
[16:08:20] it still may not help though ...
[16:08:46] I'm guessing the version on 1012 is left over from an apt-get full-upgrade and possibly having the source package renamed (cc moritzm )
[16:10:58] yeah I can see that option in buster alright, lots of different options in that version (which is surprising, not something you think is gonna change much in recent years)
[16:11:35] the version of what specifically? arping?
[16:11:57] I think the difference is between package "iputils-arping" (like ganeti1012) and plain "arping"
[16:12:03] moritzm: yes
[16:13:10] all the ganeti hosts were reimaged, so they have the stock buster version, but in fact arping on ganeti1012 is provided by iputils-arping
[16:14:17] there's also the standalone src:arping
[16:15:26] ack that will be it
[16:15:27] iputils-arping gets pulled in via a dependency of ganeti-2.16
[16:16:00] and they conflict, so we can't easily replace it with arping on the ganeti hosts
[16:16:13] ok
[16:16:16] they conflict == arping vs. iputils-arping
[16:16:44] maybe let's stick with the first approach and see how it works out.
[16:17:01] Even if arping is an option, it adds a difficulty of knowing what IP to ARP for (will change from row to row)
[16:19:21] topranks: what if we check it on the switch instead of the host itself?
[16:20:25] certainly an option yeah, you could even check against the NB config as the switch config should match that (although homer may not have been run)
[16:21:21] IIRC we can also just run simple commands via cumin, so via the cookbook
[16:29:34] I forgot to mention earlier I will be out Thu/Fri and Mon due to a PTO thing
[16:29:51] I will be completely off grid this time, in case you need me for any administrative things please plan ahead
[16:30:14] I'll send over an email as well but wanted to get the word out early :-)
[16:30:39] ack, noted
[16:39:28] I'm also going to be out Fri, Mon & Tue and will be mostly unreachable other than the occasional check-in (internet still needs a generator ;))
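
Editor's note on the T302205 thread earlier in this log, where the request is for the debmonitor client to fail faster when the server is unreachable, since it blocks dpkg/apt while running. One way to get that behaviour is a cheap TCP pre-flight check with a short connect timeout before any real work starts. The sketch below is only an illustration under that assumption: `server_reachable`, its parameters, and the 3-second default are hypothetical names and values, not taken from the debmonitor client or its configuration.

```python
import socket


def server_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper, not part of debmonitor. The timeout bounds the
    connect attempt; name resolution can still take longer.
    """
    try:
        # create_connection applies `timeout` to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

A caller would check `server_reachable(host, 443)` first and skip the update right away on `False`, rather than waiting through the longer request timeouts and retries.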