[00:31:33] netops, DC-Ops, Infrastructure-Foundations, SRE, observability: icinga config error for new rows E/R - https://phabricator.wikimedia.org/T302940 (RobH)
[00:34:36] netops, DC-Ops, Infrastructure-Foundations, SRE, observability: icinga config error for new rows E/R - https://phabricator.wikimedia.org/T302940 (RobH) Failed to run Homer on lsw1-f1-eqiad.mgmt.eqiad.wmnet: Command '['/usr/local/bin/homer', 'lsw1-f1-eqiad.mgmt.eqiad.wmnet', 'commit', 'Ho...
[00:39:22] netops, DC-Ops, Infrastructure-Foundations, SRE, observability: icinga config error for new rows E/R - https://phabricator.wikimedia.org/T302940 (Dzahn) When this host was installed and added to Icinga config by puppet, it broke Icinga config. The error was: ` Error: 'lsw1-f1-eqiad.mgmt.eqi...
[09:36:00] netops, Infrastructure-Foundations, SRE: all network devices must run OpenSSH >= 7.2p1 but != 7.4p1 - https://phabricator.wikimedia.org/T254013 (ayounsi) Slightly related, as of today those devices don't support ssh-ed25519: (11) asw2-b-eqiad.mgmt.eqiad.wmnet,asw2-c-eqiad.mgmt.eqiad.wmnet,asw2-d-eqi...
[10:41:01] netops, DC-Ops, Infrastructure-Foundations, SRE, observability: icinga config error for new rows E/R - https://phabricator.wikimedia.org/T302940 (ayounsi) https://gerrit.wikimedia.org/r/c/operations/puppet/+/764791 should fix the issue. About hostname vs. FQDN is because the devices use LLDP...
[10:59:21] netops, DC-Ops, Infrastructure-Foundations, SRE, observability: icinga config error for new rows E/R - https://phabricator.wikimedia.org/T302940 (cmooney) @robh apologies for this, I was working on an improved version of the CR Arzhel lists above yesterday. But it should have occurred to me...
[13:47:02] SRE-tools, Infrastructure-Foundations, serviceops, Patch-For-Review: Add a kubernetes module to spicerack - https://phabricator.wikimedia.org/T300879 (Joe) a:Joe
[13:47:14] SRE-tools, Infrastructure-Foundations, serviceops, Patch-For-Review: Add a kubernetes module to spicerack - https://phabricator.wikimedia.org/T300879 (Joe) p:Triage→Medium
[15:04:27] I feel like I'm missing something obvious in why test coverage for alertmanager.py isn't 100% in https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/765480 . spicerack/alertmanager.py 66 1 20 2 97% 59, 93->95
[15:04:44] checking
[15:05:13] thanks volans
[15:05:28] so line 59 is triggered if you instantiate AlertmanagerHosts with a NodeSet instance for the target_hosts param
[15:06:01] indeed, and I'm doing that in test_nodeset_hosts
[15:06:14] 93->95 means that in the tests line 93 is never False, and so it never goes to raise directly without passing through line 94
[15:06:58] yeah and I thought I'd be testing that in test_downtimed_remove_on_error
[15:07:06] ack, echking the tests now
[15:07:10] *checking
[15:07:11] :)
[15:08:48] godog: so for the first one is the verbatim_hosts, because is false by default, your nodeset gets converted into a list
[15:08:49] I'm sure it is sth silly I got tunnel vision for and can't see
[15:09:07] and so the next check is not anymore a nodeset
[15:09:17] ah yeah of course
[15:09:40] I guess we could move the verbatim_hosts functionality after the conversion
[15:09:44] and work with nodeset
[15:09:57] or just pass it true in the test
[15:10:17] yeah I'll pass true in the test for now
[15:12:02] as for the other one, in test_downtimed_remove_on_error you call downtimed(...remove_on_error=True) and it does test the 93->94->95 path, I don't see it tested for the False case
[15:12:22] you can parametrize that test passing False/True and then adjust the last assert to be 1 or 2
[15:12:35] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: (Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (RobH) p:Medium→High @nskaggs, These hosts were ordered without a fully filed racking task (I meant to do it before order), so...
[15:12:48] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: (Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (RobH)
[15:15:13] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (RobH)
[15:15:24] ok! I thought I was missing sth obvious indeed, thanks volans
[15:15:37] no prob :)
[15:24:31] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (RobH)
[15:35:59] netbox, Infrastructure-Foundations, SRE, SRE-Access-Requests: Grant cn=nda some sort of read only access to Netbox - https://phabricator.wikimedia.org/T302870 (Ladsgroup) FWIW, I asked for this when I was not SRE. One complicating factor is that netbox contains serial number of hardware we have a...
[15:44:09] netbox, Infrastructure-Foundations, SRE, SRE-Access-Requests: Grant cn=nda some sort of read only access to Netbox - https://phabricator.wikimedia.org/T302870 (Dzahn) Are hardware serial numbers more abusable / serious than other things we give NDAed people, like logstash, piwik and the other thi...
[19:44:24] netops, DC-Ops, Infrastructure-Foundations, SRE, observability: icinga config error for new rows E/R - https://phabricator.wikimedia.org/T302940 (cmooney) Open→Resolved dumpsdata1007 looks good in Icinga now after being re-added, following the above patches being merged. Apologies fo...
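
Editor's note on the 15:06-15:12 coverage exchange: the "93->95" entry in the coverage report is a missed branch. The tests never took the path where the check on line 93 is false and execution jumps straight to the raise on line 95. Below is a minimal sketch of the suggested fix, using only the standard library. It is not the spicerack code: FakeAlertmanager, its methods, and the meaning of the "1 or 2" count (assumed here to be the number of calls made) are hypothetical stand-ins for illustration. With pytest, the CASES table would normally be passed to `@pytest.mark.parametrize` instead of being looped over by hand.

```python
from contextlib import contextmanager


class FakeAlertmanager:
    """Hypothetical stand-in that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def downtime(self):
        self.calls.append("downtime")

    def remove_downtime(self):
        self.calls.append("remove_downtime")

    @contextmanager
    def downtimed(self, *, remove_on_error=False):
        self.downtime()
        try:
            yield
        except BaseException:
            if remove_on_error:         # the branch the tests never saw as False
                self.remove_downtime()  # only reached when the flag is True
            raise                       # reached on both paths
        else:
            self.remove_downtime()


# (remove_on_error, expected number of calls): assumed meaning of "1 or 2"
CASES = [(True, 2), (False, 1)]


def check_downtimed_on_error(remove_on_error, expected_calls):
    """Run one case: fail inside the context and count the calls made."""
    am = FakeAlertmanager()
    try:
        with am.downtimed(remove_on_error=remove_on_error):
            raise ValueError("simulated failure")
    except ValueError:
        pass
    else:
        raise AssertionError("the error should propagate")
    assert len(am.calls) == expected_calls
    return am.calls


results = {flag: check_downtimed_on_error(flag, expected) for flag, expected in CASES}
```

Running both cases exercises the true path (check, removal, raise) and the false path (check, raise), which is what closes the missed branch in the report.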