[13:57:39] hi all -- some childcare snafus this morning (nanny has covid) -- i'll miss the foundations meeting but I'll be around for the incident review ritual in 2 hours
[13:58:11] ack, take care
[14:12:18] Hi folks - we talked a bit about maybe using queue/rotational to determine "SSD" or "not-SSD" to try and make install / puppet of swift nodes more reliable (particularly, no longer dependent on drive node ordering); j.bond pointed out that some swift nodes appear to the OS to have no SSDs. I went looking at this, and stuck what I found in T309027; it appears a workaround would be to switch the SSDs from single-member RAID-0 to non-R
[14:12:18] (and that this might be lossless - we do have nodes we could test on), and that might be easier than trying to get the RAID-0 devices correctly understood by the kernel?
[14:12:19] T309027: Poweredge R730xd, R740xd, R740xd2 SSDs not visible to OS as SSDs - https://phabricator.wikimedia.org/T309027
[14:12:39] [that might have to be done by hand on all the nodes, though, which would be rather tedious]
[14:14:10] Emperor: we are all in a team meeting, expect delays in the reply ;)
[14:14:46] NP!
[14:19:24] potentially archaeological information: it was not possible to set up drives as JBOD with H7xx controllers -- single RAID-0 was the only mode supported
[14:19:53] there were folks out there that were doing this by cross-flashing different (non-Dell) firmware on them
[14:20:06] which we opted to not do for (what I hope are) obvious reasons :P
[14:22:33] :) thanks paravoid; Emperor that sounds like a good way forward to me. in relation to a test host can we use one of the ms-be servers that are not yet in service?
[14:23:13] as to whether we can automate this, i suspect we can. once we have the manual step documented i suspect we would be able to knock up a cookbook fairly easily
[14:23:46] re: netbox my note was wrong so no action there :-)
[14:24:23] ack thx
[14:26:25] ack lmata
[14:30:58] jbond: sure, do you want to take ms-be2069 to poke?
[14:31:39] cross-flash> What Could Possibly Go Wrong?
[14:33:15] (equally, I can have a go myself ~and hand you the smouldering remains~ if you'd rather :) )
[14:37:34] Emperor: if you are able to prove the theory and document the manual steps i can help with the cookbook and puppet code
[14:39:34] 'k
[15:52:25] netbox, Infrastructure-Foundations: netbox: drop profile::netbox::active_server parameter - https://phabricator.wikimedia.org/T309034 (Volans) Yes I agree. > * its safe to monitor both servers, probably yes? Actually not really, the monitoring includes the reports monitoring that currently alerts in th...
[15:52:48] netbox, Infrastructure-Foundations: netbox: drop profile::netbox::active_server parameter - https://phabricator.wikimedia.org/T309034 (Volans) p:Triage→Medium
[15:53:34] ms-be2069 now booting off non-RAID SSDs
[15:54:43] yay, fingers crossed
[15:55:36] Emperor: if you're curious here is some other "fun" we had with non-raid/JBOD setup with HP servers in the past https://phabricator.wikimedia.org/T178177#3818600
[15:57:32] booted up OK, /sys/block/sd{a,b}/queue/rotational now 0 as you'd like
[15:57:32] netops, Infrastructure-Foundations, SRE: Upgrade Fastnetmon to 1.2.1 - https://phabricator.wikimedia.org/T271228 (MoritzMuehlenhoff) @ayounsi I've built a backport of fastnetmon 1.2.1 for bullseye-wikimedia. It's not yet uploaded to apt.wikimedia.org, let's sync up for some smoke testing when you're...
[16:01:23] volans: interesting that the conclusion there was "just use HBA mode, it's quicker than all the RAID-0 devices"
[16:01:34] [swift ms-be* are using all RAID-0]
[18:42:30] SRE-tools, Discovery, Discovery-Search, Infrastructure-Foundations, IPv6: Some elastic hosts do not have IPv6 DNS records - https://phabricator.wikimedia.org/T271143 (bking) AAAA records successfully added for elastic202[5-9]: ` for n in $(cat codfw.hosts); do quad=$(dig aaaa +short ${n});pri...
[18:48:38] SRE-tools, Discovery, Infrastructure-Foundations, Discovery-Search (Current work), IPv6: Some elastic hosts do not have IPv6 DNS records - https://phabricator.wikimedia.org/T271143 (Gehel)
[21:14:03] Mail, Infrastructure-Foundations, LibUp: LibUp Gerrit mail ratelimited to mail.tools.wmflabs.org - https://phabricator.wikimedia.org/T306295 (Legoktm) Open→Resolved a:Legoktm {F35170566} I logged into the LibUp account and disabled all email notifications (I hope).
[23:41:03] Mail, Infrastructure-Foundations, LibUp: LibUp Gerrit mail ratelimited to mail.tools.wmflabs.org - https://phabricator.wikimedia.org/T306295 (Dzahn) Thank you legoktm!
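
Note on the queue/rotational check discussed above: a minimal sketch of how "SSD" vs "not-SSD" could be read from sysfs, assuming the standard Linux layout /sys/block/<dev>/queue/rotational (0 = non-rotational, 1 = rotational). The function names and the "unknown" label are hypothetical and are not taken from the puppet code or any cookbook. As the log itself notes, SSDs presented as single-member RAID-0 virtual disks may still report 1, so this only helps once the drives are exposed as non-RAID.

```python
from pathlib import Path


def classify_block_devices(sysfs_root="/sys/block"):
    """Map each block device name to "ssd", "rotational", or "unknown".

    Reads <sysfs_root>/<dev>/queue/rotational, which the kernel sets to
    "0" for non-rotational devices and "1" for rotational ones.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        flag = dev / "queue" / "rotational"
        try:
            value = flag.read_text().strip()
        except OSError:
            # No queue/rotational attribute (or it is unreadable).
            result[dev.name] = "unknown"
            continue
        if value == "0":
            result[dev.name] = "ssd"
        elif value == "1":
            result[dev.name] = "rotational"
        else:
            result[dev.name] = "unknown"
    return result


def ssd_devices(sysfs_root="/sys/block"):
    """Return the names of devices the kernel reports as non-rotational."""
    devices = classify_block_devices(sysfs_root)
    return [name for name, kind in devices.items() if kind == "ssd"]


if __name__ == "__main__":
    for name, kind in classify_block_devices().items():
        print(f"{name}: {kind}")
```

Selecting devices by this flag rather than by name (sda, sdb, ...) is what would remove the dependence on device ordering; note that loop and RAM devices can also report 0, so a real implementation would likely need to filter those out.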