[13:41:53] <wikibugs>	 netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade core routers to Junos 21+ - https://phabricator.wikimedia.org/T295690 (cmooney) cr2-eqdfw upgrade completed successfully today.
[13:42:24] <wikibugs>	 netops, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade core routers to Junos 21+ - https://phabricator.wikimedia.org/T295690 (cmooney)
[15:50:26] <volans>	 topranks: would it be a good time in a few minutes to deploy the homer upgrade with the fix?
[15:50:41] <cwhite>	 I suspect after running sre.dns.netbox the dns servers are out of sync.  Sometimes the AAAA addresses are reported and sometimes not.  I'm unsure how to proceed.
[15:51:43] <topranks>	 volans: yep fire away
[15:51:59] <volans>	 cwhite: did you query the AAAA records before adding them via the cookbook?
[15:52:25] <volans>	 the cookbook updates the authdns servers, but the recursor might have cached the negative result
[15:52:36] <volans>	 *you or any software, that is
[15:53:07] <topranks>	 indeed volans.
[15:53:17] <cwhite>	 I did not query the IPs, but I did query the fqdns prior to adding them to netbox.
[15:53:18] <topranks>	 cwhite: what hostnames are you seeing the issue with?
[15:53:40] <cwhite>	 example host: logstash2001.codfw.wmnet
[15:53:41] <volans>	 one thing you can easily do is to wipe the recursors' cache for those hostnames
[15:54:16] <volans>	 see the sre.dns.wipe-cache cookbook's help
[15:55:34] * cwhite gives that a try
[15:55:45] <topranks>	 All the auth servers return records for that
[15:55:48] <topranks>	 https://www.irccloud.com/pastebin/J0rhwewJ/
[15:56:13] <cwhite>	 It must have been cached.  I'm getting consistent responses now.  Thanks!
[15:56:16] <topranks>	 So like a cached negative entry as volans suggested
[15:56:21] <topranks>	 *likely
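
A minimal sketch of the negative-caching behaviour volans describes above, assuming a toy in-memory recursor. `ToyRecursor`, its `wipe` method, the TTL value, and the `2001:db8::1` address are hypothetical and only illustrate the idea; this is not the code of the real recursors or of the sre.dns.wipe-cache cookbook. Positive caching is left out for brevity.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (hostname, record type)


class ToyRecursor:
    """Hypothetical recursor that remembers "no such record" answers."""

    def __init__(
        self,
        authoritative: Dict[Key, str],
        negative_ttl: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.authoritative = authoritative  # stands in for the authdns servers
        self.negative_ttl = negative_ttl    # assumed value, not a real setting
        self.clock = clock
        self._negative: Dict[Key, float] = {}  # key -> expiry time

    def query(self, name: str, rtype: str) -> Optional[str]:
        key = (name, rtype)
        expiry = self._negative.get(key)
        if expiry is not None:
            if self.clock() < expiry:
                return None  # cached negative answer, authdns is not asked
            del self._negative[key]  # negative entry expired
        answer = self.authoritative.get(key)
        if answer is None:
            # Remember that the record did not exist at query time.
            self._negative[key] = self.clock() + self.negative_ttl
        return answer

    def wipe(self, name: str) -> None:
        """Drop every cached negative entry for one hostname."""
        for key in [k for k in self._negative if k[0] == name]:
            del self._negative[key]


authdns: Dict[Key, str] = {}
recursor = ToyRecursor(authdns)
host = "logstash2001.codfw.wmnet"

recursor.query(host, "AAAA")             # queried before the record exists
authdns[(host, "AAAA")] = "2001:db8::1"  # record added afterwards
stale = recursor.query(host, "AAAA")     # still None: negative entry is cached
recursor.wipe(host)
fresh = recursor.query(host, "AAAA")     # now answered from authdns
```

With several recursors, only those that were queried before the record existed would hold the negative entry, which would be consistent with the AAAA addresses being reported only some of the time.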
[16:06:20] <volans>	 topranks: Connecting to device mr1-eqsin.wikimedia.org (user=homer ssh_config=None timeout=120)
[16:06:42] <volans>	 yay
[16:06:42] <topranks>	 That looks good :)
[16:06:57] <topranks>	 There is a diff to be applied right?
[16:07:04] <volans>	 still running
[16:07:06] <topranks>	 So we can test the commit works?
[16:07:14] <volans>	 yes
[16:07:27] <volans>	 netmon groups are changed
[16:09:16] <volans>	 topranks: can I leave it to you to do the commit so you can validate the diff?
[16:09:24] <volans>	 I'm not familiar with that diff
[16:09:36] <topranks>	 sure, I think it's safe (based on diff yesterday), but let me run it to be sure
[16:10:34] <volans>	 thanks
[16:12:30] <topranks>	 worked on mr1-codfw.wikimedia.org :)
[16:13:17] <volans>	 yay
[16:14:19] <topranks>	 same with drmrs - I think we're good :)
[16:14:58] <volans>	 great, I'll do a full diff against *, but if you want to commit all the mr* first feel free
[16:15:15] <volans>	 or I can do it if you tell me the diffs are fine :D
[16:20:15] <topranks>	 It's still running through the last of the mr* there
[16:20:31] <topranks>	 When I'm done you can run against "*" should be ok
[16:24:45] <volans>	 perfect, thanks a lot
[16:25:52] <topranks>	 cool all done now
[16:37:02] <volans>	 great
[16:37:07] <volans>	 running diff
[16:41:16] <wikibugs>	 SRE-tools, Infrastructure-Foundations, Release-Engineering-Team: Investigate sharing releng common python code to pywmflib - https://phabricator.wikimedia.org/T316757 (thcipriani) Open→Declined For the time being, I don't think we have anything appropriate to upstream.
[17:23:12] <volans>	 yay, Changes for 53 devices: No diff
[17:23:53] <wikibugs>	 SRE-tools, Infrastructure-Foundations, Release-Engineering-Team: Investigate sharing releng common python code to pywmflib - https://phabricator.wikimedia.org/T316757 (Volans) Ok, no problem. Feel free to re-open if that changes.
[20:56:11] <wikibugs>	 SRE-tools, Infrastructure-Foundations, Release-Engineering-Team: Investigate sharing releng common python code to pywmflib - https://phabricator.wikimedia.org/T316757 (thcipriani) >>! In T316757#8236998, @Volans wrote: > Ok, no problem. Feel free to re-open if that changes.  Thank you for answering a...
[21:34:35] <wikibugs>	 Puppet, Infrastructure-Foundations, SRE: Facter is slow on a few hosts - https://phabricator.wikimedia.org/T251293 (colewhite) raid_mgmt_tools does not detect raid on `clouddb1021` ` cwhite@clouddb1021:~$ sudo /usr/bin/ruby /var/lib/puppet/lib/facter/raid.rb | jq . {   "raid": [     "megaraid"   ] }...