[07:02:29] hello folks!
[07:03:16] I need to restart the JVM of puppetdb, and IIUC the least impactful way is to disable puppet fleet-wide, wait a bit, and then restart both
[07:03:51] would it be ok if I do it now, or is there anything ongoing/pending/with-more-priority?
[07:11:41] <_joe_> elukey: not objecting, but just curious
[07:11:52] <_joe_> what happens if you don't stop puppet everywhere?
[07:12:12] thunderstorms of failed puppet runs
[07:12:25] and puppetdb takes like two minutes to restart, so they add up
[07:12:35] <_joe_> ok so basically the same result, it's just that we reduce the noise, ack
[07:12:35] ahahaha TIL about the two minute
[07:12:42] *minutes
[07:12:56] <_joe_> maryum: just two minutes? pretty swift for a jvm application!
[07:13:03] <_joe_> err sorry maryum I meant moritzm
[07:13:05] <_joe_> :)
[07:13:18] not sure if it's actually still two minutes :-)
[07:13:28] that data is from the time
[07:13:33] when puppetdb ran on 16G VMs
[07:14:06] they've been on 64G baremetal for some time now and probably need 4x the time, not sure :-)
[07:19:06] puppetdb's JVM runs with only 6GB, I didn't expect 2 minutes of bootstrap
[07:28:01] puppet disabled fleetwide
[07:35:37] proceeding with 2003
[07:39:14] done with 2003 and 1003
[07:39:48] was pretty quick, around 10-20 seconds
[07:42:00] trying to run puppet on esams nodes first
[07:55:29] re-enabling puppet fleetwide
[08:08:22] ah, good to know that it's quicker on baremetal
[08:22:34] we may want to test a restart (next time) without a fleetwide disable, even though the disable is useful if anything happens to the JVM (like a bad update etc.)
[09:17:58] <_joe_> elukey: it's not advisable to send requests to a JVM before you've properly warmed the glow plugs
[09:18:57] <_joe_> (a tractor diesel engine is my mental image of the JVM)
[09:26:20] etherpad.w.o will be briefly gone for a firewall fixup
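
(For reference, the disable / restart / re-enable sequence above amounts to roughly the following. This is a minimal sketch assuming stock Puppet agent commands, a systemd unit named "puppetdb", and cumin for fan-out; the 'A:all'/'A:esams' aliases are assumptions, not necessarily the exact tooling used here.)

    # disable the agent everywhere so runs don't fail while PuppetDB is down (assumed alias)
    sudo cumin 'A:all' 'puppet agent --disable "puppetdb JVM restart"'

    # restart the PuppetDB service on each backend in turn (assumed unit name)
    sudo systemctl restart puppetdb

    # re-enable and test on one site first (esams, as above), then everywhere;
    # note: 'puppet agent --test' exits non-zero when it applies changes
    sudo cumin 'A:esams' 'puppet agent --enable && puppet agent --test'
    sudo cumin 'A:all' 'puppet agent --enable'
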
[11:40:44] btullis: can i merge the DNS add for rgw.eqiad.dpe?
[11:41:03] It's in my sre.dns.netbox run
[11:41:25] Rados Gateway (S3/Swift) interface of the DPE Ceph cluster in codfw apparently
[11:41:32] Oh yes, please do go ahead. Sorry, I didn't realise I should have done it myself.
[11:41:40] no worries
[11:43:46] claime: I was just about to ask Ben about those...
[11:44:10] I think you may have some problems pushing the dns - the v6 reverse entries I think will cause an issue
[11:44:12] E003|MISSING_OR_WRONG_PTR_FOR_NAME_AND_IP: Missing PTR '8.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.1.0.1.0.0.0.f.f.0.8.c.e.2.0.a.2.ip6.arpa.' for name 'rgw.eqiad.dpe.anycast.wmnet.' and IP '2a02:ec80:ff00:101:
[11:44:14] yes
[11:44:23] yeah...
[11:44:36] are you running the dns cookbook itself? or some other cookbook that calls it?
[11:45:04] other cookbook that calls it
[11:45:32] can you just abort the dns part or will that bork the overall work?
[11:45:53] TL;DR to fix I need to create a manual patch for the dns repo and merge, then re-do the authdns-update (bit that is failing for you)
[11:45:54] the dns part is the work, it's the renumbering cookbook :)
[11:46:11] (well it's part of the work anyway)
[11:46:25] I can wait until you've done that and retry
[11:46:30] ok... yeah so you need updated dns entries for the rest to complete, is it prompting you to retry?
[11:46:33] or skip
[11:46:37] yep
[11:46:57] maybe pause for a few mins I'll get my path in, retry should work then
[11:47:02] *patch
[11:48:19] ack
[11:48:33] ping me when it's good :)
[12:05:03] claime: ok I think we are back in a healthy state
[12:05:13] cool ty
[12:05:14] do you want to see if 'retry' handles things cleanly?
[12:05:16] yep
[12:05:34] otherwise it's probably safe to skip as I've had to do the update in the meantime anyway
[12:06:12] looks clean, thanks
[12:06:18] nice
[12:06:34] IPv6 PTRs are truly the gift that keeps giving :)
[13:28:11] moritzm: ok if I merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/1070586? I want to see if I can get it hooked up to horizon.
[13:37:43] andrewbogott: ack, go ahead. worst case we can always revert
[13:37:53] ok :)
[14:37:14] moritzm: it works! But also I have created https://phabricator.wikimedia.org/T374123 for your team.
[14:40:26] ack, I'll have a look later
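
(For reference, the E003 failure at 11:44:12 above means the forward record existed without its matching reverse entry. A sketch of the missing record in standard zone-file syntax, built only from the names quoted verbatim in the error; the actual zone layout of the dns repo is not shown and may differ.)

    ; missing IPv6 reverse entry, names taken verbatim from the E003 message
    8.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.1.0.1.0.0.0.f.f.0.8.c.e.2.0.a.2.ip6.arpa. IN PTR rgw.eqiad.dpe.anycast.wmnet.

    ; one way to verify after authdns-update has pushed the patch:
    ;   dig +short PTR 8.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.1.0.1.0.0.0.f.f.0.8.c.e.2.0.a.2.ip6.arpa.
    ;   -> rgw.eqiad.dpe.anycast.wmnet.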