[11:36:13] Hi arturo, noticed we are both doing some decommissions, have you merged the dns deletions of analytics1063, cloudswift1001, cloudswift1002 or should I proceed to do it?
[11:36:30] stevemunene: I think I have already
[11:37:28] I saw this: !log stevemunene@cumin1001 START - Cookbook sre.hosts.decommission for hosts analytics1061.eqiad.wmnet
[11:38:27] sorry, this: !log stevemunene@cumin1001 START - Cookbook sre.hosts.decommission for hosts analytics1063.eqiad.wmnet
[11:38:37] and the diff matched just that
[11:39:34] https://www.irccloud.com/pastebin/MyetO1T6/
[11:40:20] aborting and restarting since it is already merged arturo
[11:41:06] ack
[11:41:56] I don't think you strictly need to restart the decommission cookbook, but should be harmless anyway
[12:00:10] np restarted it and so far all good on this end arturo :)
[12:00:23] 👍
[12:50:25] Still working on some decommissions, is this from Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data okay to merge? https://usercontent.irccloud-cdn.com/file/dOoCf2cZ/image.png
[12:56:38] topranks maybe? ^^^
[12:57:19] volans: yes correct, should have anticipated that
[12:57:30] stevemunene: you can merge that, it looks correct
[12:57:36] apologies for the confusion
[12:57:49] great, thanks topranks volans
[15:19:31] volans, sukhe o/
[15:19:47] elukey: hellooo
[15:19:48] I'd like to test https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/935772, that is a new kafka setting for changeprop
[15:19:49] elukey: what's up?
[15:20:09] go for it as long as you fix it :P
[15:20:10] (kidding)
[15:20:21] it should just change the batching size (so wait 20 ms max before sending msgs to increase throughput)
[15:20:26] all yours I'm offcall in 10m :D
[15:20:39] haha
[15:22:01] ahahah
[15:22:03] okok
[15:55:41] rolled out, I'll avoid deploying the change to the jobqueue as well to limit the blast radius if anything goes wrong
[15:55:46] elukey: <3
[15:56:06] I'll check metrics later on, if anything looks weird it is sufficient to revert my last two changes in deployment-charts
[17:09:01] hi folks, if you see any DNS issues, please ping me immediately
[17:09:31] the reason is that we just reimaged a DNS host (dns1004) and took out dns1001, but this is the first time with automatic config generation (NTP and resolv.conf)
[17:09:44] while everything looks and should be fine, if you see something, please let me know thanks
[17:17:08] stops using "ssh ns0.wikimedia.org" as I should have, heh
[17:18:02] haha, you might get a key error but it now points to dns100[2-4]
[17:18:04] key change, that is kind of historic if dns1001 is down
[17:18:10] yeah, truly
[17:18:21] congrats
[19:21:21] ---
[19:21:29] no pending DNS work, so all is fine :)
[19:22:08] I am on-call so if I jinx it, that's on me
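
Note on the 15:20:21 message: the batching change described there (wait up to 20 ms before sending so that more messages share one request) can be pictured with a small simulation. This is only a sketch under stated assumptions, not Kafka's or changeprop's actual code. The log does not name the setting, and the function `batch_by_linger`, its parameters, and the sample arrival times are all hypothetical.

```python
from typing import List


def batch_by_linger(
    arrivals_ms: List[float], linger_ms: float, max_batch: int
) -> List[List[float]]:
    """Group message arrival times (in ms) into send batches.

    Hypothetical model: a batch opens when its first message arrives and
    is sent once linger_ms has elapsed since then, or once it holds
    max_batch messages, whichever comes first.
    """
    batches: List[List[float]] = []
    current: List[float] = []
    for t in sorted(arrivals_ms):
        # Flush the open batch if its linger window ended before this message.
        if current and t - current[0] >= linger_ms:
            batches.append(current)
            current = []
        current.append(t)
        # Flush early when the batch is full.
        if len(current) == max_batch:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


# Made-up arrival times, purely for illustration.
arrivals = [0, 5, 12, 19, 25, 31, 60]
no_linger = batch_by_linger(arrivals, linger_ms=0, max_batch=100)
with_linger = batch_by_linger(arrivals, linger_ms=20, max_batch=100)
print(len(no_linger), "sends without lingering")
print(len(with_linger), "sends with a 20 ms linger")
```

In this toy model the same seven messages go out in fewer, larger sends once a linger window is allowed, at the cost of up to 20 ms of extra latency per message, which matches the throughput trade-off mentioned in the log.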