[15:45:26] hello folks!
[15:45:33] if you want we can roll out https://gerrit.wikimedia.org/r/c/operations/puppet/+/772788
[15:45:48] it requires a rolling restart of kafka brokers, one at a time to verify settings etc..
[15:45:58] let me know if you like the idea, otherwise we can do next week
[15:46:17] (I can roll out the change but I'd need somebody to double check that the cluster is ok)
[16:02:45] elukey: /me double-checking kafka producers and consumers. The logging pipeline is using the truststore, but so far apifeatureusage is not.
[16:05:23] cwhite: o/ this change will only set kafka brokers to trust both pki/puppet certs, no new pki certs issued
[16:05:31] kafkatee looks to be using wmf-ca-certificates. elastic search nodes relay to udp localhost compat uses wmf-ca-certificates
[16:05:41] ahh, ok
[16:05:49] then, should be ok
[16:06:13] I'll be around if you're comfortable moving forward :)
[16:12:16] super, going to do some prep work and then I'll ping you :)
[16:34:03] cwhite: ok ready! downtimed kafka-logging2001 and puppet disabled on all nodes
[16:34:11] going to merge and restart 2001 if you are ok
[16:34:14] 👍
[16:37:38] two changes are made to the kafka server config
[16:37:39] -super.users=User:CN=kafka_logging-codfw_broker
[16:37:39] +super.users=User:CN=kafka-logging2001.codfw.wmnet;User:CN=kafka-logging2002.codfw.wmnet;User:CN=kafka-logging2003.codfw.wmnet;User:CN=kafka_logging-codfw_broker
[16:38:07] this one is needed so the new PKI certs (when we roll them out) can be trusted, since their CN is the hostname of the broker
[16:38:14] but we keep User:CN=kafka_logging-codfw_broker
[16:38:17] that is the actual CN
[16:38:26] and then
[16:38:27] +ssl.truststore.location=/etc/ssl/localcerts/wmf-java-cacerts
[16:38:27] +ssl.truststore.password=changeit
[16:38:31] that's it basically
[16:38:45] Great!
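Note on the super.users change above: a minimal sketch, assuming the value is simply one "User:CN=<name>" principal per broker hostname plus the existing shared CN, joined with semicolons. The helper name build_super_users is hypothetical and is not taken from the puppet change; it only illustrates how the new value in the diff is composed.

```python
def build_super_users(broker_hosts, legacy_cn):
    """Compose a super.users value: one principal per broker hostname
    (the CN the future PKI certs will carry), followed by the legacy
    shared CN that the brokers still present today."""
    principals = [f"User:CN={host}" for host in broker_hosts]
    principals.append(f"User:CN={legacy_cn}")
    # Kafka separates multiple super users with semicolons.
    return ";".join(principals)


# Hostnames and legacy CN as they appear in the diff quoted in the log.
hosts = [f"kafka-logging200{i}.codfw.wmnet" for i in (1, 2, 3)]
value = build_super_users(hosts, "kafka_logging-codfw_broker")
print(f"super.users={value}")
```

Keeping the legacy CN last in the list mirrors the diff: the old principal stays valid until every broker presents a hostname-based certificate.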
[16:39:03] restarting the first broker :)
[16:40:54] 2001 done, atm nothing weird popped up
[16:42:13] I am going to wait a bit for partition leader to be redistributed
[16:43:47] kafkacat -C -b localhost:9093 -t rsyslog-info -X security.protocol=SSL works nicely as well
[16:43:57] \o/
[16:44:15] excellent, well done elukey
[16:44:38] let's wait a sec before victory but it looks good :)
[16:48:35] fair enough! hope I didn't jinx it
[16:53:17] all good, broker has recovered :)
[16:53:21] will proceed in a bit with 2002
[16:53:38] thanks a lot folks for being the first one to migrate to PKI
[17:06:18] 2002 done, 2003 done in a bit
[17:13:52] cwhite: codfw done, all good afaics, ok to proceed with eqiad?
[17:14:18] elukey: SGTM /me is watching
[17:18:05] 1003 restarted
[17:23:27] basically recovered, will wait a bit and then proceed with 2002
[17:23:29] err 1002
[17:32:43] still looking good over here
[17:33:06] yep I got distracted by puppet breaking, proceeding :D
[17:38:38] 1002 restarted
[17:50:05] 1001 restarted, as soon as it recovers we are done :)
[17:50:49] cwhite: the next step will be to get the pki cert for one broker (say 2001), so I'll wait for all the clients to be using the new truststore. Please lemme know if you want me to help to send patches etc..
[17:51:42] Sounds good, thanks! I'll see about getting apifeatureusage using the right truststore :)
[17:54:06] <3
[17:54:20] 1001 recovered, all good!
[18:27:28] cheers, thank you elukey
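Note on the procedure above: the restarts were done one broker at a time, waiting for each to recover before touching the next. A minimal sketch of that loop follows, assuming two caller-supplied callables, restart and is_healthy, which are hypothetical placeholders (in the log these steps were done by hand, not with this code). The stubs in the demo only simulate a broker that needs a few polls to recover.

```python
import time


def rolling_restart(brokers, restart, is_healthy, poll_seconds=0, max_polls=10):
    """Restart brokers strictly one at a time.

    After each restart, poll is_healthy(broker) up to max_polls times.
    If the broker never reports healthy, stop the rollout by raising,
    so the remaining brokers are left untouched."""
    done = []
    for broker in brokers:
        restart(broker)
        for _ in range(max_polls):
            if is_healthy(broker):
                break
            time.sleep(poll_seconds)
        else:
            raise RuntimeError(f"{broker} did not recover; stopping rollout")
        done.append(broker)
    return done


# Demo with stubs: each broker reports healthy on the third poll.
poll_counts = {}


def stub_restart(broker):
    poll_counts[broker] = 0


def stub_is_healthy(broker):
    poll_counts[broker] += 1
    return poll_counts[broker] >= 3


# Same order as the eqiad restarts in the log.
order = ["kafka-logging1003", "kafka-logging1002", "kafka-logging1001"]
restarted = rolling_restart(order, stub_restart, stub_is_healthy)
print(restarted)
```

Raising on the first broker that fails to recover keeps the blast radius to a single node, which matches the "wait a bit, then proceed" approach taken in the conversation.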