[02:43:33] serviceops, SRE, Datacenter-Switchover: Document communication expectations around planning a DC switchover - https://phabricator.wikimedia.org/T285806 (sgrabarczuk) - (1) From my perspective, the switchover went smoothly. Most tasks were well documented and automated. I know of no serious consequenc...
[04:39:35] serviceops, DBA, User-fgiunchedi, cloud-services-team (Kanban): Roll restart haproxy to apply updated configuration - https://phabricator.wikimedia.org/T287574 (Marostegui) I have disabled puppet on the active dbproxies: * dbproxy1013 * dbproxy1014 * dbproxy1020
[04:53:47] serviceops, DBA, Toolhub, User-bd808: Discuss database needs with the DBA team - https://phabricator.wikimedia.org/T271480 (Marostegui) >>! In T271480#7252163, @bd808 wrote: >>>! In T271480#7251095, @Marostegui wrote: >>>>! In T271480#7225145, @bd808 wrote: >>> * toolhub: user with CRUD rights on...
[05:04:51] serviceops, DBA, Patch-For-Review, User-fgiunchedi, cloud-services-team (Kanban): Roll restart haproxy to apply updated configuration - https://phabricator.wikimedia.org/T287574 (Marostegui) The above patch is ready to be merged and deployed once the standby dbproxies are done.
[07:26:21] hi everybody, today I am going to meet with Keith and Razzi for the rebalance of kafka-main's topic partitions (https://phabricator.wikimedia.org/T225005)
[07:26:47] for Jumbo we used a tool able to schedule the best layout for how to place the partitions
[07:26:53] and it worked well
[07:27:32] basically for every topic there will be some commands to execute, and the worst that happens is some temporary alert for under-replicated partitions (since they are being moved)
[07:27:51] we want to do it since we have two new brokers that are not getting traffic basically
[07:38:29] elukey: absolutely
[07:41:09] ack, so we'll likely create a list of commands to execute, and then set up a schedule for them, it will take a bit but if we do it slowly nobody will notice
[08:39:34] serviceops, DBA, Patch-For-Review, User-fgiunchedi, cloud-services-team (Kanban): Roll restart haproxy to apply updated configuration - https://phabricator.wikimedia.org/T287574 (fgiunchedi) Thank you @Marostegui ! To recap here's my plan: # stop puppet on `C:haproxy` # merge https://gerrit...
[09:56:41] serviceops, DBA, Toolhub, User-bd808: Discuss database needs with the DBA team - https://phabricator.wikimedia.org/T271480 (JMeybohm) @Marostegui please find the up to date Pod IP ranges at https://netbox.wikimedia.org/search/?q=kubernetes+pod&obj_type=#prefixes
[09:57:00] serviceops, DBA, Patch-For-Review, User-fgiunchedi, cloud-services-team (Kanban): Roll restart haproxy to apply updated configuration - https://phabricator.wikimedia.org/T287574 (Marostegui) Open→Resolved The proxies were failed over, and the old active ones got puppet enabled + run a...
[09:58:08] serviceops, DBA, Toolhub, User-bd808: Discuss database needs with the DBA team - https://phabricator.wikimedia.org/T271480 (Marostegui) Thanks - 10.64.% and 10.192.% should work then
[10:06:44] serviceops, DBA, Toolhub, User-bd808: Discuss database needs with the DBA team - https://phabricator.wikimedia.org/T271480 (Marostegui) Recap - @bd808 please let me know if this looks good: cluster: `m5` db name: `toolhub` entry point: `m5-master.eqiad.wmnet` db users: * `toolhub_admin` Grants:...
[14:39:07] serviceops, DBA, Toolhub, User-bd808: Discuss database needs with the DBA team - https://phabricator.wikimedia.org/T271480 (bd808) >>! In T271480#7255015, @Marostegui wrote: > Recap - @bd808 please let me know if this looks good: > > cluster: `m5` > db name: `toolhub` > entry point: `m5-master.e...
[17:00:00] legoktm, rzl: FYI I've deployed the latest spicerack to cumin[1001,2002] that has your latest patches (see https://doc.wikimedia.org/spicerack/master/release.html )
[17:00:16] volans: thanks!
[17:00:24] I didn't test the stuff related to switchdc specifically
[17:00:42] ack, assume we'll do another live test before switchback
[17:00:45] (x2 and dnsdisc)
[17:00:49] I hope so :)
[17:00:58] I'm hoping to get that service-downtime thing done and released before then too
[17:01:10] great
[17:01:19] lmk if you encounter any issue
[17:02:05] I figured I'd barely get to talk to you during the workday anymore, since I moved out west! I forgot you keep sailor's hours
[17:42:27] volans: woot, thanks :D
[18:31:21] cheers
[22:43:16] serviceops, SRE, Wikimedia-production-error: PHP7 corruption reports in 2020-2021 (Call on wrong object, etc.) - https://phabricator.wikimedia.org/T245183 (Krinkle)