[05:20:06] s3 eqiad master switched
[07:37:48] Amir1: can you review this when you have a chance? https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1025670 I need it to be able to set up dbctl for es6
[08:51:46] marostegui: I check it now, OTOH, can you tell me once you're done with the old master of s3? I have a couple of stuff
[08:52:17] Amir1: will do, just a few mins
[08:54:47] Amir1: done
[08:54:57] Thanks
[09:08:34] Amir1: is it ok for me to run a schema change on s3/db1157 ?
[09:10:54] arnaudb: I just started one :D
[09:11:20] could you let me know when it's my turn then? :D
[09:12:27] Amir1: so to be in sync, https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1025670 this is needed to be able to set the sections in dbctl
[09:12:48] arnaudb: sure
[09:12:55] marostegui: noted 😭
[09:13:06] so I am going to go ahead and see what happens
[09:13:40] fingers crossed
[09:31:38] so apparently that's not enough
[09:31:41] 'es6' does not match any of the regexes: '^(s[1-8]|s1[01]|es[12345]|x[12])$'
[09:33:16] the regex is clearly from somehwere
[09:33:29] yeah, but I cannot find where
[09:33:37] I added es6 to puppet too as section for dbctl
[09:34:08] marostegui: modules/profile/files/conftool/json-schema/dbconfig/instance.schema?
[09:34:10] in puppet
[09:34:20] ha
[09:34:27] I'm seeing it in modules/profile/files/conftool/json-schema/mediawiki-config/dbconfig.schema
[09:34:31] as well
[09:34:35] I added it at https://gerrit.wikimedia.org/r/c/operations/puppet/+/1025603
[09:34:44] I will add it to those two
[09:44:23] that was it!
[09:44:39] I will modify the doc so we know it also needs those two places
[09:46:47] Amir1: es6 eqiad pushed
[09:46:52] Going for codfw config now
[09:51:26] Amir1: es6 codfw pushed
[09:53:54] oh nice
[09:54:26] marostegui: Do you want me to do the mw script run?
[09:54:32] Amir1: let's do it yeah
[09:54:50] I am not disabling notifications just yet, in case something breaks (eg: replication)
[09:55:02] sure
[10:00:34] marostegui: running
[10:00:41] yay
[10:01:10] FWIW, it was this:
[10:01:11] ladsgroup@mwmaint1002:/srv/mediawiki/php-1.43.0-wmf.2/extensions/WikimediaMaintenance/storage$ ./make-all-blobs es1038 blobs_cluster30
[10:01:44] I'll add that to the doc
[10:02:34] I see the tables in the hosts and the grants look correct as well
[10:02:41] (from mw side of things)
[10:03:54] yeah
[10:04:04] let's enable writes tomorrow?
[10:04:12] sure
[10:04:19] ah no
[10:04:21] it is a holiday
[10:04:23] let's do it next week
[10:04:37] note I haven't prepared yet backups for the new hosts
[10:04:53] jynus: ah fine, we can wait :)
[10:05:06] I will set up es7 in the mean time
[10:05:10] so we can enable both at the same time
[10:37:14] FYI https://gerrit.wikimedia.org/r/c/operations/puppet/+/1025714
[11:02:59] I am going to stop deploying at this point
[12:53:29] urandom: o/ so Restbase's Cassandra fully on PKI! \o/
[12:53:39] going to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/1024738 to clean up
[12:54:48] now the fun part - session store :D
[12:58:07] \o/
[13:00:52] in theory the truststore change could be done anytime, but if we want to be really safe we can schedule maintenance on https://wikitech.wikimedia.org/wiki/Deployments (maybe the MW infra slot) and do it next week (i am out tomorrow)
[13:10:56] what do you think?
[13:11:24] in theory we could even do it without depooling kask, in practice we may want to be extra sure?
[13:11:32] I am not 100% familiar with kask :(
[14:15:51] PROBLEM - MariaDB sustained replica lag on s6 on db2114 is CRITICAL: 8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2114&var-port=9104
[14:16:52] RECOVERY - MariaDB sustained replica lag on s6 on db2114 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2114&var-port=9104
[14:23:24] elukey: I agree, I think we can do the truststore change anytime
[14:25:41] (without depooling)
[14:30:38] also, wrt my comments about depooling. It's not a requirement for Kask, I was only suggesting it as a way of entirely isolating the data-center we change, so that in the event it *did* result in some issue, it wouldn't impact users. I don't feel strongly that is necessary, but it's very easy to do; I think it's perfectly acceptable to do in this situation (without using a deployment window, either).
[15:16:48] arnaudb: I haven't forgotten about your replica. It's still ongoing
[15:21:13] urandom: I have some time now if you want to rollout the truststore :) We can do it with puppet disabled in eqiad so in case we rollback
[15:21:51] elukey: I have ~40 mins before a meeting, if you feel that's time enough then let's do it
[15:23:30] I'd say so yes, doing it
[15:23:34] so to recap:
[15:23:40] 1) disable puppet on all session store nodes
[15:23:49] 2) merge the change, rollout on one node, restart the instances
[15:23:54] (codfw node)
[15:24:02] 3) complete codfw
[15:24:08] 4) rollout eqiad
[15:24:11] does it sound good?
[15:34:02] restarting 2004 :)
[15:35:14] it does!
[15:35:15] sorry
[15:38:46] urandom: 2004 is up! Do you want to double check? Otherwise I'll proceed with the rest of codfw
[15:41:44] Looks ok to me!
[15:41:59] elukey: I'd say proceed
[15:42:13] super, proceeding
[15:44:49] 2005 and 2006 in progress
[15:45:31] arnaudb: db1157 is yours, I haven't pooled it yet so please repool once done
[16:14:14] urandom: session store done!
[16:14:37] I'll prep the changes for PKI, I think we can do it next week
[16:14:49] elukey: sgtm; thanks!
[16:15:01] after that we'll also be able to do a big cleanup in puppet private etc..
[16:15:05] and also update the docs
[16:27:40] elukey: oh yes, I'm looking forward to that
[16:27:51] that module is getting...unwieldy
[16:45:00] urandom: filed the changes for sessionstore, I left a comment for the kask changes in deployment-charts, sorry for the confusion I thought we needed a change to the chart too
[16:45:47] going afk for the evening, have a nice rest of the day folks :)
[16:46:36] elukey: enjoy your evening
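
Note on the 09:31:41 validation error: the section regex quoted there accepts es1 through es5 only, which is why 'es6' was rejected until the two schema files were updated. A minimal Python sketch of that check follows. The first pattern is copied verbatim from the log; the widened pattern and the helper name section_is_valid are assumptions for illustration and may not match what was actually merged in puppet.

```python
import re

# Pattern copied verbatim from the error message in the log (09:31:41).
OLD_SECTION_PATTERN = r"^(s[1-8]|s1[01]|es[12345]|x[12])$"

# Hypothetical widened pattern that also admits es6. This is an assumption:
# the actual schema change in puppet may be written differently.
NEW_SECTION_PATTERN = r"^(s[1-8]|s1[01]|es[1-6]|x[12])$"


def section_is_valid(section: str, pattern: str) -> bool:
    """Return True if the section name matches the given schema pattern."""
    return re.match(pattern, section) is not None


if __name__ == "__main__":
    for name in ("s3", "es5", "es6", "es7"):
        print(
            name,
            "old:", section_is_valid(name, OLD_SECTION_PATTERN),
            "new:", section_is_valid(name, NEW_SECTION_PATTERN),
        )
```

Under the same assumption, es7 (mentioned at 10:05:06) would still be rejected by the widened pattern, so another schema update would be needed before adding it.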