[03:11:12] !log admin 'cp /etc/apt/sources.list /etc/apt/sources.list.prepuppet' on all VMs. Backing up state before puppetizing sources.list with https://gerrit.wikimedia.org/r/c/operations/puppet/+/751498
[03:11:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[07:01:06] Kxammwm
[07:01:07] Xl
[07:01:16] Dajaklgwy t
[08:55:06] Hi, are you a human? (re @Karrihk: Dajaklgwy t)
[09:31:54] Dssd
[09:31:56] Bn SSC
[12:39:49] hi guys
[12:40:04] Where are you from?
[12:42:23] Bdnddmdj (re @wmtelegram_bot: hi guys)
[12:42:35] Jdnnshyndn bdje
[12:42:40] Dmdjwys yu rude in
[12:42:52] Nddjeu brrr rmhseb e veggwge. Bdbdhd d
[13:33:50] I think we can count that as a “no” (re @jhsoby: Hi, are you a human?)
[14:53:52] Nsns (re @Karrihk: Dajaklgwy t)
[14:53:58] Dhrkejee time (re @Karrihk: Dajaklgwy t)
[14:54:03] RR (re @Karrihk: Dajaklgwy t)
[14:54:10] RR Nagar Trichy (re @Karrihk: Dajaklgwy t)
[15:06:10] Hey hey, could a wikitech admin with powers™ please edit https://wikitech.wikimedia.org/wiki/MediaWiki:Sitenotice to state:
[15:06:26] The [[meta:Coolest Tool Award|Coolest Tool Award 2021]] ceremony will be live streamed on [https://zonestamp.toolforge.org/1642179615 Friday 14 January 2022, 17:00 UTC].
[15:06:33] ? Thanks in advance!
[15:07:41] sure, done
[15:11:05] taavi: Uh, that was quick. :) Thanks a lot! <3
[16:10:49] taavi: thanks for helping andre out. There is some magic that we try to preserve when setting site notices on wikitech that you were likely not aware of. https://wikitech.wikimedia.org/wiki/MediaWiki_talk:Sitenotice#Template_for_adding_Sitenotice_content has the details. (/me will add the missing bits to the current message)
[16:11:19] argh. I'll take notes for next year :)
[16:11:49] it's a no-worries thing :)
[16:12:41] thanks bd808
[16:13:06] can we add an editnotice to let the next person know as well?
[16:13:26] I was just poking around to figure out how to do that :)
[16:13:49] I think that the "editnotice-8-Sitenotice" message might be the right place... will experiment
[16:19:47] taavi: How does https://wikitech.wikimedia.org/w/index.php?title=MediaWiki:Sitenotice&action=edit look for you now?
[16:26:08] bd808: awesome!
[16:26:23] mediawiki can do some cool stuff :)
[16:28:50] {{cn}} ;)
[16:34:40] Hosting Wikidata isn't a cool thing that MediaWiki can do? :P
[16:35:03] touché
[16:37:52] on the other hand, hosting Wikifunctions won’t count. MediaWiki doesn’t even do most of the work!
[18:18:38] Hi andrewbogott, yt? Thinking of rolling out the cloud db views https://phabricator.wikimedia.org/T298505 shortly
[18:19:47] I am mostly around! Have at it, and I'll be around to help troubleshoot if something goes awry
[18:20:16] I have an appointment in ~90 minutes, nothing much until then
[18:38:01] Ok cool andrewbogott. Could you look at this patch and let me know if I'm on the right track to depool clouddb1013? https://gerrit.wikimedia.org/r/c/operations/puppet/+/751779
[18:38:48] I know last time we didn't depool and we probably don't need to this time, but I'm doing all the steps for the practice
[18:40:36] I responded on the bug, I'm not positive but that's my best guess
[18:40:55] razzi: why are we depooling s1/s3 if the centralauth db is only on s7?
[18:41:21] Ah yeah taavi, I got the section wrong
[18:52:35] I guess I'm not clear on something else andrewbogott - what is the distinction between "web" and "analytics" as applied to wikireplicas?
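The depool patch being discussed (Gerrit 751779) targets one section of a multi-instance clouddb host, so it helps to confirm which sections the host actually serves before writing the override. A minimal sketch of that check, assuming the host runs one mariadb@<section> systemd unit and one socket per section (a common multi-instance layout, not something stated in the channel):

```bash
# On the clouddb host, list the per-section MariaDB instances.
# Assumption: multi-instance hosts run one mariadb@<section> unit per section.
systemctl list-units 'mariadb@*' --no-pager

# Each section instance listens on its own socket; connecting to a specific
# section looks roughly like this (the socket path is an assumption):
sudo mysql -S /run/mysqld/mysqld.s7.sock -e 'SELECT @@hostname, @@port;'
```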
[18:53:57] I see comments like "LVS DNS of [dbproxy1019] is wikireplicas-a.wikimedia.org" and the other is wikireplicas-b
[18:54:08] hopefully bd808 or balloons will correct me on this. I believe that they're two different sources of traffic, and that one uses 1018 as primary and 1019 as the fallback, and the other is the other way around.
[18:54:40] But really I don't know a lot. I don't think you need to know in order to depool things, since you don't want /any/ traffic hitting the depooled server.
[18:56:08] I think https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas#Physical_database_server_layer might help answer your question
[18:57:12] ah yes, that's helpful balloons
[18:58:05] andrewbogott had it right. 1018 is analytics, 1019 is web
[18:58:56] so each pair is active/active, but different workloads have a preference about which server gets hit first. When you depool one or the other there's not an outage, but some change in behavior for whichever workload's 'favorite' was depooled
[19:04:57] Another question: since I'm going to be doing maintenance on s7, and clouddb1014 hosts s2 and s7, do I need to do section overrides for s2 and s7? I assume so, since maintain-views runs at the host level
[19:08:59] I think that's how it's intended to work, since clouddb1014 and clouddb1018 host the same sections. So adding in all the overrides for clouddb1018 depools 1014
[19:12:11] Updated the patch: https://gerrit.wikimedia.org/r/c/operations/puppet/+/751779
[19:12:11] Thanks all for the tips thus far!
[19:20:37] razzi: I think the sections run on different mysql processes
[19:20:53] so you shouldn't really need to do anything with s2 even though it's on the same host
[19:21:01] that said, it's mostly harmless to depool more than you need
[19:23:39] Sure thing. I updated it to not touch s2: https://gerrit.wikimedia.org/r/c/operations/puppet/+/751779
[19:24:59] looks good!
[19:25:33] Cool, I'll go ahead and merge that
[19:52:21] razzi: I'm going to vanish in a few minutes. Are things going ok?
[19:53:23] Probably obvious, but after the maintain run finishes you'll need to revert that patch and then make a new one that depools 1018 instead, etc. etc.
[19:54:32] Yep yep, just merged the views update patch and I'm about to run maintain-views on clouddb1014
[19:54:53] I don't see any u## or s## processes, so it looks like it's depooled fine
[19:55:16] nice
[19:55:59] `sudo maintain-views --table globaluser --databases centralauth --dry-run` looks good
[19:56:42] ran it again without --dry-run, looks good
[20:02:07] Ah actually I think I need to run --databases centralauth_p
[20:02:46] hm no, that's not how the script works
[20:03:13] I'm a bit confused because I tested a query with the new field gu_hidden_level:
[20:03:13] `MariaDB [centralauth_p]> select gu_hidden_level from globaluser limit 1;`
[20:03:13] `ERROR 1054 (42S22): Unknown column 'gu_hidden_level' in 'field list'`
[20:03:47] so either the view wasn't updated or I'm querying the new field wrong
[20:03:56] what does `show create table globaluser;` on centralauth_p say?
[20:03:59] the query looks correct
[20:04:31] clouddbs don't seem to be included in my current shell access :/
[20:07:03] taavi: it's a lot of output, but you can see a paste here: https://phabricator.wikimedia.org/P18404
[20:07:26] hmm, can you add a \G instead of ; for a bit more readable output?
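A minimal sketch of the check taavi is asking for here: dump the view definition with \G and look for the new column. It assumes root access via sudo on the clouddb host and the per-section socket path sketched earlier; the view itself lives in the centralauth_p schema, as the log shows.

```bash
# Dump the current view definition in vertical (\G) format and count how many
# times the new column appears in it. Socket path is an assumption.
sudo mysql -S /run/mysqld/mysqld.s7.sock centralauth_p \
    -e 'SHOW CREATE TABLE globaluser\G' | grep -c 'gu_hidden_level'
# 0 means maintain-views has not written the new column into the view yet.
```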
[20:07:54] but I don't think it re-created the view
[20:08:55] ah yeah, \G is better: https://phabricator.wikimedia.org/P18405
[20:09:03] but yeah, there's no gu_hidden_level
[20:13:00] taavi: even if you had shell to the clouddb boxes you wouldn't be able to do much. Things there are locked down so that only full prod roots (not mortals like you and me) can auth to the db in a way that exposes the underlying tables.
[20:15:10] razzi: hmmm... so the closure of T297094 was premature or ... something?
[20:15:10] T297094: Add globaluser.gu_hidden_level column to production - https://phabricator.wikimedia.org/T297094
[20:16:14] CA is definitely writing data to that column, so I would expect that something would have broken by now if the column did not exist on the non-_p databases
[20:17:32] oh.. the paste at P18405 is showing the create for the current view, not the raw table.
[20:17:42] I'm curious to try creating the view manually; if that works, there's a problem with maintain-views.py or the way I ran it
[20:18:19] I think it'd be fine to create a copy view `globaluser_temporary_debug_view` and immediately delete it
[20:18:40] razzi: in that test select you did, were you authed to the db instance as root or some other user?
[20:19:22] * bd808 may be confused about what's not working as expected
[20:19:44] I used sudo and `select current_user()` reports root@localhost
[20:22:22] ok. I think I'm catching up. The paste at P18405 is from the centralauth_p database, which is the user-facing view layer over the centralauth "raw" database on the same instance. The create there is showing the view as it exists on that instance.
[20:22:40] hm ok, sure enough when I run `sudo maintain-views --table globaluser --databases centralauth --dry-run --debug` I see the `CREATE OR REPLACE ...` doesn't have `gu_hidden_level`
[20:23:18] did you run puppet on the host to pull the updated config?
[20:23:39] Ah yeah, I forgot to puppet-merge
[20:23:50] I didn't forget to run puppet :) but ...
[20:23:54] E_TOOMANYSTEPS
[20:24:46] yep, I had 2 puppet configs, and I forgot the second puppet merge. Thanks for the debug help tho bd808 taavi !
[20:25:35] ok, everything's working on clouddb1014
[20:26:23] now to repool it, depool 1018, run the change on 1018, repool that, and done!
[20:27:12] someday: https://phabricator.wikimedia.org/T297026
[20:27:20] "Automate maintain-views workflow"
[20:29:11] razzi: +1. We could use a heck of a lot more gitops around here
[20:30:11] I somehow had a feeling we had 4 copies of each section and not 2.. apparently I just remembered incorrectly? :/
[20:33:07] I think we have 3, if you include clouddb1021 (wmcs::db::wikireplicas::dedicated::analytics_multiinstance)
[20:33:27] since that host has all 8 sections
[20:34:08] clouddb1021 is only for the analytics sqoop job
[20:36:00] We did have 3 of each section in the prior cluster that was all multiinstance, but only ended up with 2 of each with the multisource cluster.
[20:36:38] but I think things are still showing that the new cluster is more performant than the old, even with fewer copies of things
[20:41:48] hmm
[20:42:05] razzi: did you recreate the views for the localuser table too?
[20:42:14] the patch changed that too
[20:43:23] I did not taavi, good catch
[21:00:53] razzi: I am briefly back, is everything good?
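The root cause above was ordering: the views config only reaches the host after the change is merged in Gerrit, synced with puppet-merge, and puppet has run, and the patch touched two tables (globaluser and localuser). A rough sketch of that sequence under those assumptions; the puppet-merge step runs on a puppetmaster and the exact wrapper invocations may differ from what is shown here, while the maintain-views commands are the ones quoted in the log, repeated for localuser.

```bash
# 1. After the change is merged in Gerrit, sync it onto the puppetmasters
#    (this is the step that was missed). Exact host/invocation are assumptions.
sudo puppet-merge            # run on the puppetmaster

# 2. Pull the updated maintain-views config onto the clouddb host.
sudo puppet agent --test     # run on clouddb1014

# 3. Re-generate the views for both tables the patch touched, dry-run first.
sudo maintain-views --table globaluser --databases centralauth --dry-run
sudo maintain-views --table globaluser --databases centralauth
sudo maintain-views --table localuser --databases centralauth --dry-run
sudo maintain-views --table localuser --databases centralauth
```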
[21:01:25] (and, btw, I'm extremely tempted to automate all this but I haven't yet convinced myself that it's the highest priority)
[21:09:02] yeah andrewbogott, all's well, still have a bunch more steps to do but I'm in a meeting at the moment
[21:09:10] ok!
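The "bunch more steps" razzi mentions (repool clouddb1014, depool clouddb1018, re-run the views there, repool) are the loop that T297026, "Automate maintain-views workflow", would fold into one command. A purely hypothetical sketch of what such a wrapper could look like: nothing below exists as a tool, the host names are just the pair discussed above, and the depool/repool steps are placeholders for what are currently hand-written puppet patches.

```bash
#!/bin/bash
# Hypothetical automation sketch for T297026. Host names, loop structure, and
# the depool/repool placeholders are illustrations, not an existing tool.
set -euo pipefail

hosts=(clouddb1014.eqiad.wmnet clouddb1018.eqiad.wmnet)

for host in "${hosts[@]}"; do
    echo "TODO: depool ${host} (today: a puppet patch + puppet-merge + puppet run)"
    ssh "$host" sudo maintain-views --table globaluser --databases centralauth --dry-run
    ssh "$host" sudo maintain-views --table globaluser --databases centralauth
    echo "TODO: repool ${host} (today: revert the depool patch)"
done
```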