[08:11:53] godog: I've just gone to look at ms-be2028 and it looks like maybe swift-drive-audit and puppet are fighting over fstab? sdg1 has been repeatedly commented out (and presumably re-added) - there are now 121 sdg1 entries in fstab on that node...
[08:14:34] (I'll open a ticket about the disk - that's 3 dead disks in ms-codfw now :( )
[09:34:55] also: hardware RAID is terrible
[10:59:56] gods forgive me for running this bash
[11:00:00] root@db2109:/srv/sqldata# find . -type f -printf '%s %p\n' | grep templatelinks | grep -i ibd | sort -nr | python3 -c "import sys; print(sum(int(l.split(' ')[0]) for l in sys.stdin)/(1024*1024*1024))"
[11:00:01] 138.36166381835938
[11:02:40] stylish ;-)
[11:08:02] I've been trying to figure out why s3 in total is as large as s5 and s6 combined (and then some). Sure, having 1000 wikis amplifies every issue, but e.g. the revision table in total is less than 100GB. We have a couple of quite large wikis (ruwikinews alone is taking up 10%, but after some of the normalizations it'll get better). OTOH, it's not that big, so all of the optimizations will bring it to a better state, and it doesn't get much read, so meh
[11:30:39] my estimations say that if we finish the optimization of pagelinks, templatelinks and externallinks, all sections will go under 1TB, with the exception of s4 and s8, but both of them will be quite close (1.1-1.2TB)
[11:32:36] the total of the core dbs will go from the 10TB it is right now to 7-8TB
[11:33:14] (the total really doesn't matter, we can add more sections if collectively it grows too big)
[15:38:15] Hello. I have a maintain-views change to run for the wikireplicas: https://phabricator.wikimedia.org/T313281 which is adding a new table to the allowed_logtypes
[15:38:57] 1) Is this a good time to do it, or do you have anything in-flight that means I should defer?
[15:39:49] 2) Should I depool each clouddb server before running the script? I think that these instructions suggest that I should, but I'm not sure: https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas#Updating_views
[15:41:09] btullis: I have done maintain-views before, I can tell you
[15:41:27] first, I have something on s5 but we have time
[15:41:30] Great, thanks Amir1
[15:41:34] and it should be fine
[15:42:21] second, it should be fine to run it live if there aren't any live queries happening
[15:42:57] so for most tables you can just run it, but logging is a bit popular, so we probably need to depool
[15:43:06] (I have done the depool two or three times)
[15:43:44] OK, thanks. This depooling is an haproxy hiera change in puppet, right?
[15:43:55] btullis: yup
[15:44:37] https://phabricator.wikimedia.org/T305064 see patches
[15:44:56] Or I wonder if I could use confctl to depool all 'analytics' or all 'web' now.
[15:45:49] Like I did here: https://phabricator.wikimedia.org/T298940#8122811
[15:53:06] That would work, wouldn't it? If I depool dbproxy1019 with confctl, then that effectively depools all of clouddb10[13-16] - the 'web' service - and moves the workload to 'analytics' temporarily.
[15:55:01] btullis: sorry I was afk
[15:55:07] no that won't work
[15:55:34] Oh, ok. :-)
[15:55:34] To my knowledge
[16:16:18] I don't mean to sound contrary, but I still think this will work. We changed it fairly recently, so that both dbproxy1018 and dbproxy1019 can each serve traffic for the wikireplicas-a and wikireplicas-b services.
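(A minimal sketch, not from the log, of the kind of confctl depool being proposed here; the conftool object name and tag syntax are assumptions and would need to match the real objects:)

    # Inspect the conftool object for the 'web' proxy before touching it.
    # The FQDN below is an assumption based on the host mentioned above.
    confctl select 'name=dbproxy1019.eqiad.wmnet' get
    # Depool it so 'web' wikireplica traffic fails over to the other proxy/backends.
    confctl select 'name=dbproxy1019.eqiad.wmnet' set/pooled=no
    # Repool it once maintain-views has been run.
    confctl select 'name=dbproxy1019.eqiad.wmnet' set/pooled=yes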
[16:18:19] They still have their preferred backend database servers, so if we change the conftool settings to this:
[16:18:23] https://www.irccloud.com/pastebin/0T48mYIB/
[16:19:12] ...then all requests to the 'web' replicas go to 'analytics' backends, like this:
[16:19:25] https://www.irccloud.com/pastebin/wfxsNsyR/
[16:20:32] That would give me the means to run `maintain-views` on clouddb10[13-16] without any load, unless I'm missing something.
[16:22:00] we can definitely try, if it's new, then nice
[16:22:19] less work
[16:22:21] :D
[16:24:40] Cool, ok. Proceeding to depool the 'web' wikireplicas now.
[16:27:32] I think that's done.
[16:27:45] I think you need to wait for it to drain
[16:28:17] there will be lots of existing connections and queries, some might take a while
[16:28:44] 👍
[16:39:18] Amir1: find . -type f -printf '%s %p\n' | grep templatelinks | grep -i ibd → you could have done find . -type f -name templatelinks.ibd -printf '%s\n'
[16:39:32] the sort is unneeded if you're just going to sum them all
[16:40:03] and summing all entries, I agree, is not so trivial in shell
[16:40:12] Platonides: so it grew organically, at first I wanted to see which tables are the biggest, then filtered templatelinks and just added more pipes
[16:40:32] perhaps an awk '{ value+=$1; } END{ print value; }'
[16:40:47] oh awk + sed, my favorite os
[16:41:03] heh, it really looks like it did :D
[16:41:35] you could use plain bash as well: while read value; do sum=$((sum + value)); done; echo $sum
[16:42:49] yeah
[17:01:52] It's taking a long time to drain. I might have to leave it like this until tomorrow morning, UK time.
[17:06:36] there is a high chance that the remaining queries aren't touching the logging table
[17:06:42] but whatever you prefer
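(Pulling the suggestions above together: a sketch of the same size calculation with find and awk instead of the grep/sort/python pipeline. The path and table pattern are carried over from the earlier one-liner; GNU find's -printf is assumed to be available, as it was above:)

    # Sum the on-disk size of every templatelinks .ibd file (including per-wiki
    # copies and partitions) under the MariaDB datadir and report it in GiB.
    find /srv/sqldata -type f -name '*templatelinks*.ibd' -printf '%s\n' \
        | awk '{ sum += $1 } END { printf "%.2f GiB\n", sum / (1024*1024*1024) }'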