[06:29:02] jynus: I think I have fixed the layout [06:35:18] nice catch on https://phabricator.wikimedia.org/T296507 Amir1! [06:35:37] :D [06:36:04] I woke up and thought my script broke or something and saw the poor thing has been waiting politely for nine hours, checking every minute [06:36:36] hahaha [06:36:41] Didn't have much rest [06:39:14] your script, I meant [06:40:11] oh okay. yeah, it's a machine, it's better than humans not having rest :D [06:40:30] well, you are pretty close to them lately [06:41:09] :D I promise to reduce it once we clean up the most terrible shit [06:41:45] getting breakfast [08:23:13] I go do some shopping, I'll be back [08:59:34] I have added stuff to the Monday meeting doc, if you have relevant stuff, please add it too! [09:10:40] as db1139 will take another day from now to catch up replication, I will do the topology change on Monday [09:10:53] Amir1: thoughts on https://phabricator.wikimedia.org/T286552? [09:16:39] back [09:17:06] marostegui: I think I can do the testing. [09:17:13] And handle the rest [09:17:25] sure, happy to help if needed! [09:32:11] marostegui: so regarding T296274, the missing 10.% user is a bit complex. 
There is no wikiadmin in hosts mentioned, except in s8, for the s4 one, it's test-commons it seems, didn't check the s1 one [09:32:11] T296274: Clean up wikiadmin GRANTs mess - https://phabricator.wikimedia.org/T296274 [09:32:50] Amir1: just by looking at those hosts, I think most of them aren't production [09:32:59] db1140, db1139 are sources [09:33:05] db1133 is test, same as db1128 [09:33:10] db1125 and db1124 are test cluster [09:33:14] and db1111 i think it is s8 [09:33:19] db1177 I don't remember, let me check [09:33:34] so db1177 and db1111 are production [09:33:56] yeah, the s8 ones are the weird ones, the rest we can ignore [09:34:06] From all those, only db1111 and db1177 would need it and the backup sources maybe too, if we need to restore stuff it'd be good if they had them [09:35:00] Amir1: keep in mind that db1111 (I haven't checked db1177) has 10.192.% and 10.64.% so that'd have covered for 10.% [09:35:29] okay, I create it for sources as well. Is it okay if I depool the s8 ones, run the user thingies and repool them? 
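The 10.% point above hinges on MySQL/MariaDB account host parts using LIKE-style wildcards, where % matches any suffix. A minimal sketch of that matching (the helper name is mine, and treating 10.64.% / 10.192.% as the eqiad/codfw private ranges is an inference from context):

```python
import re

def mysql_host_matches(pattern: str, host: str) -> bool:
    """MySQL-style host pattern match: '%' is a multi-character
    wildcard and '_' a single-character wildcard (LIKE semantics).
    Assumes Python 3.7+, where re.escape leaves % and _ alone."""
    regex = re.escape(pattern).replace('%', '.*').replace('_', '.')
    return re.fullmatch(regex, host) is not None

# A grant for 10.64.% covers a host in that range...
assert mysql_host_matches('10.64.%', '10.64.0.15')
# ...and so does the broader 10.%, which is why having both specific
# subnets already granted can make a 10.% entry redundant:
assert mysql_host_matches('10.%', '10.64.0.15')
assert not mysql_host_matches('10.64.%', '10.192.32.7')
```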
[09:35:40] I also need to double check the queries with you [09:35:54] sure [09:36:11] Amir1: For the sources I am not sure what we currently do regarding grants, so double check with jynus [09:36:15] if not I can check other sources [09:37:37] in theory, backup sources have the wiki grants prepared in case of emergency, but I was waiting, for example, on db1139 to add the right ones (only grants pending to add) [09:38:27] if you tell me what to add to db1139:s1, I can review the others for you [09:39:12] (I was actually about to do other maintenance on them) [09:41:04] jynus: I want to run this [09:41:07] https://www.irccloud.com/pastebin/LFJE62QR/ [09:41:09] ok [09:41:34] haven't figured out what to put for the password, I know the first one, not sure about the second [09:43:00] you don't need the second- that is the one on top, but hashed [09:43:19] so you can use IDENTIFIED BY 'the password' [09:43:40] or IDENTIFIED BY PASSWORD '*hash of the password' [09:43:51] (at least on mariadb) [09:44:00] you don't need to do it twice [09:44:29] also note that db1139 is also missing the webrequest user [09:45:26] wikiuser is another beast I'm planning to tackle later [09:46:32] so I should keep it empty and generate backups without the webrequest user? [09:47:40] I suggest checking a production host and see how it looks (or puppet, which is supposed to keep the source of truth) so it would reduce my work in the future [09:47:52] but if not, it's fine. I will handle it when the time comes [09:47:56] I would add it, and once the time comes to clean it up, db1139:3311 will be fixed too [09:48:03] +1 [09:48:24] so I'm doing db1140 now [09:50:46] okay, looks good now ^^ [09:55:42] check also if you like how db1139 ended up :-) [10:00:17] looks good [10:04:04] I can fix the others copying from db1140 ? [10:06:04] sure [10:06:21] I'm doing the s8 one which is a bit complicated [10:06:31] depooled the host now, waiting for traffic to drain [10:06:41] as in s8-backup source, or mw-s8?
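jynus's two GRANT forms are equivalent because IDENTIFIED BY PASSWORD takes the already-hashed value of the plaintext that IDENTIFIED BY takes. Assuming the account uses mysql_native_password (the classic double-SHA1 scheme), the hash can be computed like this (helper name is mine):

```python
import hashlib

def mysql_native_password_hash(password: str) -> str:
    """Compute the value IDENTIFIED BY PASSWORD '...' expects under
    mysql_native_password: '*' + uppercase hex of SHA1(SHA1(password))."""
    digest = hashlib.sha1(hashlib.sha1(password.encode("utf-8")).digest()).digest()
    return "*" + digest.hex().upper()

# IDENTIFIED BY 'password' and IDENTIFIED BY PASSWORD with this value
# set the same credential:
print(mysql_native_password_hash("password"))
# *2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19
```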
[10:06:48] mw-s8 [10:06:51] ah! [10:07:00] I can fix backup sources for you [10:07:08] Thanks [10:07:12] that way I can do the reboots I wanted to do at the same time [10:07:27] and maybe later you can run the script to make sure they are ok? [10:09:37] sure [10:10:05] I don't know if I excluded source backups or not, very likely not, otherwise they wouldn't show up in the report [10:10:10] I'll check [10:14:49] I repooled db1177 in one go since I didn't alter tables or so [10:24:11] db1111 is also done now [10:25:39] good! [10:26:52] I was checking I had changed the grants well- at the moment, DROP is kept, right for the admin user? [10:27:25] based on https://www.irccloud.com/pastebin/LFJE62QR/ and db1140:s1 [10:27:33] jynus: yes, Amir1 is unifying them first [10:27:44] yeah, I want to make everything the same and then revoking [10:27:46] that's ok, I was just making sure I was applying them [10:27:49] jynus: once that is done, we'll DROP the DROP grant [10:27:54] it is indeed quite confusing [10:27:56] :D [10:29:00] so I will be just comparing with: diff <(pt-show-grants h=db1140.eqiad.wmnet,P=3311 | grep wikiadmin) <(pt-show-grants h=db1140.eqiad.wmnet,P=3316 | grep wikiadmin) [10:34:42] python style question. Suppose I have `hosts = set([('127.0.0.1', 6220), ('127.0.0.2', 6230), ('127.0.0.1', 6221)])` (tuples are host,port), and I want a result which contains each host only once and don't care about port. I can do this thus `filtered = set(dict(hosts).items())`, but is that too "golf", and I should do something more lengthy by hand? [10:35:38] Emperor: i'm struggling to follow the code even with your explanation, which is probably an answer in itself :) [10:35:42] http://paste.debian.net/1220890/ <-- i.e. is this bad? [10:36:20] you can construct a python dictionary by passing it a set of key,value pairs [10:36:22] [x[0] for x in your_array] ?
[10:36:36] Emperor: oh god [10:36:40] 🙅‍♀️ [10:36:44] jynus: that doesn't work because I need a set of host,ip tuples at the end [10:37:13] (and because a dict is a 1->1 mapping, that deduplicates for you) [10:37:18] sorry, I didn't understand that part, where is the host? [10:39:24] each entry in the set is a (ip, port) pair. I want my final set to contain each ip only once, but I don't care which (ip, port) tuple that ends up being for each ip [10:39:32] I see, you just want to filter [10:39:39] I saw it with the link [10:39:54] it was confusing when you said ip and host separately [10:40:09] sorry [10:41:05] my question is which port do you choose- you say you don't care- but then why keep it? [10:41:20] that's a good question :) [10:41:32] that's nondeterministic programming! [10:41:53] like, if it is because api reasons, put a null or something? [10:42:47] this is a swift thing; some hosts have more than one valid (ip,port) pair, and I only want to talk to each host once; but I do want to talk to each on a (ip,port) pair that exists [10:44:38] http://paste.debian.net/1220892/ is the longhand alternative [10:45:20] Emperor: +1 for the longhand version [10:46:10] for the general question, what I would do is use a set, but overriding the equality condition, but I haven't checked if that is small in python [10:46:27] I think that would be the "java way" at least :-) [10:48:15] I don't think a custom type is the way to go here :) [10:48:50] hah, I asked this q in a different IRC and they all prefer the shorthand version :) [10:49:11] it's good to know that that channel is all Wrong [10:49:19] people on a different IRC may not have to READ your code :-) [10:49:33] lol.
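The two pastes above are links only, but the shorthand and longhand approaches being compared can be sketched roughly like this (not the actual paste contents):

```python
# Deduplicate (ip, port) pairs so each ip appears once, keeping some
# valid port for it; which port survives depends on set iteration
# order (the "nondeterministic programming" point above).
hosts = {('127.0.0.1', 6220), ('127.0.0.2', 6230), ('127.0.0.1', 6221)}

# Shorthand: dict() consumes (key, value) pairs, keeping one value
# per key, so round-tripping through a dict deduplicates by ip.
filtered = set(dict(hosts).items())

# Longhand: build the ip -> port mapping explicitly, then reassemble.
by_ip = {}
for ip, port in hosts:
    by_ip[ip] = port
filtered_longhand = set(by_ip.items())

assert filtered == filtered_longhand
assert {ip for ip, _ in filtered} == {'127.0.0.1', '127.0.0.2'}
```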
I guess I can ask swift upstream which they'd rather [10:53:04] Amir1, for s7 I will just leave it as is- same grant for centralauth [10:55:45] yeah [11:07:22] afk for a bit [11:33:17] x1 is also an outlier [11:59:19] I believe I have "fixed" all codfw backup sources (I was also upgrading them, so it took me a while) [11:59:41] I will do eqiad backup sources (and upgrade/reboot them) after lunch [14:30:28] continuing with my reboots + grant checks of eqiad backup sources [14:47:25] nice, I just triggered an uncorrectable memory error on reboot [14:47:53] the nice is not sarcasm, better now than while it was running [14:51:08] "I forgot to download more ram": https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=4&orgId=1&var-server=db1102&var-datasource=thanos&var-cluster=mysql&from=1637927452651&to=1637938252652 [14:51:56] will reduce the buffer pool of instances and file a hw ticket [14:53:09] mmm db1102 sounds old, have you checked if it is still under warranty? [14:53:30] probably not, but the report will be needed anyway [14:54:04] even in the worst case scenario (no replacement), to remove the bad stick [15:00:48] yeah, maybe they have some old ones around [15:00:57] or we can buy one like we did for db1112 [16:26:48] Hm, this swift ring management code may be almost working now [16:28:26] Emperor, https://jynus.com/gif/high_five.gifv [16:30:57] :)