[04:23:43] Krinkle: we do have log_slave_updates so the binlog size should be about the same in codfw and eqiad, no matter which dc is active
[04:56:57] Oh, that new phabricator "other assignee" is great!
[05:00:21] what is that then?
[05:00:55] A second person assigned to the task in a backup capacity
[05:00:58] That's what it says
[05:02:25] Krinkle: yesterday purging took:
[05:02:29] * pc1: 13h
[05:02:29] * pc2: 15h
[05:02:29] * pc3: 14h
[05:02:32] pretty good!
[05:08:07] ooohhh that's a grand idea. and I have just the use for it
[06:37:39] !issync
[06:37:39] Syncing #wikimedia-sre (requested by legoktm)
[06:37:41] Set /cs flags #wikimedia-sre litharge +o
[10:30:34] topranks: did you see my comment yesterday?
[11:34:21] _joe_: any reason why db1118 in eqiad is receiving traffic from MW?
[11:34:41] <_joe_> marostegui: from what IP?
[11:34:48] <_joe_> my best guess would be dumps
[11:34:52] <_joe_> they're eqiad-only
[11:35:16] _joe_: I am seeing from mw1331.eqiad.wmnet. and from snapshot1012 but this host isn't in the dumps group
[11:35:29] apergos: ^
[11:35:49] <_joe_> mw1331 is strange indeed that might be me :D
[11:35:56] definitely not me :-D
[11:35:57] XD
[11:36:13] <_joe_> marostegui: are you seeing queries from mw1331 now?
[11:36:19] apergos: why would that host get a connection from snapshot1012.eqiad.wmnet. if it is not in dumps? (it is on s1 btw)
[11:36:21] <_joe_> in that case, that's not me
[11:36:27] um
[11:36:35] was it in the dumps group earlier?
[11:36:40] _joe_: | 4399105107 | wikiuser | 10.64.16.166:43208 | enwiki | Query | 0 | init | SELECT /* MediaWiki\Block\DatabaseBlock::newLoad */ ipb_id,ipb_address,ipb_timestamp,ipb_auto,|
[11:36:45] <_joe_> if it's an active connection, kill it, it's a leftover from my tests this morning
[11:36:51] apergos: that host hasn't been changed for ages
[11:37:47] <_joe_> marostegui: the ip you pasted is for mw1402 though
[11:37:56] yes, that's a new one that just arrived
[11:38:08] I am seeing different IPs now
[11:38:29] <_joe_> I would assume it's monitoring calls causing those connections
[11:38:34] <_joe_> is the db depooled?
[11:38:37] nop
[11:38:46] <_joe_> ok then it's monitoring, if it's s1
[11:38:54] ah ok
[11:39:00] <_joe_> we call the enwiki blank page from pybal
[11:39:05] ah right
[11:39:13] and the snapshot wikiadmin connection?
[11:39:29] snapshot1012 is doing en, unless it's trying to get some revision info that is corrupt and it's been redirected to the primary (deep in mw core) there's no reason for it to request anything
[11:39:38] it's mostly talking to es at this point anyways
[11:39:47] can you give me a query or two?
[11:39:59] apergos: there're no queries at the moment, just wikiadmin connected
[11:40:08] 4399103386 wikiadmin 10.64.0.157:56366 enwiki Sleep 0 NULL 0.000
[11:40:08] 4399103862 wikiadmin 10.64.0.157:56368 enwiki Sleep 0 NULL 0.000
[11:40:21] but why to a non dump host?
[11:40:47] no idea, db1118 isn't a primary is it?
[11:40:54] nope, just a slave
[11:42:06] ugh
[11:44:23] according to the ps that's mwscript invoking fetchtext.php and not the direct invocation of fetchtext itself
[11:44:38] for the one process anyways, lemme check the other one
[11:45:08] same for the second one
[11:46:42] <_joe_> marostegui: need anything else from me? else I'd take a break
[11:46:51] _joe_: no no, go away!
thanks
[11:48:44] it shouldn't even be trying to retrieve metadata from anywhere at this point, just taking what's been written into a file from the first pass stubs
[11:49:26] and given that this is the wrapper calling the script that gets the blob from external store, I guess it's somewhere in an include from MWScript :-/ that will be hard to find and beat
[11:49:50] but what if that host gets rebooted, would that break dumps too? even if it is not in the dumps group?
[11:50:08] no idea, but if you need to reboot, just do it
[11:50:18] the jobs that die will all retry, we're in that phase
[11:50:34] where these are much shorter jobs, even if it costs a few hours it's not a big deal
[11:50:43] no no, no need to touch it, but I am wondering if we have an easter egg with those connections going to some hosts they're not supposed to go to
[11:51:29] well mwscript is the wrapper for all maintenance scripts so if it does some sort of lb setup on its own
[11:51:43] that's going to be a drag to find and fix
[11:52:24] given they are just sleeping connections they are clearly not doing us any good :-P
[11:52:46] hehe yeah I know :)