[14:53:05] claime, effie: thoughts on deploying the config change now? The serviceops window conflicts with dinner, and the mediawiki backport window is past brain-shutoff time... I'd really like to get this out this week.
[14:53:54] duesen: go
[14:54:08] ty
[14:54:19] this will be my first time using spiderpig!
[14:55:26] claime: oh, can you give a "this won't break the servers" +1? https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1270765
[14:55:30] a momentous event!
[14:56:32] duesen: +1'd
[14:56:42] ty
[14:57:02] now if it breaks the servers, you've got someone to blame :D
[14:57:06] and spiderpig is awesome
[14:58:41] Yea... doing deployments via the browser still feels wrong...
[14:59:11] If I vanish, what happens to my deployment? Will someone else who logs into spiderpig be able to see it and interact with it?
[14:59:22] duesen: yes
[15:11:52] ok, done
[15:12:27] thank you for using spiderpig. how would you rate your experience?
[15:12:28] (sorry. :P )
[15:12:46] On a scale of spider to pig, how scap'd out are you?
[15:14:17] :D
[15:43:02] matthieulec: https://phabricator.wikimedia.org/T423619
[15:52:18] ihurbain: "how likely are you, on a scale from one to five, to recommend spiderpig to your colleagues?"
[15:53:22] spiderpig may be one of the few things in existence to which I'd answer 5 rather than a sigh and an eyeroll
[15:54:28] (mostly because it may be one of the few things in existence for which this question actually may make sense :D )
[15:56:23] jynus: thanks, let's follow up on the task
[15:56:45] matthieulec: just did
[15:58:08] wow, that was fast
[17:39:43] duesen: congrats on using spiderpig for the first time. Your concern about getting detached from the browser session is totally reasonable, and also one of the things that sucked about the CLI that spiderpig just takes care of. We all see the same job queue and can answer prompts or cancel without needing prior preparation.
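The property duesen asks about above (a deployment surviving its operator "vanishing" because every operator sees the same job queue and can answer prompts or cancel) comes from keeping job state server-side rather than in any one terminal or browser session. A minimal Python sketch of that idea only; the class and method names are illustrative and not spiderpig's actual implementation:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DeployJob:
    """Server-side deployment job: state lives here, not in any
    operator's browser, so anyone logged in can pick it up."""
    job_id: int
    change: str
    status: str = "running"
    pending_prompt: Optional[str] = None
    answers: list = field(default_factory=list)

class JobQueue:
    """Shared queue: all operators see the same jobs and any of
    them may answer a pending prompt or cancel a job."""
    def __init__(self):
        self.jobs = {}

    def start(self, job_id: int, change: str) -> DeployJob:
        job = DeployJob(job_id, change)
        self.jobs[job_id] = job
        return job

    def ask(self, job_id: int, prompt: str) -> None:
        # The deploy tooling parks a question on the job itself.
        self.jobs[job_id].pending_prompt = prompt

    def answer(self, job_id: int, operator: str, reply: str) -> None:
        # Any operator can respond -- no prior attachment needed.
        job = self.jobs[job_id]
        job.answers.append((operator, job.pending_prompt, reply))
        job.pending_prompt = None

    def cancel(self, job_id: int, operator: str) -> None:
        self.jobs[job_id].status = f"cancelled by {operator}"

q = JobQueue()
q.start(1, "operations/mediawiki-config:1270765")
q.ask(1, "Proceed past canaries?")
# The original operator vanishes; a colleague answers from a
# completely different session, and the job carries on.
q.answer(1, "claime", "yes")
```

Contrast this with a CLI deployment tied to one SSH session, where a dropped connection strands the prompt with the disconnected user.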
[18:56:04] swfrench-wmf: re https://phabricator.wikimedia.org/T364245#11816724, just out of curiosity if nothing else, would it've been possible for the same root cause investigated in that task to occur outside of a deployment, if e.g. a pod was terminating for another reason? (apologies if there's an obvious answer to this question :) )
[19:04:42] A_smart_kitten: no apologies necessary - that's a solid question! in retrospect, my response might have been a bit too high-level on this point (specifically the "or other disruption that could [...]" part).
[19:04:42] there are indeed reasons a pod could be terminated outside of a deployment, even if deployments are by far the most common cause (e.g., if the k8s worker node is being drained).
[19:04:42] in this case, what I did (but didn't mention in the comment) was check the k8s event logs to confirm the absence of those kinds of operations during that time window (I actually ended up doing that before checking the SAL and scap logs).
[19:08:25] ahh, right. thanks swfrench-wmf :) so I guess the good news is that there may be one less root cause of these sorts of DeferredUpdates being terminated, but the bad news (pending a re-run of the analysis) is that there may be _another_ cause of them being terminated? (would that be a fair assessment with the current information?)
[19:16:35] exactly, yes - I'm hopeful that by repeating the analysis following the improvements we made (particularly the graceful drain, which should reduce the number of requests that are interrupted on termination), we'll get a better sense of how frequently this happens from "other sources" (e.g., if I understand correctly, anything that might thwart post-send deferred updates)
[19:25:24] ack, thanks for the explanations! they are appreciated :)
[19:26:05] no problem, and thank you for asking :)
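The event-log check swfrench-wmf describes above (confirming the absence of drains or evictions during the incident window) boils down to filtering cluster events by reason and timestamp. A self-contained Python sketch of that filtering step; the sample records, field names, and the set of "disruption" reasons are illustrative assumptions, not the exact output of any particular tooling:

```python
from datetime import datetime, timezone

# Illustrative event records, shaped loosely like k8s event-log entries.
EVENTS = [
    {"reason": "Scheduled",    "ts": "2025-04-01T12:00:00Z"},
    {"reason": "NodeNotReady", "ts": "2025-04-01T12:05:00Z"},
    {"reason": "Evicted",      "ts": "2025-04-01T18:30:00Z"},
]

# Reasons suggesting a pod was terminated outside a deployment
# (node drain, eviction, etc.). The exact set is an assumption.
DISRUPTION_REASONS = {"Evicted", "Drain", "NodeNotReady", "Preempted"}

def parse_ts(ts: str) -> datetime:
    """Parse an RFC 3339 'Z'-suffixed timestamp into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def disruptions_in_window(events, start: datetime, end: datetime):
    """Return events whose reason suggests a non-deployment pod
    termination, restricted to the [start, end] time window."""
    return [
        e for e in events
        if e["reason"] in DISRUPTION_REASONS
        and start <= parse_ts(e["ts"]) <= end
    ]

window_start = datetime(2025, 4, 1, 11, 0, tzinfo=timezone.utc)
window_end = datetime(2025, 4, 1, 13, 0, tzinfo=timezone.utc)
hits = disruptions_in_window(EVENTS, window_start, window_end)

# A non-empty result flags a possible non-deployment termination in the
# window; an empty one would support "no drains/evictions at that time".
print([e["reason"] for e in hits])  # → ['NodeNotReady']
```

Note that the eviction at 18:30 falls outside the window and is correctly excluded, which is exactly why the time bounds matter when ruling causes in or out.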