[08:56:32] ejoseph: want to reconnect on concurrency?
[09:01:02] Can you give me 20 mins
[09:01:20] have as much as you want - let me know when you're ready
[09:12:52] I am now
[09:15:19] https://meet.google.com/sbu-vvvu-iaf
[09:54:11] we had an incident this week-end
[09:54:34] on the updater running k8s, it was not captured by the lag metric as seen by the updater
[09:56:20] it was down for 7hours and caused edits to be throttled
[09:58:05] my bad the SLO graph is using the right metric and has properly degraded the SLO after the incident
[10:04:54] what happened?
[10:05:18] k8s related, filing a task to investigate
[10:05:45] is wdqs12 down?
[10:06:05] probably
[10:06:19] I mean the jvm is likely stuck
[10:06:54] I'm restarting blazegraph there
[10:08:35] in any case, lag caught up
[10:08:40] which is good I guess
[10:09:44] but we're below 99% now :(
[10:12:34] impact was real tho, 7hours of edits being throttled is bad
[10:13:22] true
[11:05:57] launch/errand
[11:14:20] lunch!
[11:18:38] longer errand
[13:48:18] ejoseph, ryankemper: I've got confirmation from Brian that he is enrolled in the ES training. Could you confirm that you're all set as well?
[13:50:28] I have a meeting with janet today to register
[14:02:06] Greetings!
[14:18:34] mpham: I think I mentioned the 6 weeks checkin to you already: https://docs.google.com/document/d/18Y5cAdN4HfKJktzTmVb1EWaFhsOj8F9UA0UfJRWi3wY/edit (doc already up to date - no action required)
[14:19:33] mpham: there is also a feedback document, due by Feb 18: https://docs.google.com/document/d/1TNzmTBJAlLwVMjlKlsAiuevhJ2urMfpXwf3yRI02mgE/edit. I'll go through it on my side, but feel free to also add your notes if you want.
[14:32:50] dcausse sorry to hear about the updater, I do see we were getting alerts for it last wk. I'm adding a flag to my mail so it bugs me a little more, LMK if we need to tune the alerts or if I can do anything else to help
[14:33:31] inflatador: no worries, still trying to understand what gone wrong
[14:50:53] ACK
[14:52:13] Merging https://gerrit.wikimedia.org/r/c/mediawiki/vagrant/+/759227 shortly unless anyone objects
[14:59:27] inflatador: go ahead :)
[15:06:11] Think I may have gotten disconnected, but ^^ is merged
[15:29:08] gehel: yup, i'm enrolled for feb 14 - feb 17 us time
[15:41:56] appt’s over running a bit, might be 10 mins late to triage
[16:00:23] yup, back in 10
[16:02:04] ejoseph: triaging meeting: https://meet.google.com/qho-jyqp-qos
[17:59:40] lunch/errands, back in ~1h
[18:04:49] looks like ejoseph might not be able to attend the ES training this month, it's already fully booked. Janet is working on an alternative date (or maybe someone will cancel)
[18:49:04] dc-ops has a list of servers here that should be decom'd when possible to reduce the total power consumption (as an organization we're a bit over our forecasted power consumption)
[18:49:29] we have a good number of hosts in that list, the elastic hosts will be decom'd as part of the eqiad & codfw elastic refresh hosts that inflatador and I are bringing in this week
[18:49:57] besides the elastic hosts there are a handful of wdqs hosts that need to be decom'd as well; I just made https://phabricator.wikimedia.org/T291982 to track that
[18:50:41] won't get to the wdqs hosts till next week given the other stuff we have in flight (elastic etc)
[19:18:22] oh thought I linked to the spreadsheet, https://docs.google.com/spreadsheets/d/1EYK0x4GCxv1fG77Co-S-uBiPqrvXRiLzIW34kOAJWws/edit#gid=342761434 should have been in my first message above
[19:24:37] ryankemper yeah, I noticed that at the mtg earlier. We can pair at 2 PM your time if you like, have to pick up my son before then
[20:48:31] curious. Removing the ' from shouldn't in a comment in HQL fixes a ParseException :S
[20:50:22] that's...odd
[20:50:29] you think that'd be the kind of thing that'd have gotten patched long ago
[20:50:36] you'd think*
[20:55:08] yea it wasn't exactly my first thought looking for the error :) curiously naively removing the rest of the query and only testing the comment and ' doesn't trigger the error, it's some combination of things
[21:17:05] cable guy is here fixing my connection, I will be flapping for a little bit
[21:27:36] picking up my son, back in ~30
[21:55:49] back
[22:48:15] * ebernhardson wonders how exactly spark describe() functionality decides the mean of the string query col is '5.681818181818182E97'