[07:00:21] hello folks
[07:00:54] there are some tmux sessions on wdqs nodes for user zpapier*ski, ok to kill? (puppet is broken atm)
[07:12:47] elukey: go ahead
[07:12:58] ryankemper: ack!
[07:14:48] done
[07:16:31] elukey: thanks!
[08:07:32] o/
[08:47:37] o/
[08:47:42] dcausse: welcome back
[08:57:09] folks wdqs1013 may need a little restart, shows up in icinga
[08:57:22] elukey: looking
[08:57:47] <3
[08:59:16] gehel: I would not be available for our meeting this morning
[08:59:24] ejoseph: ack
[08:59:51] ejoseph: want to reschedule? Or cancel for this week in favour of the Elasticsearch training?
[09:02:42] we've had another incident on WDQS this weekend. This probably raises the priority of T293862
[09:02:43] T293862: Investigate using jvmquake to limit the time a JVM is unusable due to GC overhead - https://phabricator.wikimedia.org/T293862
[09:04:25] gehel: do you have a link or is it 2022-03-06_wdqs-categories ?
[09:04:44] it's that one
[09:04:47] https://wikitech.wikimedia.org/wiki/Incident_documentation/2022-03-06_wdqs-categories
[09:07:27] 3 servers were unresponsive for several hours when a fourth one started to behave
[09:07:56] that's clearly something jvmquake is meant to fix
[09:08:36] too many things breaking!
[10:40:28] Lunch
[11:02:02] lunch 2
[12:53:05] ebernhardson: about T280487, I see a local redirect in the nginx config on wcqs-beta-01. Is there a more permanent redirect as well somewhere?
[12:53:05] T280487: Redirect requests from wcqs-beta.wmflabs.org to the final URL for WCQS - https://phabricator.wikimedia.org/T280487
[12:54:17] I moved that task back to in progress
[12:55:28] ryankemper / ebernhardson: is there anything more to do on T293462 ? It looks all good to me, but it is still in "needs review"
[12:55:29] T293462: Add user blocking in WCQS - https://phabricator.wikimedia.org/T293462
[12:56:26] mpham: we have stats from wcqs-beta in https://drive.google.com/drive/u/1/folders/1ojrcehL7Bz0Cc4wKgdtD8CccruNK8lyh, can we close T299062 ?
[12:56:27] T299062: Save stats from wcqs-beta - https://phabricator.wikimedia.org/T299062
[12:57:06] dcausse: should we close T302396 in favor of moving to S3 ?
[12:57:07] T302396: Investigate EOFException when performing the first checkpoint after restoring from a savepoint - https://phabricator.wikimedia.org/T302396
[13:01:45] inflatador: any news on T297907 ?
[13:01:46] T297907: SRE Onboarding - Brian King, Search Platform team - https://phabricator.wikimedia.org/T297907
[13:15:59] gehel: for T302396 I think we should keep it in waiting while we make sure this is solved by s3
[13:16:00] T302396: Investigate EOFException when performing the first checkpoint after restoring from a savepoint - https://phabricator.wikimedia.org/T302396
[14:01:01] Greetings
[14:09:37] inflatador: o/
[14:10:02] o/
[14:48:46] dcausse: could you have a quick look at T302779 ? Looks all good to delete from me, but you might have something you want to salvage
[14:48:46] T302779: Check home/HDFS leftovers of zpapierski - https://phabricator.wikimedia.org/T302779
[14:49:02] sure looking
[14:49:18] thanks!
[14:51:02] gehel: I don't see anything to salvage there
[14:51:14] thanks!
[14:51:31] I'll comment on the task
[15:56:57] \o
[15:58:48] gehel: for T299062 I would ideally like to pull stats one more time this month (after beta 1 has been shut down) to double-check usage of beta 2
[15:58:48] T299062: Save stats from wcqs-beta - https://phabricator.wikimedia.org/T299062
[16:04:34] https://wikitech.wikimedia.org/wiki/Incident_documentation/2022-03-06_wdqs-categories
[17:22:22] dinner
[18:00:13] airflow failure looks to be an infra issue being worked out, likely needs to be retried later in the day
[18:20:43] razzi ryankemper just got back in, I'm in https://meet.google.com/qjx-rjvd-gva if y'all wanna chat
[18:21:32] inflatador: try reloading? We're in that room
[18:21:46] inflatador: join ours :) meet.google.com/qjx-rjvd-gva
[18:51:49] * ebernhardson realizes that while we added @expect_failure tag to some integration tests, we never told cindy to skip those tests. Fixing
[19:27:27] lunch, back in ~45
[20:09:54] * razzi is back from lunch
[20:15:58] back
[20:19:46] razzi cool, I'm up at https://meet.google.com/kpn-hvtr-qkt if/when you want to join
[20:20:13] Cool I'll join there in a minute inflatador
[21:44:42] ryankemper we're at https://meet.google.com/sey-fwpx-fmo if/when you wanna join. Looking at spicerack stuff
[21:45:10] inflatador: cool, finishing up lunch so will hop on in 5 mins
[21:55:17] network is flapping, sorry razzi
[23:55:53] * ebernhardson kinda wishes more languages had the ??? sugar from scala, kinda convenient