[04:46:35] dcausse: I haven't deployed WDQS `0.3.104` because I wasn't sure if doing so might screw things up given the rollback to a previous checkpoint. I can deploy it tomorrow around the time of the weds meeting if we need to deploy wdqs
[05:13:50] Incident report: https://wikitech.wikimedia.org/wiki/Incident_documentation/2022-02-22_wdqs_updater_codfw
[07:42:22] ryankemper: thanks for the report!
[07:42:55] dcausse: codfw is still depooled, I'm waiting for your confirmation before repooling
[08:03:19] dcausse / ebernhardson: is there anything left to do on T293462? The last patches seem to be about cleanup (which is great!), but I'm not clear if there is more work to be done.
[08:03:19] T293462: Add user blocking in WCQS - https://phabricator.wikimedia.org/T293462
[09:01:55] Physiotherapy with Oscar, back in 1h
[09:07:42] will try to reproduce the problem in yarn so that we can repool codfw soon
[09:27:59] Hello search team, I want to follow up on the codfw incident from yesterday, as there is a patch by addshore reverting changes done to mitigate the temporary loss of codfw. Is codfw back up and running?
[09:29:26] itamarWMDE: yes it's back up but we'd like to keep it down a little to debug something
[09:40:59] itamarWMDE: I'll comment on https://gerrit.wikimedia.org/r/c/operations/puppet/+/764830 once we're OK if you're fine with this
[10:19:46] dcausse: ofc, thank you for the quick response!
[10:53:15] lunch
[11:24:11] lunch
[12:50:13] ejoseph: I might be a few minutes late for our meeting, the plumber is still trying to fix the toilets
[12:50:41] alright
[12:51:52] ejoseph: and he is done! I'll be there on time!
[13:38:49] gehel: about T293462 I think the remaining thing was to double check that the user_text is propagated to the query logs and then a cleanup indeed
[13:38:50] T293462: Add user blocking in WCQS - https://phabricator.wikimedia.org/T293462
[13:42:33] ryankemper: it's perfectly fine to deploy 0.3.104 to wdqs too
[14:03:14] Greetings
[14:04:11] o/
[15:00:00] o/
[15:07:35] o/
[15:16:09] dcausse: can we continue?
[15:16:27] ejoseph: troubleshooting something at the moment
[15:16:43] ejoseph: did you find the log4j.properties file?
[15:16:56] Yh
[15:17:11] It seems correct
[15:17:36] it should say in which file the deprecation warnings are being logged
[15:17:53] And also tried different queries but couldn't trigger the deprecation error
[15:19:54] ejoseph: deprecation warnings, I think, are in another log file (not elasticsearch.log)
[15:45:40] Oh
[15:52:31] can start the flink app in yarn but not in k8s...
[17:23:45] workout, back in ~30
[18:03:13] and back
[19:10:37] lunch, back in ~45
[19:41:38] fwiw, poking at some yarn worker nodes they have a /usr/lib/jvm/java-11-openjdk-amd64 directory
[19:42:10] i dunno how to convince flink in yarn to use it to test if that triggers the load error
[19:50:36] aaand back
[19:59:29] looks like there's -yD yarn.taskmanager.env.JAVA_HOME
[19:59:37] I could try that perhaps
[20:00:30] * ebernhardson uses the ugliest hacks to get that info..something like spark.range(1).map(lambda x: subprocess.check_output('ls -l ...')).collect()
[20:00:47] :)
[20:00:49] seems worth trying perhaps, but try tomorrow :)
[20:32:31] ryankemper: https://meet.google.com/aib-canx-cgh
[21:41:45] does anyone know if the instances in the WMCS "search" project have proper DNS names (and if so, what they are)?
[21:42:07] ..eqiad1.wikimedia.cloud applies to all projects
[21:43:18] excellent! Thanks taavi
[21:51:10] names get longer and longer :P i'm sad on instances where foo.eqiad.wmflabs stops working
[21:51:28] i'll have to find some way to tab complete instance names or something
[21:52:55] yeah, maybe we could scrape Designate or something? taavi may have a better idea
[21:54:44] you could do something like https://wikitech.wikimedia.org/wiki/User:Razzi/ssh_single_letter_domain_shortcut for the projects you use the most
[21:55:08] for tab completion I don't really have any ideas for most projects
[21:55:18] that's a nice trick, fixes the length problem
[21:56:14] yeah, this is actually quite helpful
[22:53:12] * ebernhardson notices that super_detect_noop does a weird thing, where it can noop some updates to the document but let others through, so you end up paying the full update cost but still only update some values
[22:53:51] essentially it noops per-field, instead of deciding if it will update the document then applying all updates
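
Editor's note: the per-field behaviour described in the last two messages can be illustrated with a small Python sketch. This is not the super_detect_noop implementation or its API. The function names (`per_field_update`, `whole_document_update`, `within_percentage`), the detector mapping, and the field names (`incoming_links`, `title`) are all hypothetical, chosen only to contrast "noop each field on its own" with "decide once for the whole document, then apply every update".

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical detector type: returns True when a change is small enough to skip.
Detector = Callable[[Any, Any], bool]


def equals(old: Any, new: Any) -> bool:
    """Default detector: the field is a noop only if the value is unchanged."""
    return old == new


def within_percentage(pct: float) -> Detector:
    """Build a detector that treats numeric changes of at most pct percent as noops."""
    def detector(old: Any, new: Any) -> bool:
        if old is None or new is None:
            return old == new
        if old == 0:
            return new == 0
        return abs(new - old) / abs(old) <= pct / 100.0
    return detector


def per_field_update(
    source: Dict[str, Any], update: Dict[str, Any], detectors: Dict[str, Detector]
) -> Tuple[Dict[str, Any], bool]:
    """Noop each field independently (the behaviour described in the log)."""
    result = dict(source)
    changed = False
    for field, new in update.items():
        is_noop = detectors.get(field, equals)
        if is_noop(source.get(field), new):
            continue  # this field is skipped even if the document is rewritten anyway
        result[field] = new
        changed = True
    return result, changed


def whole_document_update(
    source: Dict[str, Any], update: Dict[str, Any], detectors: Dict[str, Detector]
) -> Tuple[Dict[str, Any], bool]:
    """Decide once whether the document changes, then apply every field."""
    changed = any(
        not detectors.get(field, equals)(source.get(field), new)
        for field, new in update.items()
    )
    if not changed:
        return dict(source), False
    return {**source, **update}, True


source = {"incoming_links": 100, "title": "Foo"}
update = {"incoming_links": 105, "title": "Bar"}
detectors = {"incoming_links": within_percentage(20)}

per_field_doc, per_field_changed = per_field_update(source, update, detectors)
whole_doc, whole_changed = whole_document_update(source, update, detectors)
```

In this example both strategies report a change, so both would pay for a full document write. The per-field version still leaves `incoming_links` at 100, while the whole-document version stores 105 along with the new title.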