[08:13:42] * gehel needs to be in Neuchâtel this morning.
[08:13:59] ejoseph: I might be a few minutes late for our 1:1
[08:14:27] Alright
[08:14:32] Good morning
[08:14:38] o/
[08:14:43] And good morning!
[09:00:27] ejoseph: and I'm on time!
[13:01:47] errand: i need to get to the Travel Agent's office
[13:10:39] Happy Searchin' Monday
[13:34:16] o/
[13:46:49] dcausse any updates on the jvmquake package build? I think ryan-kemper was supposed to talk to moritz-m
[13:49:35] inflatador: I think we can move forward
[13:50:14] moritzm: when you get a change would mind double-checking the last bits we change on the debian rules in https://gitlab.wikimedia.org/repos/search-platform/jvmquake/-/merge_requests/2/diffs to disable tests?
[13:50:22] s/change/chance
[13:50:51] Thanks to both of you, let me know if/when the packages are available and I can help deploy
[13:53:12] inflatador: the next bit (when the package is deployed) is https://gerrit.wikimedia.org/r/c/operations/puppet/+/770978 (for which I ran a PCC visible at https://puppet-compiler.wmflabs.org/pcc-worker1001/1257/)
[13:56:01] cool, added myself as reviewer
[14:14:41] dcausse: sure, in a meeting, but will have a look later
[14:15:00] thanks!
[14:15:20] o/ do you all have a list anywhere of Wikipedia languages that do not follow the pattern of whitespace being an indicator of word separation. That is, languages that don't use whitespace or where whitespace e.g., appears between syllables and not words? context: i'm working on describing Wikipedia edit diffs in a nice structured way and one part of that is indicating how much text has been changed. for whitespace-delimited languages like English, splitting on spaces gives us useful word counts. But for languages like Japanese, we switch to counting up the number of characters changed because determining word counts is very tricky and probably beyond the scope of the library for now.
we have the start of a list of languages where we should count characters instead of words here ( https://github.com/geohci/edit-types/blob/main/mwedittypes/constants.py#L12 ) but I'm pretty sure it's incomplete
[14:16:11] Trey314159: this question might be for you ^
[14:17:42] isaacj: quickly looking through Trey's notes I see https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Spaceless_Writing_Systems_and_Wiki-Projects
[14:19:03] dcausse: thank you!! that looks like a fantastic start. the internet had a lot of conflicting info (i think because some languages use both approaches sometimes) so i figured i'd need some Trey help on this one
[14:55:30] Sorry, @isaacj, I got a late start today. @dcausse, thanks for pointing him in the right direction! Isaac, that list isn't quite what you need because it doesn't include Vietnamese, which has spaces between syllables; many words are one syllable, but some are more than one. There are Vietnamese tokenizers that will come up with multi-syllable tokens, for example. I'm not sure if there are other languages like that, though none that I know of off the top of my head. If you want to talk more, let me know.
[14:56:48] no worries Trey314159 -- that's useful context. this list is already a great start but maybe i'll drop in on the next office hours to talk through it more. the tool is under development so we don't need a perfect solution yet, just one that i felt pretty good about for the moment :)
[14:57:29] isaacj: it's a good topic for the language nerd meetup, too!
[14:57:38] Or the linguist-guild slack channel
[14:58:19] :thumbs up: (ahh i was wondering - i looked for "search" on the channel browser but didn't think to check "lingu*")
[15:02:15] ejoseph: triaging meeting is starting: https://meet.google.com/eki-rafx-cxi
[15:03:09] I just got back from the Visa agency
[15:46:14] is it possible to have gmail always use a font i choose instead of the one the incoming mails use?
There's one regular email in particular that is always in a tiny font
[16:27:10] dcausse: not 100% sure whether that's correct Make syntax
[16:27:15] echo "Tests disabled (they are not portable)"
[16:27:16] would work in case
[16:27:30] but if it builds, it's probably fine as well
[17:05:44] moritzm: thanks for looking, I'm fine cleaning this up, but assuming it builds would it be OK to deploy this to our apt repo?
[18:01:47] lunch, back in ~40
[18:25:26] back
[18:37:06] dcausse: sure thing, just go ahead and merge and I'll build/upload to apt.wikimedia.org tomorrow
[19:43:56] ryankemper gehel (or anyone else) I'm adding LibUp monitoring for jvmquake ( https://phabricator.wikimedia.org/T303957 ) , anyone have strong feelings about which phab board we should get a ticket when it updates? Was thinking WDQS myself
[19:59:21] inflatador: gehel: good question, wdqs makes sense to me for now and we could reeval if we ever roll it out to ES
[19:59:35] * ebernhardson is surprised to see cloudelastic reindexing faster than eqiad and codfw, 4.3M docs completed vs 3.1M
[20:00:56] inflatador: any phab board is fine, the important thing is that we see it and triage it
[20:03:02] ACK, will get the PR out soon
[20:45:24] patch is up https://gerrit.wikimedia.org/r/c/labs/libraryupgrader/config/+/772499 but want to hear back from lego-ktm before merging
[20:50:09] quick break, back in ~10
[20:56:38] inflatador: CI is failing
[20:56:48] I think id is supposed to be an int
[20:57:50] RhinosF1 ah, good catch. Will fix shortly
[20:58:18] inflatador: I can drop a new PS on top if you're busy
[21:00:17] RhinosF1 my tox is still failing locally
[21:00:25] Hmm
[21:01:49] inflatador: do you still get 2 fails or just test_formatting
[21:02:25] Just test_formatting
[21:04:17] inflatador: check tabs v spaces
[21:04:49] Looks like it's going for 4 spaces
[21:32:52] Thanks, working on it now
[22:04:48] ebernhardson: you around?
could use some help pondering the proper way to handle replication during elasticsearch upgrades
[22:05:20] ryankemper: ya, sec
[22:05:22] for context, it looks like a replica shard cannot be scheduled to a host on a lower ES version than where the primary shard is scheduled. that means that relforge is permanently stuck in yellow during the upgrade
[22:05:42] ryankemper: hmm, known problem but we never think of it until the restart is going
[22:05:50] :P
[22:06:04] when we upgrade production we won't have the constraint of only having two hosts, but I fear that the row awareness might make us end up in a similar place
[22:06:12] ebernhardson: https://meet.google.com/iqe-kwqc-txq
[22:31:37] Some context for backlog: https://phabricator.wikimedia.org/T301955#7794845
[23:27:19] always fun, the elasticsearch-oss:6.5.4 image can't securely fetch the plugins from apt.wikimedia.org, it has an expired cert in the chain for lets encrypt
[23:30:22] well, wget can't. but curl can. I'm often surprised that not everything uses the system certs (but should know better by now...)