[06:43:43] Lift-Wing, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): Migrate articlequality models - https://phabricator.wikimedia.org/T307416 (kevinbazira)
[06:55:42] Lift-Wing, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): Upload articlequality model binaries to storage - https://phabricator.wikimedia.org/T307417 (kevinbazira)
[07:03:30] Lift-Wing, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): Create articlequality inference services - https://phabricator.wikimedia.org/T307418 (kevinbazira)
[07:33:35] hello folks
[07:33:52] I am going to reimage ores2001 again, to see if the same error as on 2002 appears
[08:03:05] good morning!
[08:31:27] (CR) AikoChou: "Hi, the pipeline should work now because the changes in integrations/config https://gerrit.wikimedia.org/r/c/integration/config/+/786934 g" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/785848 (https://phabricator.wikimedia.org/T301766) (owner: AikoChou)
[08:42:56] very interesting, ores2001 now shows the same behavior as 2002
[08:59:51] Well, at least it's now consistently broken
[09:00:12] Though I am puzzled how an extra file showed up during/after the first reimage
[09:11:36] klausman: hi :) so one thing that I remember is that the ores repo on deploy1002 was not updated when I first reimaged 2001
[09:11:50] then I did a git pull and a scap deploy --limit
[09:12:10] but it shouldn't matter that much
[09:12:21] Ah, so the repo was in _some_ state that made it work?
[09:12:44] Curiouser and curiouser
[09:12:50] it is the only thing that was different
[09:14:37] also there is another thing
[09:16:20] IIUC we have, for the ORES wheels repo, mirroring from gerrit to github
[09:16:21] https://github.com/wikimedia/research-ores-wheels/
[09:16:38] that happened the last time that Aaron merged the changes
[09:16:42] but it doesn't know
[09:16:59] and if you see the wheels like https://github.com/wikimedia/research-ores-wheels/blob/master/Flask-1.0.4-py2.py3-none-any.whl
[09:17:08] there is something like version https://git-lfs.github.com/spec/v1
[09:27:36] the other curious bit is that if I do file .* in my ores wheel local repo, all wheels are zips
[09:28:00] but the git-lfs version is higher of course
[09:28:17] deploy1002 is on buster
[09:28:20] I am on bullseye
[09:28:37] But that still should not affect repo contents
[09:30:23] in theory yes
[09:32:28] So I just clone the wheels repo and at HEAD, _some_ wheels are ASCII text, according to file(1)
[09:34:43] And the tree contents is identical, according to md5sum
[09:35:12] E.g. more_itertools-7.2.0-py3-none-any.whl is ASCII text both on ores2002 and my local machine. Is it a ZIP for you?
[09:59:44] yep! (sorry in a meeting)
[10:22:07] klausman: have you checked the master branch or the python37 one?
[10:23:10] but it seems confirmed by https://gerrit.wikimedia.org/g/research/ores/wheels/+/refs/heads/python37
[10:23:21] some are zips and some lfs objects
[10:32:04] I am on the same commit as the checkout on ores2002 (1e9f545)
[10:32:29] Note that since the wheels dir is a submodule, you have to use git status there, not up in the main repo
[10:32:57] sure sure, but if you checkout the repo on your laptop?
[10:33:09] do you see zips or a mix?
[10:33:14] A mix
[10:33:33] I just removed the wheels repo and recloned, I see all zips
[10:34:06] `git clone "https://gerrit.wikimedia.org/r/research/ores/wheels"` is all I did
[10:35:07] ah no ok I have this
[10:35:08] git clone "ssh://elukey@gerrit.wikimedia.org:29418/research/ores/wheels" && scp -p -P 29418 elukey@gerrit.wikimedia.org:hooks/commit-msg "wheels/.git/hooks/"
[10:35:14] that comes from gerrit autogenerated
[10:35:46] so I must have a git hook that does the magic and pulls lfs files
[10:35:59] sec, trying that
[10:36:43] that scp fails for me
[10:36:54] "subsystem request failed on channel 0"
[10:37:51] But that's only a commit-msg hook, so it shouldn't matter here
[10:38:16] But still, both HEAD and 1e9f545 on that repo have a mix of ZIP wheels and ASCII files.
[10:38:45] (and one subdir, nltk)
[10:39:56] so in my git hooks dir I see a lot of git-lfs related commands
[10:40:56] Well, since I can't scp off of gerrit, I don't have those, but they shouldn't matter for what's in the initial clone
[10:41:22] there is a post checkout step, git lfs post-checkout
[10:42:11] klausman: https://gerrit.wikimedia.org/r/admin/repos/research/ores/wheels
[10:42:20] there is also the anonymous fetch etc..
[10:42:23] if you want to test it
[10:42:42] need to go, will keep going after lunch
[10:42:50] (I have downtimed 2001)
[10:43:11] Oh _lovely_
[10:43:29] That page is missing important stuff if you're not logged into gerrit (and of course my credentials expired)
[10:45:12] https://phabricator.wikimedia.org/P27352
[10:45:17] Note how the scp still fails
[10:45:59] ah, -O is needed
[10:46:26] scp somewhat recently changed to using sftp-the-protocol-and-subsystem by default, and the gerrit host does not have/allow that
[10:46:46] Ok, so now all the files are ZIPs indeed
[10:47:11] elukey: I wonder if the update to Buster gave us an scp that has the same problem as my bookworm install (needing -O)
[10:47:52] hmno, the manpage on ores2002 doesn't mention -O
[10:49:50] Still, my best guess is that the hooks are not installed correctly when things are deployed on these hosts. God knows if setting up from scratch would work on the old installs
[11:17:15] (PS3) AikoChou: outlink: handle http bad request [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/785112 (https://phabricator.wikimedia.org/T306029)
[11:25:37] (CR) AikoChou: outlink: handle http bad request (2 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/785112 (https://phabricator.wikimedia.org/T306029) (owner: AikoChou)
[11:38:23] It might also be that the initial clone/deploy was done before git-lfs was installed?
[11:38:40] Because that's exactly what I get if I uninstall git-lfs and do the clone
[12:01:49] Good morning!
[12:04:29] o/ morning
[12:45:55] heya both of you :)
[13:32:49] o/
[13:32:56] klausman: ah nice so you can see the zips as well
[13:34:26] ok so on deploy1002 there may be an issue then
[13:37:50] Lift-Wing, Machine-Learning-Team: Support (or not) the ORES batch scoring in Lift Wing - https://phabricator.wikimedia.org/T306986 (achou) Based on discussions at the team meeting last week, there are a few things need to be investigated before we decide to support the batch scoring feature on Lift Wing:...
[13:41:45] klausman: ok so in the ores-deploy repo, I can see the same set up as on deploy1002 for the wheels submodule
[13:41:51] mixed ASCII/ZIP
[13:42:10] what we do is just git submodule update --init
[13:45:10] does deploying from that then fix 2001/2?
[13:49:01] (CR) Kevin Bazira: [V: +2 C: +2] "LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/785848 (https://phabricator.wikimedia.org/T301766) (owner: AikoChou)
[13:50:49] Reminder I won't be in the first 30minutes of our team meeting, but might be in the last 30 minutes, depending how the meeting goes.
[14:02:52] kevinbazira: ping team meeting :)
[14:03:10] thanks elukey o/
[14:57:47] taking a little break
[16:28:23] klausman: downtimed the ores200[12] nodes for a week :)
[16:28:35] going afk now, have a nice rest of the day folks!
[17:26:29] bye Luca :)
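
The "mixed ASCII/ZIP" check discussed in this log (done by hand with file(1)) could be scripted roughly as below. This is only a sketch under stated assumptions: `classify` and `scan` are hypothetical helper names, not part of any ORES or scap tooling, and the check relies only on two well-known prefixes, the ZIP local-file-header magic `PK\x03\x04` and the first line of a git-lfs pointer file (`version https://git-lfs.github.com/spec/v1`, as quoted above).

```python
from pathlib import Path

# First bytes of a git-lfs pointer file (the text seen on the GitHub mirror).
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"
# Local file header signature at the start of a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def classify(path):
    """Return 'zip', 'lfs-pointer', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_MAGIC))
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(LFS_MAGIC):
        return "lfs-pointer"
    return "unknown"


def scan(directory, pattern="*.whl"):
    """Map each matching file name in `directory` to its classification."""
    return {p.name: classify(p) for p in sorted(Path(directory).glob(pattern))}


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the path of the wheels submodule checkout.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, kind in scan(target).items():
        print(f"{kind:12} {name}")
```

Any wheel reported as `lfs-pointer` would be one that was checked out without git-lfs smudging it, which matches the guess in the log that the clone happened before git-lfs (or its hooks) was in place.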