[07:17:01] Morning o/
[08:24:09] Machine-Learning-Team, Patch-For-Review: Investigate the inconsistent load test results (locust) for revertrisk - https://phabricator.wikimedia.org/T361881#9798547 (isarantopoulos) Open→Resolved
[09:20:29] Machine-Learning-Team, Structured-Data-Backlog (Current Work): [SPIKE] Send an image thumbnail to the logo detection service within Upload Wizard - https://phabricator.wikimedia.org/T364551#9798701 (mfossati)
[09:36:52] Morning!
[09:41:25] guten tag o/
[09:48:57] \o
[09:50:00] isaranto: o/ https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/1025805 hasn't been merged yet
[09:56:16] Thanks for pointing that out!
[10:05:11] (PS6) Ilias Sarantopoulos: revertrisk: update locust results [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1025805 (https://phabricator.wikimedia.org/T361881)
[10:08:12] (CR) Ilias Sarantopoulos: [V:+2 C:+2] revertrisk: update locust results [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1025805 (https://phabricator.wikimedia.org/T361881) (owner: Ilias Sarantopoulos)
[10:27:05] hello folks!
Opened https://github.com/ROCm/k8s-device-plugin/issues/65
[10:28:04] o/ Grazie signore :D
[10:29:29] we should say thanks to jayme :)
[10:29:44] big gift from serviceops
[10:30:07] now why allowing mknod makes access(F_OK) work is not clear to me
[10:31:56] I wondered if udev/hotplug was involved (accessing the device meaning creating it), but I can't for the life of me think of a scenario in which that would happen
[10:32:55] maybe there is some default docker behavior that causes access to fail
[10:33:39] yeah either in docker or in the cgroup defaults
[10:34:32] going afk for lunch, I suspect that the AMD upstream folks will take a bit to reply :(
[10:34:53] Lift-Wing, Machine-Learning-Team: GPU errors in hf image in ml-staging - https://phabricator.wikimedia.org/T362984#9799040 (elukey) Opened https://github.com/ROCm/k8s-device-plugin/issues/65 Thanks a lot Janis <3
[10:57:51] * isaranto afk lunch!
[11:02:33] * klausman lunch as well
[12:34:53] * elukey back!
[14:00:03] (PS1) Kevin Bazira: logo-detection: process image objects instead of image URLs [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1031590 (https://phabricator.wikimedia.org/T363506)
[14:01:33] (CR) CI reject: [V:-1] logo-detection: process image objects instead of image URLs [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1031590 (https://phabricator.wikimedia.org/T363506) (owner: Kevin Bazira)
[14:05:41] (PS2) Kevin Bazira: logo-detection: process image objects instead of image URLs [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1031590 (https://phabricator.wikimedia.org/T363506)
[14:06:32] (PS3) Kevin Bazira: logo-detection: process image objects instead of image URLs [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1031590 (https://phabricator.wikimedia.org/T363506)
[14:12:54] (CR) Kevin Bazira: "To avoid reinventing the wheel and
build a custom in-memory batch processing solution, I continued leveraging Keras' inbuilt `image_datase" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1031590 (https://phabricator.wikimedia.org/T363506) (owner: Kevin Bazira)
[14:13:42] (CR) Kevin Bazira: "This functionality has been tested locally and here are the results:" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1031590 (https://phabricator.wikimedia.org/T363506) (owner: Kevin Bazira)
[15:30:30] Machine-Learning-Team, Structured-Data-Backlog (Current Work): [SPIKE] Send an image thumbnail to the logo detection service within Upload Wizard - https://phabricator.wikimedia.org/T364551#9800682 (matthiasmullie) a:matthiasmullie
[15:39:22] https://docs.kernel.org/admin-guide/cgroup-v2.html#device-controller
[15:39:25] very interesting
[15:55:34] I think I need a bit of translation!
[15:55:46] is this what happens in our case?
[15:57:47] iiuc it is caused by cgroups v2 which uses mknod (?)
[15:58:14] so IIUC in cgroups v2 (that we use for our containers) a device is attached together with an eBPF program that runs when the device is accessed.. if the perm is not granted, it returns EPERM (what the syscall access(F_OK) returns to us)
[15:58:38] mknod is used to create a device, but my theory is that access(F_OK), for some weird reason, uses mknod
[15:59:04] and since the k8s-device-plugin only allows reading from/writing to devices (not creating new ones), it fails
[16:02:40] tiny nitpick: we get EACCES, not EPERM
[16:03:35] clear! thanks for the explanation
[16:03:37] But yeah, the BPF program may be what causes this weird discrepancy. Now if I understood how to get the currently-running eBPF program of a container....
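(Editor's sketch, not part of the original conversation.) The errno question discussed above can be checked from inside a container with a small probe. This is a minimal sketch under assumptions: the helper names `probe_access` and `probe_open` are hypothetical, it calls the C library's `access()` through `ctypes` so that errno is visible, and it assumes a Linux host where `ctypes.CDLL(None)` resolves libc symbols. It only reports which errno each call yields; it does not explain why the device cgroup program denies the call.

```python
import ctypes
import errno
import os

# Resolve symbols from the running process (libc on Linux); use_errno=True
# lets us read errno after the call via ctypes.get_errno().
_libc = ctypes.CDLL(None, use_errno=True)
_libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
_libc.access.restype = ctypes.c_int


def _errno_name(code):
    """Map a numeric errno to its symbolic name, e.g. 13 -> 'EACCES'."""
    return errno.errorcode.get(code, str(code))


def probe_access(path):
    """Call access(path, F_OK); return 'OK' or the symbolic errno."""
    ctypes.set_errno(0)
    if _libc.access(os.fsencode(path), os.F_OK) == 0:
        return "OK"
    return _errno_name(ctypes.get_errno())


def probe_open(path):
    """Try to open path read/write; return 'OK' or the symbolic errno."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        return _errno_name(exc.errno)
    os.close(fd)
    return "OK"
```

Calling `probe_access(dev)` and `probe_open(dev)` for each GPU device node exposed to the pod (the exact paths are not given in this log) would show whether the two calls disagree, which is the discrepancy being debated here.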
[16:06:16] klausman: nope
[16:06:27] we get EACCES for ioctl, EPERM for access()
[16:06:33] ah, right, my bad
[16:06:57] I found https://dropbear.xyz/2023/05/23/devices-with-cgroup-v2/ that is useful, but the decoded program is not very readable
[16:07:01] at least for me
[16:13:41] yeah, it's basically assembler. Currently digging through my old bookmarks. There was a tool that could make a more readable version out of an eBPF bytecode program
[16:15:52] elukey: try the disasm subcommand of seccomp-tools
[16:18:14] can't find seccomp-tools on debian, also does it work with eBPF?
[16:28:57] no luck for today, going afk folks! Have a nice evening
[16:29:13] enjoy your evening!
[16:35:04] have a nice evening!
[16:35:13] I'm going afk too, cu tomorrow!
[16:42:36] same here \o
[17:35:55] Hi folks, thanks for all the work you do! I’m a researcher who used ORES in the past as part of a research study, and I’m looking for historical records on precision and recall for the goodfaith model in the period from July 2019 through the end of June 2020. There used to be endpoints for querying this information for a given language and a given model. Do you know if they are still archived somewhere?
[19:00:40] Hi natematias ! Thank you for your kind words! Unfortunately some of the functionality is no longer supported.
[19:02:23] However please share one or more example requests if you have them handy and I can look into this tomorrow
[19:10:26] Oh, thanks isaranto! It’s in relation to this study, where we used ORES to evaluate 4 edits from nearly 75 thousand accounts on Polish, German, Arabic, and Persian language Wikipedias, between August 2 2019 and Feb 11 2020.
[19:10:28] https://citizensandtech.org/2020/06/effects-of-saying-thanks-on-wikipedia/
[19:10:48] If there’s information about the reported precision and recall of the goodfaith ORES model during that period, we would like to report it in our paper.
[19:13:12] And if there happens to be a longer table of precision/recall of different models over time, I would find it interesting and could identify the relevant results from the list/table.
[20:17:30] (oh, and for context, if it’s helpful, queries like the following used to provide the precision/recall data, and other model documentation) https://ores.wikimedia.org/v3/scores/enwiki/?model_info&models=damaging
[21:51:19] (PS1) Umherirrender: i18n: Replace mw: interwiki with url to mediawiki.org [extensions/ORES] - https://gerrit.wikimedia.org/r/1032071
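(Editor's sketch, not part of the original conversation.) The `model_info` query quoted at 20:17:30 follows a simple pattern, so the equivalent URL for another wiki or model can be rebuilt as below. The function name `build_model_info_url` is hypothetical, and the sketch only constructs the URL: as noted in the log, the endpoint may no longer be served, so no request is made.

```python
from urllib.parse import quote

# Base of the ORES v3 scores API, as it appears in the quoted example URL.
ORES_SCORES_BASE = "https://ores.wikimedia.org/v3/scores"


def build_model_info_url(wiki, model):
    """Return the model_info query URL for one wiki and one model.

    Mirrors the shape of the example in the log:
    .../v3/scores/<wiki>/?model_info&models=<model>
    """
    return (
        f"{ORES_SCORES_BASE}/{quote(wiki, safe='')}/"
        f"?model_info&models={quote(model, safe='')}"
    )
```

For instance, `build_model_info_url("enwiki", "damaging")` reproduces the URL quoted in the log; passing `"goodfaith"` as the model gives the query for the model the researcher asked about.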