[16:17:43] cteam: testing testing testing!
[16:17:58] pong :)
[16:18:17] Worked!
[20:18:33] bd808: poke, on toolhub is there a way to flag dupe tool entries (and/or get access to delete dupe tools)?
[20:19:23] TheresNoTime: T293518
[20:19:23] T293518: Figure out how to deal with duplicate toolinfo records - https://phabricator.wikimedia.org/T293518
[20:19:37] ah :D
[21:57:42] TheresNoTime: if you have ideas I am more than happy to hear them. I would be especially interested to understand where the duplicates are coming from in the first place.
[21:59:45] well the dupe I noticed was https://toolhub.wikimedia.org/tools/toolforge-bullseye and https://toolhub.wikimedia.org/tools/bullseye, the first of which was imported from `toolsadmin.wikimedia.org` and the second manually created
[22:00:28] most (all?) of them I have seen are like that. one record imported from Striker and one manually added somehow
[22:01:34] the toolforge-bullseye record comes from https://toolsadmin.wikimedia.org/tools/id/bullseye
[22:02:32] the other was manually created by GeneralNotability within Toolhub directly
[22:02:56] you rang?
[22:03:00] I suppose there's a bit of confusion between one being named `toolforge-bullseye` and `bullseye`
[22:03:25] I think I made the manual one before the automation was a thing
[22:04:10] The toolinfo export from Striker predates the invention of Toolhub by ~4 years
[22:05:22] you created it (https://toolhub.wikimedia.org/tools/bullseye/history) 4 days after the automated import (https://toolhub.wikimedia.org/tools/toolforge-bullseye/history), ideally it'd 'see' the conflict there, but given that `toolforge-bullseye` and `bullseye` don't conflict..
[22:05:26] then I have no idea why there are two
[22:05:33] timestamps show that the toolforge-bullseye record is 4 days older than the manual one
[22:05:38] okay
[22:05:42] I suck, then :)
[22:05:47] you can delete the manual one
[22:06:06] (unless the conflict checking could ignore `toolforge-` prefixes when checking?)
[22:06:13] and this is not about picking on you GenNotability, just an example of what's happening in some places
[22:06:35] bd808: oh, no worries, I'm not feeling picked on
[22:06:41] * TheresNoTime is absolutely picking on GenNotability though
[22:06:53] I was going to say...
[22:07:33] TheresNoTime: the whole point of the prefix is actually to avoid name collisions because it turns out that naming things is hard and the toolinfo.json spec was kind of horrible to use a freeform string as the primary key.
[22:07:38] !log admin systemctl restart libvirt-guests.service on cloudvirt1019 to get ceph/rbd working on VMS on this hypervisor
[22:07:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[22:08:16] oof :/
[22:09:00] One thing I think we could do better would be to surface likely collisions at creation time in the Toolhub UI. We could for instance match URLs
[22:09:38] That would not eliminate all duplicates, but it might help prevent a few more from being created
[22:10:34] matching authors + similar names would likely prevent quite a few too..
[22:10:57] how do you define a "similar name"?
[22:11:32] not asking out of snark, looking for ideas :)
[22:15:15] toolhub is Python right? Make defining a similar name someone else's problem - https://github.com/luozhouyang/python-string-similarity#overview :D
[22:20:55] n-gram shingles are probably what would work best at scale for us since we have Elasticsearch too, but I guess I meant something other than string identity comparison. matching `toolforge-bullseye` and `bullseye` would take more than a simple Levenshtein distance (excepting that stripping a `toolforge-` prefix would be an obvious optimization)
[22:21:51] I should run some queries to see what URL matching surfaces. That seems like a mostly cheap check.
[22:22:47] I'd bet the majority of dupes are with `toolforge-` prefixes
[22:23:01] Adding a normalizer for the tools.wmflabs.org -> toolforge.org URL changes would be a reasonable thing to do in URL matching
[22:30:26] can someone fix the ownership /data/project/cvn/bots/CVNBot14/Lists.sqlite ? I can't `take` it because the group got copied when I scp'd it in
[22:32:01] AntiComposite: {{done}}
[22:32:09] thanks
[22:32:24] !log tools.cvn `sudo chown tools.cvn:tools.cvn /data/project/cvn/bots/CVNBot14/Lists.sqlite` per IRC request
[22:32:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cvn/SAL
[22:35:23] TheresNoTime: T324717
[22:35:24] T324717: Suggest possible duplicates at toolinfo creation time via UI - https://phabricator.wikimedia.org/T324717
[22:35:35] patches welcome! ;)
[22:35:38] \o/
[22:51:03] !log clouddbservices remove read_only mode on clouddb1001 (SET GLOBAL read_only = 0;)
[22:51:04] dhinus: Unknown project "clouddbservices"
[22:51:12] !log clouddb-services remove read_only mode on clouddb1001 (SET GLOBAL read_only = 0;)
[22:51:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Clouddb-services/SAL
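
Editor's sketch of the duplicate check discussed above (stripping the `toolforge-` prefix, normalizing tools.wmflabs.org URLs to toolforge.org, and comparing names by character n-grams). This is not Toolhub code. Every function name here is hypothetical, the `name`/`url` record keys are assumed, the legacy URL layout `tools.wmflabs.org/<tool>/...` mapping to `<tool>.toolforge.org/...` is an assumption, and the 0.8 threshold is an arbitrary placeholder.

```python
from urllib.parse import urlsplit

TOOLFORGE_PREFIX = "toolforge-"


def normalize_name(name):
    """Lowercase a tool name and drop a leading 'toolforge-' prefix."""
    name = name.strip().lower()
    if name.startswith(TOOLFORGE_PREFIX):
        name = name[len(TOOLFORGE_PREFIX):]
    return name


def normalize_url(url):
    """Reduce a URL to host + path, rewriting the assumed legacy layout
    tools.wmflabs.org/<tool>/... to <tool>.toolforge.org/..."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path
    if host == "tools.wmflabs.org":
        tool, _, rest = path.lstrip("/").partition("/")
        if tool:
            host = f"{tool.lower()}.toolforge.org"
            path = "/" + rest
    # Scheme and trailing slash are ignored for comparison purposes.
    return host + path.rstrip("/")


def trigrams(text):
    """Set of character trigrams, padded so short names still produce some."""
    padded = f"  {text} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def name_similarity(a, b):
    """Jaccard similarity over character trigrams of the normalized names."""
    ta, tb = trigrams(normalize_name(a)), trigrams(normalize_name(b))
    return len(ta & tb) / len(ta | tb)


def likely_duplicates(a, b, threshold=0.8):
    """Flag two records (dicts with assumed 'name' and 'url' keys)
    when their URLs normalize to the same value or their names are close."""
    if normalize_url(a["url"]) == normalize_url(b["url"]):
        return True
    return name_similarity(a["name"], b["name"]) >= threshold
```

With two made-up records, `likely_duplicates({"name": "toolforge-bullseye", "url": "https://tools.wmflabs.org/bullseye/"}, {"name": "bullseye", "url": "https://bullseye.toolforge.org"})` flags the pair on the URL check alone. At Toolhub's scale the name comparison would more plausibly live in an Elasticsearch shingle analyzer, as suggested at 22:20:55; the in-process Jaccard score here only illustrates the idea.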