[11:54:15] can a person grind like a bajilion hours of sleep and not need to sleep for that timeframe
[11:54:22] science!
[17:05:13] did you guys manage to figure out how to not make the reviewer ai hallucinate
[17:05:47] write proper descriptions
[17:05:59] idk when i submitted mine it got one attempt
[17:06:25] ? wdym
[17:08:50] like it didn't need a human reviewer
[17:08:55] if that's what u mean 😭
[17:48:10] You mean the in house one?
[18:02:25] miraheze gpt wrapper
[18:09:56] [1/4] TL;DR of current state, to best of my understanding (with obvious disclaimer that I am not tech):
[18:09:56] [2/4] * The in-house gear was assessed to be insufficient for running a model with consistent outputs, though we can use it for other things.
[18:09:56] [3/4] * Upsizing in-house hardware with current config isn't practically possible, and hardware costs for a different config remain through the moon
[18:09:56] [4/4] * We would like to eventually see the ability to leverage an interface layer that could use different endpoints so we can more readily switch depending on cost/ethics without a total rewrite, but that requires time/expertise/availability of the right people
[18:37:00] weren't there 2 GPUs bought
[18:37:31] I don't know the specs but I wasn't expecting them to be like anything fancy in the GPU department but something like gemma4 would "probably" run on them
[18:38:14] if you've checked out groq (not elon's, they do inference), they have a fairly wide range of models from different providers and in my mind are on the more ethical side of inference providers compared to the likes of google and sam
[18:38:54] although I'm pretty sure the interface layer is built on the openai assistants api which is supposed to be sunset anyway so a rewrite would probably be in order anyway
[18:56:03] Yeah, a few flavors of models were tried, something we'll need to revisit later.
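For readers following the "interface layer that could use different endpoints" idea from the [4/4] message, here is a rough sketch of what such a layer might look like. Nothing in it comes from the actual reviewer code: `ReviewBackend`, `InHouseBackend`, `HostedBackend`, `BACKENDS`, and the provider and model names are all hypothetical placeholders, and the backends only return canned strings instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class ReviewBackend(Protocol):
    """Anything that can turn a submission description into a review string."""

    def review(self, description: str) -> str: ...


@dataclass
class InHouseBackend:
    """Stand-in for a locally hosted model (hypothetical, no real inference)."""

    model_name: str = "local-model"

    def review(self, description: str) -> str:
        return f"[{self.model_name}] reviewed {len(description)} chars"


@dataclass
class HostedBackend:
    """Stand-in for a third-party inference endpoint (hypothetical)."""

    provider: str
    model_name: str

    def review(self, description: str) -> str:
        return f"[{self.provider}/{self.model_name}] reviewed {len(description)} chars"


# Registry mapping a config key to a backend factory. Switching provider
# means changing the key (or adding an entry), not rewriting the callers.
BACKENDS: Dict[str, Callable[[], ReviewBackend]] = {
    "in_house": lambda: InHouseBackend(),
    "hosted": lambda: HostedBackend(provider="example-provider",
                                    model_name="example-model"),
}


def get_backend(name: str) -> ReviewBackend:
    """Look up a backend by config key, failing loudly on unknown names."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None


def review_submission(description: str, backend_name: str) -> str:
    """Caller-facing entry point; it never touches provider-specific code."""
    return get_backend(backend_name).review(description)
```

The point of the sketch is only that callers depend on `review_submission` and a config key, so moving between endpoints for cost or ethics reasons would touch the registry and not the rest of the code.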
[18:59:43] from what I remember meta models are pretty shit
[18:59:52] might just be the fact I don't like meta that influences that decision though
[19:01:20] The biggest issue (to my recollection) is that to get something that both physically fit in the rack and wouldn't break the bank, we ended up sacrificing on both memory and recency of GPU, so it just didn't have adequate onboard capacity to do the thing sufficiently
[19:02:30] what were the specs if you remember?
[19:02:47] No clue, not my expertise
[19:02:54] but also realistically couldn't it just run async as some sort of service that handles it slowly
[19:03:13] it does run the risk that if too many requests come in then it is going to get through them very slowly but