[09:11:14] looking at the vectors in relforge for simplewiki, I think they're not normalized, but we use l2 for the vector field. Not sure how this all needs to be configured, but looking at the opensearch pre-trained models, all models that do not normalize their output use either innerproduct or cosinesim
[09:16:16] also not sure what the advantage is of using raw vectors with cosinesim vs normalizing and using l2. I'm probably missing something, but shouldn't it be more performant to normalize at index time and use l2?
[11:17:09] lunch
[11:53:50] dcausse: the cosine similarity measures the angle between two vectors, thus the length of the vectors doesn't play a role and you will get the same result independent of whether you normalize or not. also cosine distance is usually a more meaningful distance measure when the number of dimensions becomes large. This is because of the curse of dimensionality where most vectors will be concentrated within a narrow sphere such that angles are
[11:53:50] better capturing their distance than the Euclidean measure. https://en.wikipedia.org/wiki/Curse_of_dimensionality#Distance_function Thus I would recommend using cosine similarity of raw vectors.
[13:10:27] mgerlach: thanks!
[16:44:07] heading out
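
A small sketch of the point discussed above, in plain Python with made-up 2-d vectors. The vectors, document keys, and helper names are illustrative assumptions, not relforge or OpenSearch code. It shows that ranking by cosine similarity on raw vectors matches ranking by l2 on unit-normalized vectors (for unit vectors, ||u - v||^2 = 2 - 2*cos(u, v)), while l2 on raw, non-normalized vectors can rank differently.

```python
import math


def l2_normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; independent of vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_l2(a, b):
    """Squared Euclidean distance; gives the same ordering as plain l2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical raw (non-normalized) embeddings.
query = [3.0, 4.0]
docs = {
    "a": [6.0, 8.0],   # same direction as the query, twice as long
    "b": [1.0, 0.0],
    "c": [-3.0, 4.0],
    "d": [0.5, 5.0],
}

# 1) Cosine similarity on raw vectors (higher is better).
by_cosine = sorted(docs, key=lambda k: cosine_similarity(query, docs[k]), reverse=True)

# 2) l2 on vectors normalized up front (lower is better).
q_unit = l2_normalize(query)
by_l2_normalized = sorted(docs, key=lambda k: squared_l2(q_unit, l2_normalize(docs[k])))

# 3) l2 on raw vectors: vector length now influences the ranking.
by_l2_raw = sorted(docs, key=lambda k: squared_l2(query, docs[k]))

print(by_cosine)         # ['a', 'd', 'b', 'c']
print(by_l2_normalized)  # ['a', 'd', 'b', 'c']
print(by_l2_raw)         # ['d', 'b', 'a', 'c']
```

In this toy data, document "a" points in exactly the same direction as the query but drops to third place under raw l2 because it is longer, which is the kind of mismatch to expect when non-normalized vectors are indexed in an l2 field.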