[19:03:42] hmm... @ragesoss any other details you remember or something in particular you wanted to cite from that paper?
[19:14:49] i'm mainly interested in what data and techniques (and code) were used to do the clustering... if i'm remembering right (big if), the paper was partly about interpreting the unsupervised clustering results, and it either was not cross-lingual or it resulted in clusters that were not language-specific. like, i think the biggest cluster was something pretty broad like 'entertainment and culture'. definitely broader topics than the ones in Figure 4 from the WikiPDA paper, and i believe it was a relatively small number of topics for the entirety of Wikipedia, with some figure or table showing the proportional size of coverage.