Dispersion loss counteracts embedding condensation in small language models

(chenliu-1996.github.io)

18 points | by E-Reverance 2 hours ago

4 comments