Pollux – a natively vector quantized LLM with 0.76 bits per parameter

(github.com)

1 point | by pollux_llm 5 hours ago

1 comment