Show HN: A Zero-Copy 1.58-bit LLM Engine hitting 117 Tokens/s on a single CPU core

(github.com)

4 points | by dhilipsiva 6 hours ago
