Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon

(github.com)

217 points | by tatef | 2 days ago

69 comments