I made a kernel 2.2x faster. It made my training loop 3x slower

(kyrieblunders.bearblog.dev)

15 points | by vishal-padia | 2 days ago

1 comment