Is One Layer Enough? A Single Transformer Layer Matches Full-Parameter RL Train

(arxiv.org)

36 points | by tcp_handshaker | 2 hours ago

9 comments