DeepSWE: A contamination-free benchmark for long-horizon coding agents

(deepswe.datacurve.ai)

29 points | by ammar_x 6 hours ago

9 comments