Does this really work? Seems likely to be more hype.
Codex makes all kinds of terrible blunders that it presents as "correct". What's to stop it from just doing that in the loop? The LLM is still driving, same as when a human is in the loop.
Ralph is good at letting things run to completion, but it's far from "making commercial software for $10".
Even the initial coding requires you to actually define what the output is.
If somebody can make a cleanroom agent that can explore and document specifications for commercial software, you could maybe throw Ralph at building it. But then you still have to work out the parts that don't have documentation or training details, like how you are going to maintain it.
The loop is a near-perfect fit for something like "my dependency updated; decide whether to update to match, and then execute".
It'll do a mediocre job and then keep trying until it gets something working, at probably the most expensive token cost possible.
The Ralph Wiggum Plugin, so you don't need the bash while loop:
https://github.com/anthropics/claude-code/blob/main/plugins/...
The README.md has prompt examples.
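For context, the "bash while loop" being replaced is just an agent re-invoked endlessly with the same prompt. A minimal sketch, assuming a `claude` CLI with a `-p` (non-interactive print) flag; the `PROMPT.md` filename is illustrative:

```shell
#!/usr/bin/env bash
# Ralph loop sketch: feed the same prompt to the agent forever.
# Each invocation starts a fresh context; progress persists only
# through whatever the agent writes to the working tree.
while :; do
  claude -p "$(cat PROMPT.md)"
done
```

The plugin packages this pattern so you get the re-invocation behavior without managing the shell loop yourself.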
Does this clear the context after each iteration?
A thousand digital monkeys and a thousand terminals...
Highly relevant Simpsons clip: https://www.youtube.com/watch?v=no_elVGGgW8
While being an insightful satire of mass training LLMs with (negative) reinforcement learning, it's actually from the 1993 episode "Last Exit to Springfield", thought by many (including me) to be the single greatest Simpsons episode of all time (https://www.reddit.com/r/Simpsons/comments/1f813ki/last_exit...).
I can see why Anthropic would like this idea...