In general, I'm starting to see these agent scaffolding systems as an anti-pattern: people obsess over systems for guiding agents, construct elaborate Rube Goldberg machines, and then others cargo-cult them wholesale, all in an effort to optimize and control a random process while minimizing human involvement.
The problem is it’s so rarely A/B tested, and definitely not at scale. An engineer who writes all these my-workflow-but-for-agents skills gets the good outcome, sees affirmations that the agent followed the prescribed process, and counts that as a victory. In reality the outcome could’ve been just as good if they had fed Claude a spec plus acceptance criteria, or even a basic prompt for the simpler tasks.
This is similar to how we collectively approached Taylorism, isn't it? But the world favors capitalism, and Taylorism makes for handy scaffolding within it.
This is why I created the /do router, to route to all skills. I also have anti-rationalization, progressive context discovery, etc.
I only make it for me, so it's a bit complex and targeted towards me, and what I do, but it's pretty easy to adjust things.
https://github.com/notque/vexjoy-agent
Reading through Agent Skills, it seems we've converged on a lot of the same points. I'd never seen it before, so I'm trying to get an understanding of it.
The best way to prompt an LLM is to describe the outcome you want, that's it. They are trained as task completers. A clear outcome is way better than a process.
If the LLM fails, either you didn't describe your outcome sufficiently, or it misinterpreted what you said, or it couldn't do it (rare).
Common errors should be encoded as context for future similar tasks; don't bloat skills with stuff that isn't shown to be necessary.
> The best way to prompt an LLM is to describe the outcome you want, that's it. They are trained as task completers. A clear outcome is way better than a process.
This is not true for anything complex. They’re instruction followers, of which task completion is just one facet.
They’re also extremely eager to complete tasks without enough information, and do it wrongly. In the case of just describing task completion, despite your best efforts, there are always some oversights or things you didn’t even realize were underspecified.
So it helps a lot to add some process around it, e.g. “look up relevant project conventions and information. Think through how to complete the task. Ask me clarifying questions to resolve ambiguities. Blah blah.” This type of prompt will also help with the new Opus 4.7 adaptive thinking to ensure it thinks through the task properly.
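The kind of process wrapper described above can be sketched as a small prompt builder. The step wording, function name, and example task here are illustrative assumptions, not part of any library or prescribed template:

```python
# A minimal sketch of wrapping an outcome description in a lightweight
# process, as suggested above. Steps are checked before the task itself.
PROCESS_STEPS = [
    "Look up relevant project conventions and information.",
    "Think through how to complete the task before writing code.",
    "Ask me clarifying questions to resolve any ambiguities.",
]

def wrap_with_process(outcome: str) -> str:
    """Prepend numbered process guidance to a plain outcome description."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(PROCESS_STEPS, 1))
    return f"Before starting, follow these steps:\n{steps}\n\nTask: {outcome}"

print(wrap_with_process("Add pagination to the /users endpoint."))
```

The point is only that the outcome stays the core of the prompt; the process lines are a reusable prefix you don't have to retype.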
If there is anything we have learned in decades of software engineering, it's that "a clear outcome" is not easy to describe. In many cases it's impossible unless people from four different domains collaborate. That's why process matters: it allows software to be built in a "semi-standardized" way that lets iterations get us closer to the expected outcome, which might only emerge over time.
Yes, not everything I use LLMs for is going to have the same level of ambiguity or complex requirements. Optimizing by choosing to skip over parts of the process is exactly what Addy is talking about in this article.
Sometimes people don't know what they want.
I prefer the start small and iterate approach to arrive at a result.
Then I ask it to summarize. Sometimes after that I ask it to generalize.
A skill is just reusable, shareable context. It's just text, really. It's useful for things like documentation on how to use an API (this works better than MCP, in my opinion), or a non-consensus way of doing something. For example, you can use Remotion to generate video. There are useful Remotion skills that let you reliably generate specific types of videos: captions of a certain style, for example.
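To make "it's just text" concrete, here is a minimal sketch of what such a skill file can look like, using the SKILL.md frontmatter convention (a `name` and a `description` that tells the agent when to load it). The API details below are invented purely for illustration:

```markdown
---
name: payments-api
description: How to call our internal payments API. Use when writing code that creates or refunds charges.
---

# Payments API

- Base URL and auth setup live in `docs/payments.md`; read it before calling anything.
- Always pass an idempotency key when creating a charge.
- Refunds must reference the original charge ID.
```

Nothing magic: the description gets the file pulled into context at the right moment, and the body is just the documentation you'd otherwise paste in by hand.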
I agree that many skills are overblown and unnecessary. But there's a lot of value in giving AI the right process. See how much more effective Claude can be for moderate or large changes when using the superpowers skill.
That seems a bit reductive. Even with humans, there’s a range of interpretations and ways that something can be built or a task completed. Engineers remember stuff so you don’t have to keep repeating yourself. Skills are a way to describe your outcome without similar repetition.
From an SEO/LLMO perspective, the discoverability of these skills will be difficult without a rename: https://agentskills.io/
If Addy reads this, how do you pitch this vs. Superpowers? https://github.com/obra/superpowers
Does superpowers actually work? The main skill file doesn't inspire much confidence:
This kind of "overprompting" is one technique that even the best skills/agents use to compensate for under-invocation, which happens when more demure advisory language tends to be rationalized away by LLMs.
It shouldn't be your default, but should absolutely be tried when your skill/agent test suite displays evidence that it's not being reliably invoked without it.
I would love to know how many people are actually using superpowers.
I showed up on the agentic dev scene prior to superpowers, and I am getting concerned that >50% of my self-rolled processes are now covered by superpowers.
I no longer trust gh stars, can anyone chime in? Is superpowers now truly adopted?
If it is truly valuable, why hasn't Boris integrated the concepts yet?
I adopted superpowers, but then adapted it. I've changed some things, added some things. I suspect that my set of agent skills is probably overlapping with OP's by quite a lot now.
I also found that I have different skills for different tasks; at work security is a huge concern and I over-emphasise security in the skills. At play I'm less bothered about security and so the skills I've written to help me build stupid one-shot exploratory websites are less about security and more about refactoring and exploring concepts.
I just removed superpowers from my own setup. In my opinion, given the quality of the planning modes in both Claude Code and Codex, superpowers was really just slowing things down and burning more tokens than vanilla.
Thank you for the data point.
To give back as much as I can, I use the two built-in CC review processes when appropriate. But, those only do "is this PR good code?"
Far too late did I finally roll my own custom review skill that tests: "does this PR accomplish what the specs required?"
If I could ask for one more vanilla CC skill, it might be that. However, maybe rolling your own repo-aware skill via prompt is better?
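A repo-aware spec review skill like the one described could stay very small. This is a hypothetical sketch in the same SKILL.md frontmatter style, with wording invented for illustration:

```markdown
---
name: spec-review
description: Review a PR against its original spec. Use after implementation, before merge.
---

# Spec review

1. Read the linked spec or issue and list its acceptance criteria.
2. For each criterion, point to the code or test in this PR that satisfies it.
3. Flag any criterion with no corresponding change as unmet.
```

The key difference from a generic code review is step 2: the review is anchored to the spec's criteria, not to code quality in the abstract.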
Looks like a bunch of canned skills served through a plugin?
I was surprised how long some of these skills are. They are pages and pages long with tables and checkbox lists and code examples, etc.
Curious how normal that is - it would only take a couple of these to really fill the context a lot.
I have written zero skills, so I'm not sure how normal it is. I counted the words in a couple of them and they seem to be around the 2k range, so 5 skills would be around 10k. Even at a small LLM context of 128k, that's still around 10%. And for a 1M context window like the big ones, it barely registers.
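The back-of-envelope math above mixes words and tokens; a rough conversion makes the estimate slightly more honest. The ~1.3 tokens-per-English-word ratio is a common rule of thumb, not an exact figure for any particular tokenizer:

```python
# Rough context-budget estimate for a handful of long skill files.
WORDS_PER_SKILL = 2_000
NUM_SKILLS = 5
TOKENS_PER_WORD = 1.3  # rule-of-thumb for English text; varies by tokenizer

total_tokens = WORDS_PER_SKILL * NUM_SKILLS * TOKENS_PER_WORD
for context in (128_000, 1_000_000):
    share = total_tokens / context
    print(f"~{total_tokens:,.0f} tokens is {share:.1%} of a {context:,}-token context")
```

Either way, the conclusion above holds: around 10% of a 128k window, and barely noticeable at 1M.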
Naming things is such a hard problem that many devs don't even bother trying.
That being said, this post is full of reasonable assertions, so I'm looking forward to experimenting with this... whatever it is.
Wait, shit, are people using LLMs to name things now? I'm definitely out of a job then!
I've been using Agent Skills on a new side project and I'm really impressed so far! It really holds my hand a lot of the way and really lets me focus on developing a product instead of figuring out how to build it. I get to focus much more energy on high level architecture and product design.
Very grateful for this repository and everyone who contributed to it!
Thanks for this, going to steal a lot of this. I would install your plugin, but I worry about being able to delete it later. I also think that each one of these is better served customized to a developer. That said, I'm still going to grab some of these, thanks!
> This isn’t a coincidence. It’s the same SDLC every functioning engineering organisation runs, just in different vocabulary. [...] Amazon calls it the working-backwards memo and the bar raiser. Every healthy team has some version of this loop.
This (SDLC == working backwards & bar raiser) is so horribly wrong that I hope it was an LLM hallucination.
I wonder how this compares to superpowers.
I adopted a couple of these; the API design and UI testing ones have been particularly helpful.