He's also missed a major step: feed the skill back into the LLM and ask it to critique it. After all, it's the LLM that's going to act on it, so asking it to assess the skill first is kinda important. I've done that for his skills; here's the assessment:
==========
Bottom line
Against the agentskills.io guidance, they look more like workflow specs than polished agent skills.
The largest gap is not correctness. It is skill design discipline:
- stronger descriptions,
- lighter defaults,
- less mandatory process,
- better degraded-mode handling,
- clearer evidence that the skills were refined through trigger/output evals.
Skill            Score/10
write-a-prd      5.4
prd-to-issues    6.8
issues-to-tasks  6.0
code-review      7.6
final-audit      6.3
==========
LLM metaprogramming is extremely important. I just finished an LLM-assisted design-doc authoring session where one of the LLM's own recommendations was "Don't use an LLM for that part, it won't be reliable enough."
This is pretty much a spec-driven workflow.
I do something similar, but my favorite step is the first: /rubberduck, to discuss the problem with the agent, which the command instructs to help me frame and validate it. Hands down the most impactful piece of my workflow, because it helps me reach the right clarity, and I can also use it for non-coding tasks.
After that comes the usual: write PRDs, specs, and tasks, then build, then verify the output.
I started with one of the spec frameworks and eventually simplified everything down to the bone.
I do feel it's working great, but sometimes I fear a lot of this might still be too much productivity theater.
> write PRDs, specs.
How do you avoid these becoming insanely long? It's like I need to plug all these little holes in the side of a water jug: once I've plugged the biggest holes, I realize there are micro holes I still need to plug. Because the model doesn't have a theory of the world, or common sense, or knowledge of my area, or whatever you want to call it.
I think most of us are ending up with a similar workflow.
Mine is: 1) discuss the thing with an agent; 2) iterate on a plan until I'm happy (reviewing carefully); 3) write down the spec; 4) implement (tests first); 5) manually verify that it works as expected; 6) review (another agent and/or manually) + mutation testing (to see what we missed with tests); 7) update docs or other artifacts as needed; 8) done.
No frameworks, no special tools, works across any sufficiently capable agent, I scale it down for trivial tasks, or up (multi-step plans) as needed.
The only thing I haven't seen widely elsewhere (yet) is the mutation testing part. The (old) idea is that you deliberately change the codebase to check that your tests catch the bug. This was usually done with automated mutation tools, but now I can just tell the LLM to introduce plausible-looking bugs.
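As a toy illustration of the idea (standard library only; in practice the "mutation" would be the LLM's plausible-looking bug rather than an AST rewrite):

```python
import ast

# Toy mutation testing: flip a comparison operator in the source,
# then check that the test suite catches the injected bug.

ORIGINAL = """
def is_adult(age):
    return age >= 18
"""

class FlipGtE(ast.NodeTransformer):
    """Mutate `>=` into `>` (a classic off-by-one bug)."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op
                    for op in node.ops]
        return node

def run_tests(namespace):
    """A tiny 'test suite' that includes the boundary case."""
    return (namespace["is_adult"](18) is True
            and namespace["is_adult"](17) is False)

# The original code passes the tests.
ns = {}
exec(ORIGINAL, ns)
assert run_tests(ns)

# The mutant must make the tests fail, proving they cover the boundary.
mutant = ast.unparse(FlipGtE().visit(ast.parse(ORIGINAL)))
ns_mut = {}
exec(mutant, ns_mut)
print("mutant caught:", not run_tests(ns_mut))  # mutant caught: True
```

A surviving mutant (one the tests still pass on) points at a missing test case, which is exactly what the review step is hunting for.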
My workflow is also highly inspired by Matt's skills, but I'm leveraging Linear instead of GitHub.
/grill-me (back-and-forth alignment with the LLM) --> /write-a-prd (creates a project under an initiative in Linear) --> /prd-to-issues (creates issues at the project level). I'm making use of the blockedBy relation when registering the issues, and they land in the 'Ready for Agent' status.
A scheduled project-orchestrator then picks up issues in this status, leveraging subagents. A HITL (human-in-the-loop) status is set on the ticket whenever anything needs my attention. I consider the code to be the 'what', so I let the agent(s) update the issues with the HOW and the WHY. All on a Claude Code Max subscription.
Some notes:
- write-a-prd is knowledge compression and thus some important details occasionally get lost
- The UX for the orchestrator flow is suboptimal. Waiting for this actually: https://github.com/mattpocock/sandcastle/issues/191#issuecom...
- I might have to implement a simplify + review + security audit, call it a 'check', to fire at the end of the project. Could be in the form of an issue.
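For what it's worth, the orchestrator loop described above can be sketched like this (everything here is hypothetical toy code, not the Linear API; `Board`, `Issue`, and the `dispatch` callback stand in for the real workspace and subagent calls):

```python
from dataclasses import dataclass, field
import time

READY, HITL, DONE = "Ready for Agent", "HITL", "Done"

@dataclass
class Issue:
    id: str
    status: str
    blocked_by: list = field(default_factory=list)
    comments: list = field(default_factory=list)

@dataclass
class Result:
    needs_human: bool
    summary: str

class Board:
    """Stand-in for the Linear workspace."""
    def __init__(self, issues):
        self._by_id = {i.id: i for i in issues}
    def issues(self, status):
        return [i for i in self._by_id.values() if i.status == status]
    def get(self, issue_id):
        return self._by_id[issue_id]

def orchestrate(board, dispatch, poll_seconds=60, once=False):
    """Dispatch ready issues whose blockers are all done; park work
    that needs a human in the HITL status."""
    while True:
        for issue in board.issues(status=READY):
            if any(board.get(b).status != DONE for b in issue.blocked_by):
                continue  # respect blockedBy ordering
            result = dispatch(issue)               # run a subagent on the issue
            issue.comments.append(result.summary)  # record the HOW and WHY
            issue.status = HITL if result.needs_human else DONE
        if once:
            return
        time.sleep(poll_seconds)

# Example: A is done, B depends on A, C depends on B and needs a human.
board = Board([
    Issue("A", DONE),
    Issue("B", READY, blocked_by=["A"]),
    Issue("C", READY, blocked_by=["B"]),
])
orchestrate(board,
            lambda i: Result(needs_human=(i.id == "C"),
                             summary=f"how/why for {i.id}"),
            once=True)
print(board.get("B").status, board.get("C").status)  # Done HITL
```

The real version would poll Linear for issues in the 'Ready for Agent' state and hand each one to a Claude Code subagent, but the control flow is the same.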
Congrats! You just rediscovered something called the waterfall model.
Waterfall was bad due to the excessively long feedback loops (months-to-years from "planning" to "customer gets to see it/ we receive feedback on it"). It was NOT bad because it forced people to think before writing code! That part we should recover, it's not problematic at all.
That's not what's considered waterfall, though. Specs are always required for any work, even if they're only in your head, even if the work takes 15 minutes. It's the length of the feedback loop and the resistance to spec change that make waterfall, and given his use of tracer bullets I very much doubt that's the case here, if there was ever any doubt at all.
Did you know that agile is just waterfall scaled down to two weeks? Now you know!
No /s here so just in case this is a serious point:
Agile is a set of four values (and twelve principles) for software development.
Scrum is the two-week development window thing, but Scrum doesn't mandate a two-week _release_ window; it mandates a two-week cadence of planning and progress review, with a focus on doing small chunks of achievable work rather than mega-projects.
Scrum generally prefers lots of one-to-three-day pieces of work; I've yet to see Scrum training that doesn't warn against repeatedly picking up two-week jobs. If that's been your experience, you should look at how you can break work down further to get to "done" on bits of it faster.
All good points here (and yeah I didn't add /s, hopefully "now you know!" was sufficiently obvious over-the-top).
All that said, most orgs I've worked with were following agile processes over agile principles - effectively waterfall with a scrum master and dailies.
This is not to diss the idea of agile, just an observation that most good ideas, once through the business process MBA grinder, end up feeling quite different.
Here's mine: code to spec until I get stuck -> search Google for the answer -> scan the Gemini result instead of going to StackOverflow.
No kids, don't put yourselves through this suffering. If you have to invest this much deliberate effort to sort of make it work - while you still handle the most tedious and boring parts yourself - then what is the point? Let's hold the LLM vendors to their word: they promised intelligent machines that would work so well they'd cause mass unemployment. Why on earth do we have to work around the LLMs to make them work? Where is my nation of datacenter PhDs, or my pocket PhD, depending on whose CEO's misleading statement one quotes?
Why is everyone compelled to write one of these articles? Do they think that their workflow is so unique that they've unlocked the secret to harnessing the power of a pattern generator? Every single one of these reads like influencer vomit.
My workflow hasn't changed since 2022: 1. Send some data. 2. Review response. 3. Fix response until I'm satisfied. 4. Goto 1.
Documenting what I do is fun and relaxing, and it's for me, so I write. The only time I shared mine was with a friend who was getting into coding recently. https://www.nadeem.blog/writing/workflows
It is OK. I actually love looking at other people's work. I may never follow it exactly, but once in a while I find gotchas I can steal and adapt to my own. Let it be, let people express themselves. If not for the veterans with years of experience, then for the people coming in recently, who should find these things worth reading and learning from.
I think your take is overly negative. Regardless of what they think, sharing one's experiences with others is how we advance, both as individuals and as a community/mankind. Talking about AI workflows, I am personally interested in how people who are happy working with AI work, so that I could also be happier with my work. If they write their workflow, I can either learn from it and improve my work, or learn that they are doing something completely different from what I do, which might explain the disparity between people's experiences with AI, or learn that they are spouting nonsense, reaffirming that it might really be mostly hype. Either way, each of these is net positive information for me.
> Why is everyone compelled to write one of these articles?
LinkedIn clout.
Nobody writes about their work thinking the whole world will read it. They write it for their friends, maybe a small group of regular readers, also for themselves. I for one really like it, even if I get bored after reading 5 similar articles, because maybe someone will only ever read one of them, and it’ll help them improve their own work.
> Do they think that their workflow is so unique that they've unlocked the secret to harnessing the power of a pattern generator?
Yes, just like everyone thought their .vimrc was amazing 20 years ago. It is vomit.
> What is AI actually good at? Implementation. What is it genuinely bad at? Figuring out what you actually want
I've found it to be pretty bad at both.
If what you're doing is quite cookie cutter though it can do a passable job of figuring out what you want.