It's an interesting idea, but I feel like it's missing almost the most important thing: the context of the change itself. When I review a change, it's almost never just about the actual code changes, but about reviewing it in the context of what was initially asked, and how it relates to that.
Your solution here seems to exclusively surface "what" changed, but it's impossible for me to know if it's right or not unless I also see the "why" first and/or together with the change itself. So the same problem remains: instead of reviewing in git/GitHub/Gerrit and separately figuring out the documents/resources that lay out the task itself, I still have to switch and confirm things between the two.
I agree, that's also really important and something we're brainstorming
Currently on Stage we also generate a PR summary next to the chapters and that's where we want to do more "why" that pulls in context from Linear, etc.
And I know there are a lot of cool teams like Mesa and Entire working on embedding agent context into git history itself, so that could be an interesting area to explore as well
I assume this problem could be solved if we write up what we actually want (like a GH issue), and maybe in the future the guys at Stage could use GitHub issues as part of their PR review?
> Stage automatically analyzes the diff, clusters related changes, and generates chapters.
Isn't that what commits are for? I see no reason for adding this as an after-thought. If the committers (whether human or LLM) are well-behaved, this info is already available in the PR.
In our experience, it's difficult to create well-mannered commits as you code and new ideas pop into your head or you iterate on different designs (even for LLMs). One concept we toyed around with was telling an LLM to re-do a branch using "perfect commits" right before putting up a PR. But even then you might discover new edge cases and have to tack them on as additional commits.
We thought git wasn't the right level of abstraction and decided to tackle things at the PR level instead. Curious to hear your experiences!
I feel that grouping related changes in commits can be challenging, as git really presents commits as groupings in time, not by topic.
It is certainly possible to do topic-grouping in commits, but it requires significant effort to get that consistent at the team level.
I concur. I cannot accept that we are so disconnected from what we're building that we can't go back and revise our commits or something else to make it make sense.
> more and more engineers are merging changes that they don't really understand
You cannot solve this problem by adding more AI on top. If lack of understanding is the problem, moving people even further away will only worsen the situation.
I agree, and that's why we're not building a code review bot which aims to take humans out of the loop
We don't think of Stage as moving people further away from code review, but rather using AI to guide human attention through the review process itself
Looks kind of neat, like a devon.ai review / ReviewStack crossover. But as I tell every one of the dozens of projects trying to make a commercial review tool: I would rather spend a week vibe-copying this than onboarding a tool I have to pay for and am at the mercy of whoever made it. It's just over for selling SaaS tools like this. For agents I also need this local, not on someone's cloud. It's just a matter of time until someone does it.
Thanks for the feedback! re: local vs cloud, I do think there is cool work to be done around unifying the writing/reviewing experience locally, but we started with cloud because we designed this as a collaborative product with teams in mind
Why is this a service and not an open source project? It doesn't seem to do much other than organize your commits within a PR (this could be run once on a dev machine and shipped in the code, then displayed separately). It also builds a dashboard for PRs that's not too far off from what GitHub already offers, but that could likewise be represented with fairly small structured data and displayed separately.
Open source is something we're thinking about! We've just been focused on building for now but it's definitely not off the table
Looks amazing. I've been trying different stacking PR tools and Graphite and this looks to be the most human-centric so far. I'll have a shot at using this within our team soon. Congrats on the launch!
Thank you! Let us know any ways we can make it better
This is a really cool idea but where's the moat? What's stopping someone from replicating the functionality?
Thanks! I think we're really focused on making the overall review experience as guided and obvious as possible for the human. Chapters is a great start but we're coming up with more ideas on how we can make the process even easier
Hmm. All of the examples simply describe what the code is doing. I need a tool that explains the intent and context behind a change.
Exactly. "Why was this change made"? "What were the options"? "Why this is a good way of doing it"? "What are the subtle things I came across while making this change"?
Yep that's something we're actively working on! would love to hear any perspectives on best ways to approach this
Totally different part of the reviewing experience, but I would love to see PR comments (or any revisions really) be automatically synced back to the context coding agents have about a codebase or engineer. There’s no reason nowadays for an engineer or a team of engineers to make the same code quality mistake twice. We manually maintain our agents.md with codebase conventions, etc, but it’d be great not to have to do that.
100%. A big part of code review in my mind is to automate away specific mistakes and anti-patterns across a team. I think there are a lot of interesting things to be done to merge the code writing and code reviewing cycles.
I've been working on that as a small open source tool: https://github.com/smithy-ai/smithy-ai
It keeps a repository with markdown files as the agent context, makes those available (via a simple search and summarise MCP) and when closing a merge request it checks whether the context needs updating based on the review comments. If it needs updating a PR is opened on the context repository with suggested changes/additions.
I assume GitLab/GitHub will add this sort of feature to their products within the next few months
It's possible, but at the same time it's been years and they haven't copied things like Graphite's dashboard or stacked PR interface yet. We have the advantage of speed :)
Does Stage work for PRs that have multiple commits? These could be considered "stacked diffs", but in the same PR.
Chapters are regenerated every time a new commit is pushed to a PR. Our thinking is that the chapters should serve as "auto stacked diffs" since they should follow a logical order.
Do you or your team use stacking in your workflows?
The idea of a workplace where people can't be bothered to read what the AI is coding, but someone else is expected to read it and understand whether it's good or slop, just doesn't really add up.
I personally see the value of code review, but I promise you the most vocal vibe coders I work with don't at all, and honestly it feels like something that could just be automated, even to me.
The age of someone gatekeeping the codebase and pushing their personal coding style foibles on the rest of the team via reviews doesn't feel like something that will exist anymore if your CEO is big on vibe coding.
Agreed that agents are definitely handling more and more of the coding side, and there's almost no doubt they'll produce less slop over time.
In our view, even vibe coders should understand how the codebase works, and we think review is a natural place to pause and make sure you know what you and your coworkers are shipping. And we should have tools to reduce the mental load as much as possible.
Do you think there's a problem of cognitive debt among your coworkers who aren't reading the code or reviewing PRs?
I like the chapters thing: a lot of PRs I review should really be like 5 PRs, so it's nice to have it auto-split like that.
Do you see a world where it splits them up on the git level?
> a lot of PRs I review should really be like 5 prs
Can't you push back on that? I feel like this tool is trying to fix misbehaved colleagues...
Yeah, but we're a small company and sometimes cut corners to move faster, so if a tool can solve this instead of potentially adding more friction to other engineers I'm all for it.
Yeah that could be useful, especially with the increased popularity of stacked PRs
But I see it working together with chapters, not instead of them, because it's still good to see the granularity within a PR
We have the same problem, and I came up with this:
https://sscarduzio.github.io/pr-war-stories/
Basically it distills knowledge from PR reviews back into Bugbot fine-tuning and CLAUDE.md,
so the automatic review catches more and the code assistant produces more aligned code.
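A pipeline like this lives or dies on not letting the same lesson pile up twice. As a toy sketch of that deduplication step (the real pipeline presumably uses an LLM or embeddings; stdlib `difflib` stands in here, and the threshold is an arbitrary assumption):

```python
# Toy dedup check: before a new "pearl" lands in a LESSONS.md-style file,
# drop it if it's near-identical to one already recorded.
import difflib

def is_duplicate(new_lesson: str, existing: list[str], threshold: float = 0.8) -> bool:
    """True if new_lesson is close enough to any existing lesson to skip."""
    for lesson in existing:
        ratio = difflib.SequenceMatcher(None, new_lesson.lower(), lesson.lower()).ratio()
        if ratio >= threshold:
            return True
    return False

lessons = ["Always paginate list endpoints", "Never log raw auth tokens"]
print(is_duplicate("always paginate list endpoints!", lessons))  # near-duplicate
print(is_duplicate("Prefer UTC timestamps in the DB", lessons))  # genuinely new
```

String similarity is a crude proxy; two differently worded comments can encode the same lesson, which is where the generalisation step earns its keep.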
This is really cool and we definitely have this problem as well. I really like the flowchart deciding on where to put each learning. Will have to try it out!
Do you find that the list of learnings that end up in BUGBOT.md or LESSONS.md ever gets too long? Or does it do a good job of deduplicating redundant learnings?
Thanks! We have ~1000 PRs/year. Seniors are far outnumbered by juniors, and a lot of knowledge is transferred via PR messages.
The deduplication and generalisation steps really help, and the extra Bugbot context ends up at just about 2000 tokens.
The global LESSONS.md has fewer than 20 “pearls” with brief examples
Nice! Will try it out
Y’all are a bit nuts if you want 50% more per month than Claude Pro for this.
Can reviewers adjust the chapter splits manually if they disagree with how it grouped the PR, or are the chapters fixed once generated?
We don't support that currently, but would love to see examples where you disagree with the chapters so we can figure out the best interface
You can regenerate the chapters anytime, but it might lead to similar results as the first time
We're also planning on adding functionality to support some sort of CHAPTERS.md file that lets you specify how you want things broken down!
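Since that feature doesn't exist yet, purely as a speculative sketch, such a CHAPTERS.md might look something like this (every heading and path below is invented):

```markdown
<!-- Hypothetical CHAPTERS.md — format not finalized -->
# Chapters

## API surface
Group changes under src/api/ and any OpenAPI spec updates.

## Storage migration
Group the schema migration and repository-layer changes together.

## Everything else
Tests, docs, and mechanical renames go last.
```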
CHAPTERS.md sounds like a good idea for when the auto-grouping doesn't match the actual structure of the work. Looking forward to seeing it.
The framing of "humans back in control" resonates. A lot of AI tooling right now optimizes for speed over correctness — the assumption being that AI output is good enough to ship. Stage seems to push back on that. What's been the biggest surprise from early users so far?
Thanks! Yeah we believe strongly that humans need to be in the code review loop to some extent
I think one thing we've seen from early users that surprised us is how chapters quickly became the unit of review for them, as opposed to files - and they've asked us to add functionality to mark chapters as viewed and to comment on them as a whole
Another big surprise: now that agents are writing most (if not all) of the code, we've found that a lot of early users are using Stage not only to review others' PRs but also their own, before having others review them