Boy, I'd like to read a compact non-LLM version of the key concepts here. The signal-to-noise ratio is very low, and the post is crafted with weird LLM-isms throughout and very hard to parse.
I've been experimenting with mapping a zettelkasten system over to my agents with a few goals in mind, not least applying the idea of more 'test time compute' to the storing of memories as a way to add useful structure that can be tapped later during retrieval. (github.com/vessenes/zet - MIT license - no warranties)
There's some good and some bad, but I think it's better than just a raw embedding memory store for agents. It's definitely better for a human in that it's navigable and understandable, while remaining useful for agents.
But, I'd really like to read more about the space and get ideas -- this blog post was just too difficult to parse for me, sadly.
fair - repetition and density got in the way in places. cleaning that up for the next one.
the zet description sounds interesting - test-time compute at storage time especially.
is the repo public somewhere? github.com/vessenes/zet 404s for me.
I started reading this and right away hit something that doesn't really make any sense to me:
> the extractor. the thing that reads conversation transcripts and decides what to keep.
> the most consequential choice an extractor makes is timing. extract eagerly, after every message, and you spend tokens on small talk that goes nowhere. extract lazily, at the end of a session, and the context you needed to resolve a pronoun is already gone.
If the input is coming from a transcript, then either that transcript contains enough context to understand what a particular pronoun refers to, or it doesn't.
If it does, why would waiting until the end of a session be a problem? What am I missing?
good catch - the example is sloppy. the real issue is lost-in-the-middle on long transcripts: the extracting model attends worse to material between endpoints, so "the transcript is still there" doesn't mean the extraction sees it equally.
separate tradeoff worth naming - do you want "memories" available within the session, or only after the conversation has ended? that was what i was trying to convey in this paragraph
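For anyone who wants the tradeoff in concrete terms, here is a minimal sketch of a middle ground between per-message and end-of-session extraction: run the extractor over short overlapping windows, so no call sees a long transcript and the overlap carries a little context across window boundaries. This is not taken from the post or from any particular memory framework. The names `windows`, `extract_windowed`, and `toy_extract` are hypothetical, and `toy_extract` is a keyword filter standing in for whatever model call a real extractor would make.

```python
from typing import Callable, Iterator

# Hypothetical types: a message is plain text, and an extractor maps a
# window of messages to a list of memory strings.
Message = str
Extractor = Callable[[list[Message]], list[str]]


def windows(messages: list[Message], size: int, overlap: int) -> Iterator[list[Message]]:
    """Yield windows of up to `size` messages; consecutive windows share `overlap` messages."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    for start in range(0, len(messages), step):
        yield messages[start:start + size]
        if start + size >= len(messages):
            break  # the last window already reached the end of the transcript


def extract_windowed(
    messages: list[Message],
    extract: Extractor,
    size: int = 8,
    overlap: int = 2,
) -> list[str]:
    """Run `extract` on each window and drop exact duplicates, keeping first-seen order."""
    seen: set[str] = set()
    memories: list[str] = []
    for window in windows(messages, size, overlap):
        for memory in extract(window):
            if memory not in seen:
                seen.add(memory)
                memories.append(memory)
    return memories


def toy_extract(window: list[Message]) -> list[str]:
    """Stand-in for a model call: keep messages that state a preference."""
    return [m for m in window if "prefer" in m.lower()]
```

Window size and overlap are the knobs here: `size=1, overlap=0` is the eager extreme, and a `size` at least as large as the transcript is the lazy one. Running per window as the session goes on also makes memories available before the conversation ends.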
This is a great post and I really appreciate making the cognitive science terminology clear.
The author does a great job of laying out what is missing from current memory frameworks for agents, but what is also missing, in my opinion, is an argument about whether these missing components are necessary.
fair — this post mapped the gaps without making the case for whether filling them changes what an agent can do. the interesting ones are procedural and prospective. both deserve their own post.
thanks for the read.
Hopefully I didn’t sound too critical of the post because this wasn’t my intention. The post delivered what was needed and thank you for this!
I asked because, if we don’t need the rest, it would be better not to use this terminology for these systems. We already anthropomorphize LLMs too much, and although I get the marketing value of that, it’s not always to the benefit of the people who interact with them.
Please do write the rest of the posts!
not at all, I appreciate your comments!
yeah i agree with you on not using the terminology. it's intuitive, but it's also confusing. it's tempting to use it, but i share your sentiment
Thanks for writing this, and I look forward to the one on procedural memory.
Seems like teams are encoding procedural knowledge in skills repositories, and I wonder if there’s additional utility from an auto-created procedural memory layer.
Spidey senses going off here. The first two comments read like an LLM.
yeah i used cc to help me write the post itself and the comment, my bad
everything is computer
that's beautiful! wow!
A seminal post
thanks for reading