We’ve been working on solving the ARC-AGI benchmark and found that standard Transformers hit a hard ceiling on algorithmic search tasks (the "Compositional Drift" problem mentioned in the abstract).
We decided to try a different architectural approach: The Dual-Stream Programmatic Learner (DSPL).
Instead of one monolithic model, we use a Bicameral Latent Space (rough sketch after the list below):
1. Logic Stream: A recursive planner that handles abstract algorithmic planning.
2. Canvas Stream: An execution state that handles the pixel grid.
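To make the split concrete, here's a minimal PyTorch-style sketch of the two-stream layout. It's illustrative only, not our actual DSPL code; the class names, the GRU cell, and the shapes are hypothetical stand-ins.

    # Illustrative only -- not the DSPL implementation; names and shapes are stand-ins.
    import torch
    import torch.nn as nn

    class LogicStream(nn.Module):
        """Recursive planner over abstract plan tokens."""
        def __init__(self, d_model: int):
            super().__init__()
            self.cell = nn.GRUCell(d_model, d_model)  # stand-in for one recursive planning step

        def forward(self, plan_token, h):
            return self.cell(plan_token, h)  # (B, d), (B, d) -> (B, d)

    class CanvasStream(nn.Module):
        """Execution state over the flattened pixel grid."""
        def __init__(self, d_model: int):
            super().__init__()
            self.update = nn.Linear(d_model, d_model)  # stand-in for the grid update

        def forward(self, grid_tokens):
            return self.update(grid_tokens)  # (B, N_pixels, d) -> (B, N_pixels, d)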
The Engineering Bottleneck: While this separation solves the reasoning drift (accuracy is high), the inference cost of the recursive loop is proving difficult to scale.
We are currently using a Gated Cross-Attention Interface to sync the two streams at every step, but this O(N^2) sync cost is melting our servers under load.
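For context, the per-step sync is roughly the following (a simplified sketch assuming standard multi-head cross-attention; the real interface has more gating structure than this). Every planner token attends over all N canvas tokens, which is where the quadratic cost comes from.

    # Simplified sketch of the per-step sync; not the exact interface from the paper.
    import torch
    import torch.nn as nn

    class GatedCrossAttnInterface(nn.Module):
        def __init__(self, d_model: int, n_heads: int = 8):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.gate = nn.Linear(d_model, d_model)

        def forward(self, logic, canvas):
            # logic: (B, T_plan, d), canvas: (B, N_pixels, d)
            update, _ = self.attn(query=logic, key=canvas, value=canvas)
            g = torch.sigmoid(self.gate(logic))  # learned gate controls how much leaks across
            return logic + g * update            # gated residual update of the planner stream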
My question to the HN crowd:
For those working on dual-stream or "System 2" architectures—is strictly synchronous Cross-Attention necessary? Has anyone successfully decoupled the "Planner" loop from the "Executor" loop (running them asynchronously) without breaking causality?
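To make the question concrete, the kind of decoupling we have in mind is roughly the schedule below (hypothetical, not something in the paper): the executor runs every step, and the planner only re-syncs every K steps against a cached plan. The causality worry is exactly that stale plan between syncs.

    # Hypothetical decoupled schedule: refresh the plan every K steps, let the
    # executor run against a stale cached plan in between.
    from typing import Callable
    import torch

    def decoupled_rollout(planner_step: Callable, executor_step: Callable,
                          plan: torch.Tensor, grid: torch.Tensor,
                          steps: int, K: int = 4):
        for t in range(steps):
            if t % K == 0:
                plan = planner_step(plan, grid)  # expensive sync, now only 1/K as often
            grid = executor_step(grid, plan)     # cheap local update against the cached plan
        return plan, grid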
We are debating switching the interface to a linear-time mechanism (e.g., linear attention, or a state-space model like Mamba), but we're worried about losing the "Sacred Signature" type-safety.
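Concretely, what we're sketching is a kernelized linear-attention interface (the standard positive-feature-map trick, not Mamba itself), where the planner reads an O(N) summary of the canvas instead of attending to every pixel token. Again, a hypothetical sketch, not a shipped interface:

    # Hypothetical linear-attention interface: sync cost is linear in canvas size.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def feature_map(x):
        return F.elu(x) + 1.0  # positive feature map (standard linear-attention trick)

    class LinearCrossInterface(nn.Module):
        def __init__(self, d_model: int):
            super().__init__()
            self.q = nn.Linear(d_model, d_model)
            self.k = nn.Linear(d_model, d_model)
            self.v = nn.Linear(d_model, d_model)

        def forward(self, logic, canvas):
            # logic: (B, T_plan, d), canvas: (B, N_pixels, d)
            q = feature_map(self.q(logic))
            k = feature_map(self.k(canvas))
            v = self.v(canvas)
            kv = torch.einsum('bnd,bne->bde', k, v)                 # O(N) summary of the canvas
            z = 1.0 / (torch.einsum('btd,bd->bt', q, k.sum(1)) + 1e-6)
            return torch.einsum('btd,bde,bt->bte', q, kv, z)        # planner reads the summary

Whether a summarized canvas like this preserves the "Sacred Signature" type-safety is exactly the part we're unsure about.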
Paper link is above. Happy to discuss the trade-offs of Recursion vs. Depth.