The planner says one thing, the executor does another

The plan was right when it was written. Then the world moved, the observation never fed back, and the agent confidently executed a plan that no longer fit.

Balagei G Nagarajan


Short answer. Your agent does the wrong thing because the part that plans and the part that acts have drifted apart: the planner committed to one step and the executor did another, or the plan was made up front and never updated after the world changed. The fix is to close the loop: replan on every observation and reconcile the executor's action with the plan before it fires, so that intent and action stay pointed the same way.

Planner and executor, linked but diverging. The plan points one way, the action goes another. Hero image.

Key facts.

  • In a study of 1,600+ failed multi-agent runs across 7 frameworks, reasoning-action mismatch, where the agent's stated reasoning and its actual action diverge, accounted for 13.2% of failures. Step repetition, where the agent redoes a step it has already completed, accounted for another 15.7%, the single most common mode (Cemri et al., MAST, arXiv:2503.13657, NeurIPS 2025).
  • That taxonomy puts the bulk of failures in coordination and system design, not the model's raw capability, which is why a stronger model rarely fixes a plan that has drifted from the actions (MAST, 2025).

What is planner-executor desync?

Plenty of agents split thinking from doing: a planner decides the next step or the whole plan, and an executor calls the tools. Desync is when those two stop agreeing. The planner commits to "update the staging record" and the executor calls the production endpoint. Or the plan was right when it was made and then the world moved: a file got deleted, a ticket got closed, and nothing told the plan. MAST calls the clean version of this "reasoning-action mismatch": the agent's narrated reasoning and its actual tool call point in different directions. The output still reads as coherent, because the planner's prose is fluent. The action underneath it is wrong.
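
To make the staging-versus-production example concrete, here is a minimal sketch of auditing a recorded trace for steps where the planned call and the actual call differ. The `Step` record, its field names, and `find_mismatches` are all hypothetical names made up for this illustration; they are not part of MAST or of any agent framework, and a real trace would need a fuzzier comparison than exact equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One recorded step: what the planner committed to and what the executor called."""
    planned_tool: str
    planned_target: str
    actual_tool: str
    actual_target: str


def find_mismatches(trace: list[Step]) -> list[int]:
    """Return the indices of steps whose actual call differs from the planned one."""
    return [
        i
        for i, s in enumerate(trace)
        if (s.planned_tool, s.planned_target) != (s.actual_tool, s.actual_target)
    ]


# Hypothetical two-step trace: the second step planned "staging" but hit "production".
trace = [
    Step("read_record", "staging", "read_record", "staging"),
    Step("update_record", "staging", "update_record", "production"),
]

print(find_mismatches(trace))  # [1]
```

An audit like this only finds the mismatch after the fact. It is useful for measuring how often desync happens, not for preventing it.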

Why does the plan drift from reality?

Because most agent loops plan more confidently than they observe. A plan made up front assumes the world holds still, but each executed step returns new information, and if that observation does not feed back into a replan, the agent keeps running a plan that no longer fits. Interleaving reasoning and acting so each observation updates the next thought was the whole point of ReAct (Yao et al., ReAct, arXiv:2210.03629). Skip that loop, or let planner and executor run as separate agents passing terse handoffs, and state stops being shared. The planner is then reasoning about a world that existed three steps ago, which is how an agent confidently does the wrong, once-correct thing.
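
The difference between planning once and replanning on each observation can be sketched with a toy world. Everything here is an assumption made for illustration: the "world" is just a set of open ticket IDs, `plan` and `execute` are hypothetical stand-ins for a planner and an executor, and the planner reads the world state directly instead of parsing tool output. It is a sketch of the interleaving idea, not an implementation of ReAct.

```python
def plan(goal: set[str], world: set[str]) -> list[str]:
    """Hypothetical planner: one 'close' action per goal ticket that is still open."""
    return [f"close:{t}" for t in sorted(goal & world)]


def execute(action: str, world: set[str]) -> str:
    """Hypothetical executor: apply the action and return an observation string."""
    ticket = action.split(":", 1)[1]
    if ticket not in world:
        return f"error: {ticket} not open"
    world.discard(ticket)
    return f"ok: closed {ticket}"


def run_open_loop(goal, world, external_change):
    """Plan once up front, then execute every step without looking at the world again."""
    steps = plan(goal, world)
    log = []
    for i, action in enumerate(steps):
        if i == 1:
            external_change(world)  # the world moves in the middle of the run
        log.append(execute(action, world))
    return log


def run_closed_loop(goal, world, external_change, max_steps=10):
    """Replan from the current world state before every single action."""
    log = []
    for i in range(max_steps):
        if i == 1:
            external_change(world)  # same mid-run change as above
        steps = plan(goal, world)  # the latest state feeds the next plan
        if not steps:
            break
        log.append(execute(steps[0], world))
    return log


def someone_closes_t2(world):
    """Simulated outside event: another actor closes ticket T2."""
    world.discard("T2")


goal = {"T1", "T2", "T3"}
open_log = run_open_loop(goal, {"T1", "T2", "T3"}, someone_closes_t2)
closed_log = run_closed_loop(goal, {"T1", "T2", "T3"}, someone_closes_t2)

print(open_log)    # ['ok: closed T1', 'error: T2 not open', 'ok: closed T3']
print(closed_log)  # ['ok: closed T1', 'ok: closed T3']
```

The open-loop run still tries to close T2 because its plan describes the world as it was at step zero. The closed-loop run never issues that call, because the plan it acts on was rebuilt after the change.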

The missing feedback edge: observations never reach the planner, so a stale plan keeps driving the executor. Diagram.

How do you keep them in sync?

Close the loop and make the executor accountable to the plan. Feed every observation back into a replan step so the plan reflects the current state, not the starting state. Have the executor echo the planned action and the actual call it is about to make, and block when they disagree; this is the same read-back idea that catches silent success. Share explicit state across any planner-to-executor handoff instead of a one-line summary, since MAST traces most of these failures to coordination and design rather than model strength. And bound replanning so a confused agent does not loop on the same step, which is the most common failure mode in the data.
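
Three of those pieces, the read-back gate, the explicit handoff state, and the replan budget, can be sketched together. This is a minimal illustration under stated assumptions: `Call`, `SharedState`, `gate`, `replan`, and both exception types are hypothetical names invented here, the match is a strict equality check, and the budget of three replans is an arbitrary default.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Call:
    """A tool call, used both for the planned action and the proposed one."""
    tool: str
    target: str


@dataclass
class SharedState:
    """Explicit state passed across the planner-to-executor handoff."""
    planned: Call
    replans_used: int = 0
    history: list[Call] = field(default_factory=list)


class PlanMismatch(Exception):
    """Raised when the executor is about to make a call the plan did not ask for."""


class ReplanBudgetExceeded(Exception):
    """Raised when the agent has replanned more often than allowed."""


def gate(state: SharedState, proposed: Call) -> Call:
    """Read-back check: allow the call only if it equals the planned action."""
    if proposed != state.planned:
        raise PlanMismatch(f"planned {state.planned}, about to call {proposed}")
    state.history.append(proposed)
    return proposed


def replan(state: SharedState, new_plan: Call, max_replans: int = 3) -> None:
    """Accept a new plan, but stop once the replan budget is spent."""
    if state.replans_used >= max_replans:
        raise ReplanBudgetExceeded(f"gave up after {max_replans} replans")
    state.replans_used += 1
    state.planned = new_plan


state = SharedState(planned=Call("update_record", "staging"))

try:
    gate(state, Call("update_record", "production"))
    blocked = False
except PlanMismatch:
    blocked = True  # the production call never fires

allowed = gate(state, Call("update_record", "staging"))
print(blocked, allowed.target)  # True staging
```

Raising on a mismatch is one design choice among several. A gentler variant could route the disagreement back to the planner as an observation, which would then count against the same replan budget.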

The pattern is simple: a plan is a hypothesis, not a fact, and an executor that never checks its actions against the current plan will ship the wrong work fluently. Replan on every observation, reconcile the executor's action with the planner's intent before it fires, and share real state across the handoff. None of that is a bigger model. It is a coordination layer that keeps intent and action pointed the same way, which is what VibeModel builds as the Pattern Intelligence Layer.

Frequently asked questions

Is this just the agent hallucinating?
No. The reasoning can be sound and the plan correct; the failure is that the executed action does not match it, or the plan was never updated after the world changed. MAST tracks reasoning-action mismatch as its own mode for that reason.

Does a single-agent design avoid it?
It reduces the handoff version, but not the stale-plan version. A single agent that plans up front and never replans on new observations drifts the same way. The fix is the observation-to-replan loop, not the agent count.

What is the one change that helps most?
Replan on every meaningful observation and reconcile the executor's next action against the current plan before it fires. That single gate catches both the drifted plan and the action that does not match it.

