Most AI agents are reflexes dressed up as reasoning. They receive a message, reach for the nearest tool, emit an output, and repeat. There is no moment of consideration, no internal debate about whether the action is the right one, no model of what happens three steps from now. The loop is fast, which feels like intelligence — but speed without deliberation is just a very expensive lookup table.
This is the quiet problem sitting at the center of most agent architectures built in the last two years. Developers got access to function-calling APIs, wrapped them in a "while not done" loop, and shipped. The results were impressive enough in demos, brittle enough in production. The agents would hallucinate tool arguments, repeat the same failing call in a loop, or — most dangerously — fire off irreversible side effects on the basis of a single ambiguous instruction. Deleting the wrong file. Sending a draft email. Committing to a booking that should have been a quote.
The fix is not a smarter model. The fix is structure: giving your agent an explicit moment to plan, to critique itself, and to decide whether it is actually ready to act.
The Reactive Loop and Why It Breaks
The standard agentic pattern looks like this: parse input, select tool, call tool, observe result, loop. It mirrors the architecture of early rule-based systems, except the rules are now inferred rather than written. The model is doing the routing, but routing is all it is doing. There is no representation of intent beyond the immediate step, and no mechanism to notice when the immediate step is wrong.
The failure modes are predictable once you know what to look for. Premature commitment happens when the agent treats the first plausible interpretation of an instruction as the correct one and acts on it without checking. Loop blindness happens when the agent retries a failing tool call without any diagnosis of why it failed — the observation feeds back in, but nothing changes in the reasoning. Cascade errors happen when an early wrong action constrains all subsequent actions, and the agent has no mechanism to backtrack.
"A reactive agent is optimised for the case where the first answer is right. Deliberative agents are optimised for the case where it probably isn't."
These are not edge cases. In any domain with ambiguous user intent, noisy tool outputs, or actions that cannot be undone, the reactive loop will fail regularly. The question is whether you find out during development or after deployment.
The Deliberative Pattern: Plan → Act → Observe → Reflect
Deliberative agents introduce explicit cognitive phases between receiving input and producing action. The canonical form is a four-stage cycle. Plan: before calling any tool, the agent produces a written account of its current understanding of the goal, the sub-tasks it believes are required, and the order in which it intends to execute them. Act: it executes one step of the plan. Observe: it records the result, including failures and partial outputs. Reflect: it evaluates whether the observation changes the plan, and if so, how.
The loop does not close back to Act — it closes back to Plan. Every action is preceded by a planning step that incorporates everything learned so far. This sounds slow. In wall-clock terms it is slightly slower. In terms of tokens consumed per successful task completion, deliberative agents are almost always cheaper, because they make fewer wrong turns.
Implementing this is straightforward in principle. The system prompt establishes the four phases and requires the model to label its output with the current phase. A thin orchestration layer parses the phase label and decides what to do next — whether to call a tool, wait for human input, or route the reflection back into a new planning prompt. The model does the reasoning; the orchestrator enforces the rhythm.
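As a rough sketch, that orchestration layer can look something like the following. The model client, the runTool helper, and the PHASE labelling convention are stand-ins for whatever your stack provides, not a fixed API; the only point being illustrated is that every cycle routes back through Plan.

// A sketch of the orchestration rhythm. The model client, runTool helper, and
// PHASE labelling convention are assumptions, not a real API.
declare const model: { complete(prompt: string): Promise<string> };
declare function runTool(name: string, args: unknown): Promise<string>;

type Phase = "PLAN" | "ACT" | "OBSERVE" | "REFLECT";

// Builds a phase-specific prompt over the running transcript.
function buildPrompt(phase: Phase, transcript: string[]): string {
  return `${transcript.join("\n")}\n\nYou are in the ${phase} phase. Respond for this phase only.`;
}

async function deliberate(goal: string, maxCycles = 10): Promise<string> {
  const transcript: string[] = [`GOAL: ${goal}`];

  for (let i = 0; i < maxCycles; i++) {
    // Plan: re-plan every cycle against everything learned so far.
    transcript.push(`PLAN: ${await model.complete(buildPrompt("PLAN", transcript))}`);

    // Act: execute exactly one step; the model names a tool (or a final answer) as JSON.
    const act = await model.complete(buildPrompt("ACT", transcript));
    transcript.push(`ACT: ${act}`);
    const { tool, args, finalAnswer } = JSON.parse(act);
    if (finalAnswer) return finalAnswer;

    // Observe: record the result, including failures, rather than retrying blindly.
    const observation = await runTool(tool, args).catch(err => `TOOL FAILED: ${err}`);
    transcript.push(`OBSERVE: ${observation}`);

    // Reflect: decide whether the observation changes the plan, then loop back to Plan.
    transcript.push(`REFLECT: ${await model.complete(buildPrompt("REFLECT", transcript))}`);
  }
  return "Stopped: reached the cycle limit without completing the goal.";
}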
Chain-of-Thought Before Every Tool Call
The most immediately useful intervention is also the simplest: require the model to produce a brief reasoning trace before every tool invocation. Not a full plan — just a sentence or two that makes the intent explicit.
# Instead of calling the tool directly, prompt for intent first:
system: "Before calling any tool, output a REASONING block explaining:
1. What you expect this call to return
2. How you will use the result
3. What you will do if the call fails or returns nothing"
# The model's response before tool call:
# REASONING: I'm calling search_memory with query="onboarding checklist"
# because the user asked about their setup steps. I expect to get a list of
# items they previously saved. If nothing is returned, I'll ask them to
# describe what they remember saving before searching with alternate terms.
This is not just documentation. The act of articulating intent before calling a tool measurably improves tool argument quality and dramatically reduces cases where the model calls the wrong tool entirely. It also produces a structured trace that makes debugging far easier: you can see exactly what the model thought it was doing at every step.
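If you want the requirement enforced rather than merely requested, a small check in the orchestrator can refuse tool calls that arrive without a reasoning block. The response shape below is an assumption about how your stack surfaces tool calls alongside text.

// A sketch of enforcing the reasoning-first rule in the orchestrator. The shape
// of the model's turn is an assumption about your stack, not a real API.
interface ModelTurn {
  text: string;                                // free-form output, including any REASONING block
  toolCall?: { name: string; args: unknown };  // present when the model wants to act
}

function allowToolCall(turn: ModelTurn): boolean {
  if (!turn.toolCall) return true;             // nothing to gate
  // Require an explicit REASONING block before the call is executed.
  return /REASONING:/.test(turn.text);
}

// If the check fails, re-prompt for the missing block instead of executing:
// "You requested a tool call without a REASONING block. Provide one, then retry."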
Self-Critique Loops and Confidence Thresholds
Some tasks require more than a single round of planning. For these, a self-critique loop — where the agent evaluates its own proposed output before committing to it — produces substantially better results at the cost of one additional inference call.
The pattern is simple: after the agent produces a plan or a draft response, you inject a critique prompt that asks it to evaluate the output against a small rubric. Does this actually answer what was asked? Are there assumptions here that haven't been verified? Is there a simpler path to the goal? The critique feeds back into a revision, and the revision goes out rather than the original.
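In code, a single critique-and-revise round can be as small as two extra prompts. The rubric wording and the minimal model client below are illustrative, not a prescribed interface.

// A sketch of one critique-and-revise round, assuming a minimal model client.
declare const model: { complete(prompt: string): Promise<string> };

async function critiqueAndRevise(task: string, draft: string): Promise<string> {
  const critique = await model.complete(
    `Task: ${task}\nDraft: ${draft}\n` +
      `Critique this draft: Does it answer what was asked? Which assumptions are ` +
      `unverified? Is there a simpler path to the goal?`
  );
  const revision = await model.complete(
    `Task: ${task}\nDraft: ${draft}\nCritique: ${critique}\n` +
      `Produce a revised answer that addresses the critique.`
  );
  return revision; // the revision goes out, not the original draft
}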
For external actions — anything that touches the outside world in a way that cannot be trivially undone — you can pair this with a confidence threshold. Ask the model to rate its confidence in the action on a scale before executing. If the confidence falls below a set value, the agent pauses and surfaces a clarification request rather than proceeding. This is crude but effective: it turns implicit uncertainty into an explicit checkpoint.
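The confidence checkpoint itself is similarly small. A sketch, with an arbitrary 0.8 cutoff and the same minimal model client:

// A sketch of the confidence checkpoint. The 0.8 threshold is arbitrary; tune it per action type.
declare const model: { complete(prompt: string): Promise<string> };

async function confidentEnough(actionDescription: string, threshold = 0.8): Promise<boolean> {
  const raw = await model.complete(
    `Proposed action: ${actionDescription}\n` +
      `On a scale from 0.0 to 1.0, how confident are you that this action is correct ` +
      `and matches the user's intent? Respond with the number only.`
  );
  const confidence = parseFloat(raw);
  // Treat unparseable answers as low confidence rather than proceeding.
  return Number.isFinite(confidence) && confidence >= threshold;
}

// When this returns false, surface a clarification request instead of executing.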
The Pause and Think Primitive
The most important structural addition to any agent handling irreversible actions is a dedicated reasoning step that fires before the action executes — no exceptions. I call this the pause-and-think primitive, and it is the single change that has done the most to reduce production incidents across every agent I have built.
The implementation is a function in your orchestration layer called before any tool that modifies state: file writes, API calls with side effects, emails, database mutations, anything that matters. The function constructs a prompt that presents the proposed action and asks the model to answer three questions: What is the worst case if this action is wrong? Can this action be undone? Is there information I don't yet have that would change whether I take this action?
async function pauseAndThink(proposedAction, context) {
  // Review any state-modifying action before it runs; hold it if the model balks.
  const prompt = `
You are about to execute: ${proposedAction.description}
Before proceeding, answer:
1. Worst-case impact if this action is incorrect?
2. Is this reversible? If so, how?
3. Is any required information still ambiguous?
Respond with JSON only: { "proceed": boolean, "concerns": string[], "clarify": string | null }
`;

  // Assumes model.complete returns the raw completion text; parse the JSON verdict.
  const review = JSON.parse(await model.complete(prompt, context));

  // Hold the action whenever the model declines or still wants clarification.
  if (!review.proceed || review.clarify) {
    return { hold: true, reason: review.clarify ?? review.concerns[0] ?? "unspecified concern" };
  }
  return { hold: false };
}
When hold is true, the orchestrator surfaces the concern to the user rather than executing. This has caught misunderstood instructions, ambiguous target resources, and — on two memorable occasions — actions that were correct in isolation but catastrophic given the user's broader context.
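Wiring the gate into tool execution is one thin wrapper. The tool names and the executeTool helper below are placeholders for your orchestrator's own wiring, not part of any real library.

// A sketch of routing state-modifying tools through the gate before execution.
declare function executeTool(name: string, args: unknown): Promise<unknown>;

const SIDE_EFFECTING_TOOLS = new Set(["write_file", "send_email", "update_post", "call_webhook"]);

async function executeWithGate(toolName: string, args: unknown, context: unknown) {
  if (SIDE_EFFECTING_TOOLS.has(toolName)) {
    const review = await pauseAndThink(
      { description: `${toolName} with ${JSON.stringify(args)}` },
      context
    );
    // Surface the concern to the user instead of executing.
    if (review.hold) return { executed: false, reason: review.reason };
  }
  return { executed: true, result: await executeTool(toolName, args) };
}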
Lessons from SnapMemory and randomnoise.space
Building the retrieval agent behind SnapMemory taught me the reactive loop's limits faster than any benchmark could. Early versions would receive a vague query like "find that thing about the API docs" and immediately fire a vector search, return the top result, and call it done. When the top result was wrong — which happened often, because memory retrieval is genuinely ambiguous — the agent had no way to recover. It had already committed to an answer.
The fix was a planning step before search that forced the agent to enumerate what it actually knew about the query before generating a search vector. What time period might this memory be from? What project or context? What format — a link, a note, a screenshot? That structured decomposition improved first-call retrieval accuracy by a significant margin, not because the underlying retrieval model changed, but because the queries became precise.
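The shape of that enumerate-before-search step, with illustrative field names rather than SnapMemory's actual interface, looks roughly like this:

// A sketch of the decomposition-before-search step. Field names and the
// searchMemory call are illustrative, not SnapMemory's actual interface.
declare const model: { complete(prompt: string): Promise<string> };
declare function searchMemory(query: string, filters?: Record<string, string>): Promise<string[]>;

async function plannedRetrieval(userQuery: string): Promise<string[]> {
  // Force the agent to state what it already knows before any search vector is generated.
  const decomposition = await model.complete(
    `Query: "${userQuery}"\nBefore searching, state as JSON what you can infer: ` +
      `{ "timePeriod": string | null, "project": string | null, "format": string | null, ` +
      `"refinedQuery": string }`
  );
  const { timePeriod, project, format, refinedQuery } = JSON.parse(decomposition);

  // The refined query and filters drive the search, not the raw user text.
  const filters: Record<string, string> = {};
  if (timePeriod) filters.timePeriod = timePeriod;
  if (project) filters.project = project;
  if (format) filters.format = format;
  return searchMemory(refinedQuery, filters);
}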
For the content agent on randomnoise.space, the critical intervention was the pause-and-think gate on any action that touched the publishing pipeline. Drafts could be generated freely, but the moment an action would modify a scheduled post, update metadata, or trigger a distribution webhook, it had to pass through the review step. Twice in the first month that gate caught instructions that had been garbled somewhere in a multi-turn conversation — actions that would have published malformed content or overwritten posts that weren't meant to be touched.
Slower Agents, Faster Outcomes
There is a version of this argument that sounds like it is asking you to make your agent worse — to add latency, to insert friction, to interrupt the smooth flow of automated action with hesitation and review. That framing is wrong, and it is worth being precise about why.
A reactive agent that completes a task in two seconds and gets it wrong has a total cost that includes the time to notice the error, the time to diagnose it, the time to repair whatever the error caused, and — if the error reached a user — the trust that evaporated in the process. A deliberative agent that takes five seconds and gets it right has a lower total cost by almost any measure that matters.
The deliberative pattern does not make agents slow. It makes them reliable. And reliability, compounded across thousands of runs, is what makes an agent worth having in the first place. The pause is not an obstacle to thinking — it is where thinking actually happens. Build it in from the start, not as an afterthought bolted on when something goes wrong. Your future self, staring at an incident report, will be grateful you did.