Agents
Agents & MCP
Agents are Claude in a loop: think, act, observe, repeat. Learn when agents solve problems that one-shot calls can't, how to build MCP servers, and patterns for keeping agent state consistent.
What Makes an Agent? The Agentic Loop
An agent is Claude making decisions and taking actions in a loop. Unlike a single API call where Claude responds once, an agent can reason, call tools, observe the results, and decide what to do next. This loop continues until the agent concludes the task is complete.
Most agent loops follow this pattern: Claude receives a task, decides whether to use a tool or provide an answer, calls a tool if needed, receives the result, then decides what to do with that information. This cycle repeats, with each iteration giving Claude more context to make better decisions.
When agents make sense: Tasks that require exploration (gather data, then analyze it). Multi-step workflows where one tool's output feeds into another. Situations where Claude needs to reflect and adjust its approach. When a single prompt doesn't contain enough information to answer in one go.
When they don't: Simple tasks that a direct prompt handles fine. Highly latency-sensitive applications (each loop adds latency). When the problem is well-structured and doesn't need Claude's reasoning ability.
Agent Design: Scope, Tools, and Expectations
Good agents have clear boundaries. Define what the agent is responsible for and what it's not. If you're building a research agent, does it also book meetings? No. If you're building a customer support agent, does it process refunds? Maybe, but only after human approval.
The tools you give the agent shape its behavior. A research agent needs web search and data fetching tools. A support agent needs ticket creation, knowledge base lookup, and escalation tools. Too many tools confuse agents into doing too much. Too few make them feel helpless.
Set clear stopping criteria. How does the agent know when the task is done? Write this into the system prompt and tool definitions. "Research this company" is vague. "Research this company and provide a one-page summary covering their founding, funding, and recent news" gives the agent a concrete goal.
Limit loop iterations. Set a maximum number of tool calls (e.g., 10) to prevent infinite loops or agents that get stuck. When the limit is hit, either force the agent to wrap up or escalate to a human.
MCP Servers: The Tool Layer Abstraction
MCP (Model Context Protocol) servers are a standardized way to expose tools and resources to Claude. Instead of hardcoding tool definitions in your agent, you write an MCP server that Claude can connect to. MCP handles the communication, so Claude doesn't know or care whether it's calling a local function or a remote API.
Building an MCP server means implementing standard methods: one that lists available tools, one that handles tool calls. Your server is agnostic to which Claude is calling it. This modularity matters because you can write one MCP server for database access, another for email, another for web search, and mix and match them across different agents.
MCP server patterns: Keep servers focused. One server should do one thing well: database queries, external API calls, file operations. Document the tools clearly (same rules as before: specific descriptions, clear parameters). Handle errors gracefully — MCP servers should never crash, just return error messages that Claude can work with.
State Management: Keeping the Agent Consistent
Agents maintain state across multiple turns: the task they're working on, data they've gathered, decisions they've made. Managing this state is crucial. If the agent forgets what it's doing or loses context between turns, it becomes ineffective.
The simplest approach: store all messages in the conversation. Every turn, send the full message history to Claude. Claude always knows what's happened so far. This works for small conversations but gets expensive (tokens) and slow for long runs.
Smarter approaches: Summarize old messages when the history gets long. Keep only recent turns and a summary of earlier ones. Store agent state in structured data (what task is being worked on, what's been completed) and pass that instead of raw messages. Use a database to track agent runs, allowing pauses and resumption.
Be explicit about state changes. When the agent gathers new data or makes decisions, confirm it explicitly. "I've learned that X is true. I'll now proceed to step Y." This prevents silent failures where Claude thinks it did something but actually didn't.
Failure Recovery: When Agents Get Stuck
Agents fail. A tool returns unexpected data. A prerequisite step didn't complete. Claude misunderstands what a tool does. Your job is to make failures recoverable.
Detect failures early: If a tool returns an error, tell Claude immediately. Don't hide it and hope the agent figures it out. "Tool call failed: database timeout, retrying..." gives Claude context to adjust.
Provide escape hatches: If the agent has tried the same approach multiple times and keeps failing, break the loop and ask for human help. Add a tool like "escalate_to_human" that the agent can call when it's stuck.
Log everything: Every tool call, every result, every decision. If an agent produces wrong output, logs let you trace what went wrong. Was it a bad system prompt, a misunderstanding of a tool, or bad data?
Ready to test your knowledge?
Take the Agents & MCP practice test to verify your understanding of agent design.
Take the test →