MCP Tools: Engineering Actionable Capabilities for Production AI Agents

What Are MCP Tools

MCP Tools are the executable functions exposed by an MCP Server that AI agents can discover and invoke through the Model Context Protocol. Each tool represents a single, well‑defined action — querying a database, creating a ticket, sending a message, or adjusting infrastructure — and is described to the agent via structured metadata: name, description, and input/output JSON Schemas.

MCP Tools are the only MCP primitive that can mutate state in external systems. Unlike Resources (read‑only context that agents fetch) and Prompts (templates that stay within the LLM’s reasoning loop), tools cross the boundary from language to action. They are how an agent turns a user’s request into a real‑world effect.

The technical interface is simple: clients call tools/call with a tool name and parameters, servers execute the corresponding logic and return a result over JSON‑RPC 2.0. Every successful tool call is a bet that the agent correctly inferred intent from the user’s natural language; every failed call exposes a gap in your schema, validation, or authorization design.

Why MCP Tools Matter

Before MCP, every AI agent needed bespoke connector code for every tool it wanted to use. A GitHub integration for Claude looked nothing like a GitHub integration for GPT‑4. The same Postgres connector got rewritten three times.

MCP solves this by letting a single tool implementation work with any compatible client, whatever LLM sits behind it. Write a GitHub MCP server once, and it works with Claude, GPT‑4o, Gemini, and any agent stack that speaks MCP. That cross‑LLM portability is the reason MCP has become the de facto standard for AI tool use. By early 2026, over 10,000 active MCP servers and 97 million monthly SDK downloads had been recorded.

From an engineering perspective, MCP Tools transform tooling from a cost centre into a strategic asset. Each well‑designed tool becomes a building block that can be reused across agents, across teams, and across LLM providers. The protocol handles discovery, versioning, and standard error formats. Your team focuses on business logic, not integration glue.

MCP Capability Model: Tools in Context

MCP defines three primitives, each with a distinct purpose and architectural role.

Capability	Purpose	Direction	Mutability	When to Use
Tools	Execute actions, call APIs, run workflows	Agent → External system	Yes (write, mutate, delete)	Real‑time queries that produce side effects — transactions, state changes, notifications
Resources	Provide read‑only context, documentation, facts	External system → Agent	No	Static knowledge, configuration, reference data that the agent consults
Prompts	Guide conversation with reusable templates	Agent internal	No	Structuring multi‑step tasks, enforcing output formats, maintaining consistency

The most common architecture failure in MCP development is mixing these responsibilities. A tool that writes to a database but also returns large amounts of read‑only documentation is both a state change and a data fetch — it cannot be safely cached, retried, or reasoned about. A resource that triggers side effects violates the read‑only contract that clients rely on for predictability.

Practical guidance: If the operation changes state (creates, updates, deletes, sends, triggers), it belongs in a tool. If it only retrieves data with no side effects, it belongs in a resource. Keep the boundary sharp.

MCP Tools in Agent Architecture

In a typical agent runtime, tools are called after planning and before final response generation. The MCP layer sits between the agent framework and the external system.

In this architecture, a single host application connects to multiple MCP servers. Each server advertises its tool catalogue via tools/list. The agent’s planner selects which tool to call, the MCP client validates parameters against the tool’s JSON Schema, and the call is routed to the server that owns the tool.

Why this architecture matters: The MCP layer becomes the single point of control for authentication, observability, rate limiting, and error handling. Tools from CRM, document stores, and cloud APIs appear to the agent as a unified catalogue, despite running on different servers with different backends.

MCP Tool Lifecycle

Every MCP tool progresses through a lifecycle from initial design to eventual deprecation. The lifecycle is not a linear walkthrough — it is a continuous loop of feedback, monitoring, and refinement.

Stage Details

Stage	Purpose	Failure Mode
Tool design	Define name, description, input/output schemas, handler logic.	Ambiguous description leads agent to misuse tool.
Registration	Add tool to server’s capability registry before transport starts.	Registering after the transport is connected can fail; some SDKs raise an error rather than accept the tool.
Discovery	Client calls `tools/list` after initialisation; caches schemas.	Server doesn’t declare `tools` capability.
Parameter validation	Client checks parameters against JSON Schema before sending.	Agent hallucinates parameters; validation catches it.
Tool invocation	Server executes handler, enforces auth and quotas, returns result.	Timeout, backend failure, invalid parameters.
Observability	Log every invocation with tool name, user, duration, success/failure.	No observability → silent failures, no debugging path.
Cache invalidation	Server sends `notifications/tools/list_changed` when tool catalogue changes.	Client misses notification → operates with stale tool list.

MCP Tool Architecture

A production‑grade MCP server organises its tools into a clear, layered architecture. Each layer has a specific responsibility, and changes to one layer should not cascade unpredictably into others.

Component Responsibilities

Layer	Responsibility	Critical Configurations
Transport	Accepts JSON‑RPC messages; negotiates protocol version.	stdio for local, Streamable HTTP for remote. Streamable HTTP with compression is preferred for high‑throughput deployments.
Auth	Validates OAuth 2.1 tokens on every request. Enforces scopes per tool.	Per‑request validation (not just session‑level).
Registry	Maintains list of available tools, their schemas, and metadata.	Versioned schemas; support for `list_changed` notifications.
Validation	Checks that incoming parameters match the tool’s JSON Schema.	Strict mode: reject any extra parameters.
Tool Handlers	Execute the actual business logic.	Idempotent where possible; timeout set per handler.
Safety	Rate limits, timeouts, sandboxing, audit logging.	Per‑tool rate limits; execution timeouts (e.g., 30s per tool).
Observability	Exports OpenTelemetry traces and Prometheus metrics.	Tool name, duration, success, user ID as attributes.

MCP Tool Design Principles

Building MCP tools that are reliable, secure, and usable by AI agents requires discipline. The protocol gives you a framework; design principles turn that framework into a production system.

1. Single Responsibility Per Tool

Each tool should do exactly one thing. A tool named manage_document that can both read and write files, and also send notifications, is three tools wearing a trench coat. The agent cannot reliably predict its behaviour, and you cannot effectively authorize or monitor it.

Good: read_document, write_document, send_notification as separate tools. Bad: manage_document with an action parameter.

2. Bounded Context per Server

An MCP server should expose well‑defined tools, resources, and prompts that serve a coherent domain — not act as a one‑to‑one API wrapper. A single server focused on “order management” (tools: get_order, update_order, cancel_order; resources: order schemas, status definitions) is far more maintainable than a monolithic server that mixes orders, customers, inventory, and shipping.

3. Schema First, Always

Define clear input/output schemas for every tool. Typed schemas are not optional decorations — they are the contract that prevents ambiguity when AI clients ask for actions. Use strict typing, documented error cases, and consistent naming.

4. Descriptions That Guide (Without Poisoning)

The tool description is what the LLM sees to decide whether to call the tool and how to parameterize it. Write descriptions that are clear, concrete, and safe. Avoid language that could be interpreted as an instruction to bypass security controls.

A tool description should answer three questions for the LLM: (1) What does this tool do? (2) When should I use it instead of another tool? (3) What happens if it fails?

5. Stateless by Default

Tools should not maintain session state across invocations. If state is required (e.g., a multi‑turn approval workflow), store it in the client or in an external state store, not in the tool handler. Stateless tools scale horizontally and survive crashes without losing context.

6. Idempotency for State‑Changing Tools

Any tool that creates, updates, or deletes data must accept an idempotency_key parameter. The server should store the key and the result of the first execution, returning the same result for subsequent calls with the same key. This prevents duplicate operations when the client retries due to network timeouts.
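
The idempotency behaviour described above can be sketched in a few lines. This is a minimal illustration, not a production design: the IdempotencyStore class, the refund_payment handler, and the in-memory dictionary are all hypothetical, and a real server would likely use a shared store with expiry so that keys survive restarts.

```python
import threading
from typing import Any, Callable, Dict


class IdempotencyStore:
    """In-memory map from idempotency key to the first result (illustrative only)."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def run_once(self, key: str, operation: Callable[[], Any]) -> Any:
        # Hold the lock for the whole call so two concurrent retries
        # with the same key cannot both execute the operation.
        with self._lock:
            if key in self._results:
                return self._results[key]
            result = operation()
            self._results[key] = result
            return result


store = IdempotencyStore()
refunds_issued = []  # stands in for the external payment system


def refund_payment(payment_id: str, amount_cents: int, idempotency_key: str) -> dict:
    """Hypothetical state-changing tool handler."""

    def do_refund() -> dict:
        refunds_issued.append((payment_id, amount_cents))
        return {"refund_id": f"rf_{len(refunds_issued)}", "payment_id": payment_id}

    return store.run_once(idempotency_key, do_refund)
```

A retry that reuses the key gets the stored result back and the refund is issued only once; a call with a new key is treated as a new operation.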

7. Fail Explicitly

Return structured errors that clients can interpret. A raw stack trace or an ambiguous {"error": "something went wrong"} leaves the agent guessing. Use error codes that distinguish between client errors (4xx, malformed request) and server errors (5xx, temporary failures), so the client knows whether to retry.

8. Observability by Default

Every tool invocation must produce a structured log entry and a metric span. Without observability, a tool failure is a black box. With observability, you can trace a hallucinated parameter back to a vague description, or a timeout back to an unoptimized query.

9. Least Privilege in Tool Permissions

Each tool should require the minimal OAuth scope necessary for its operation. A tool that only reads orders should not require write scopes. This limits the blast radius of a compromised client or a misused token.

10. Version Tools, Not Schemas

When a tool’s behaviour changes in a backward‑incompatible way, create a new tool version (search_orders_v2) rather than modifying the existing tool in place. Keep both versions registered during migration. Clients that expect the old behaviour continue to work; new clients can adopt the new version.

MCP Tool Lifecycle in Production

Tool Design

A well‑designed MCP tool starts with a clear statement of intent. Before writing a single line of code, answer: What user goal does this tool serve? What external system does it touch? What are the success and failure modes?

Example — flight search tool:

{
  "name": "search_flights",
  "description": "Search for available flights. Returns a list of flight options with prices and availability. Use this when the user asks to find, check, or look up flights.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "origin": { "type": "string", "description": "IATA airport code (e.g., 'JFK', 'LAX')", "pattern": "^[A-Z]{3}$" },
      "destination": { "type": "string", "description": "IATA airport code", "pattern": "^[A-Z]{3}$" },
      "date": { "type": "string", "description": "Flight date in YYYY-MM-DD format", "pattern": "^\\d{4}-\\d{2}-\\d{2}$" },
      "passengers": { "type": "integer", "minimum": 1, "maximum": 9, "default": 1 }
    },
    "required": ["origin", "destination", "date"]
  }
}

The description tells the LLM when to use this tool and what it returns. The input schema provides strict validation — airports must be three uppercase letters, dates must be in ISO format, passenger count is bounded.

Tool Registration

Registration is the act of telling the MCP server which tools it exposes and how to execute them. Registration must complete before the transport is opened — in the TypeScript SDK, calling registerTool after connect() fails with a “Cannot register capabilities after connecting to transport” error. The exact registration API varies by SDK: some use mcp.tool(), others app.register_tool().

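# FastMCP registration (Python)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Flight Search")

class FlightService:  # placeholder backend; swap in the real flight search client
    def search(self, origin, destination, date, passengers):
        return []

flight_service = FlightService()

@mcp.tool()
def search_flights(origin: str, destination: str, date: str, passengers: int = 1) -> list:
    """Search for available flights."""
    return flight_service.search(origin, destination, date, passengers)

if __name__ == "__main__":
    mcp.run()  # tools are registered above, before the transport starts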

The FastMCP decorator automatically extracts the function signature, docstring, and type hints to generate the JSON Schema — removing boilerplate while keeping the contract explicit.

Tool Discovery

After initialisation, clients call tools/list to discover available tools. The server responds with the tool catalogue; for large tool sets it paginates the response, and the client follows nextCursor until every page has been fetched.

Server response (excerpt):

{
  "tools": [
    {
      "name": "search_flights",
      "description": "Search for available flights...",
      "inputSchema": { ... }
    }
  ],
  "nextCursor": null
}

Clients should cache the tool list and subscribe to notifications/tools/list_changed to be notified when the catalogue changes, avoiding unnecessary polling.
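
As a rough sketch of the discovery loop, the function below keeps requesting tools/list until no nextCursor comes back. The send_request callable is a hypothetical stand-in for whatever JSON-RPC transport the client uses, and fake_send_request simulates a server that returns two pages.

```python
from typing import Callable, Optional


def list_all_tools(send_request: Callable[[str, dict], dict]) -> list:
    """Collect every page of tools/list by following nextCursor until it is absent."""
    tools = []
    cursor: Optional[str] = None
    while True:
        params = {"cursor": cursor} if cursor else {}
        page = send_request("tools/list", params)
        tools.extend(page.get("tools", []))
        cursor = page.get("nextCursor")
        if not cursor:
            return tools


# Stand-in for a real server: two pages of results keyed by cursor.
_PAGES = {
    None: {"tools": [{"name": "search_flights"}], "nextCursor": "page-2"},
    "page-2": {"tools": [{"name": "book_flight"}]},
}


def fake_send_request(method: str, params: dict) -> dict:
    assert method == "tools/list"
    return _PAGES[params.get("cursor")]


all_tools = list_all_tools(fake_send_request)
```

The client would cache all_tools and rerun the loop only when a list-changed notification arrives.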

Tool Invocation

When the agent decides to call a tool, the client sends tools/call with the tool name and parameters.

Request:

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "search_flights",
    "arguments": {
      "origin": "JFK",
      "destination": "LHR",
      "date": "2026-12-15",
      "passengers": 2
    }
  }
}

The server validates the parameters against the schema, executes the handler, and returns a result.

Result Processing

A successful tool call returns a structured result that the agent can use in subsequent reasoning.

{
  "content": [
    {
      "type": "text",
      "text": "Found 3 flights: UA123 at $450 (non-stop), BA456 at $520 (non-stop), DL789 at $390 (one stop)"
    }
  ],
  "isError": false
}

For errors, set isError: true and include an error description that the agent can interpret. Raw stack traces belong in logs, not in the response to the agent.

MCP Tool Input Design

Input parameters are the primary interface between the agent’s natural language understanding and your tool’s logic. A well‑designed input schema prevents mis‑calls before they reach your backend.

Use JSON Schema Draft 2020‑12

MCP Tools inputSchema MUST conform to JSON Schema Draft 2020‑12 (or the protocol extension endorsed in SEP‑2106). Stick to the subset that LLM prompting can reliably populate: type, properties, required, description, enum, minimum/maximum, pattern, and default.

Provide Clear Descriptions for Every Parameter

{
  "status": {
    "type": "string",
    "enum": ["pending", "approved", "rejected", "shipped"],
    "description": "Order status filter. One of: pending, approved, rejected, shipped"
  }
}

The description guides the LLM toward valid values and explains what each option means.

Use Enums for Closed Sets

If a parameter has a fixed set of valid values, use enum. The LLM will see the options and select among them, reducing hallucination.

Avoid Deeply Nested Objects

LLMs struggle to generate correctly nested JSON for deeply nested objects. Keep input structures flat. If you need nested data, consider splitting the tool into two simpler tools.

Validate Twice: Schema + Business Logic

Schema validation catches type errors and missing required fields. Business logic validation catches semantic errors — e.g., “refund amount cannot exceed original payment” or “destination airport cannot be the same as origin”. Always validate at both layers.
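
The two layers can be sketched with the standard library alone. The checks below mirror the search_flights schema shown earlier; the function names and error strings are illustrative, and a real server would more likely delegate the first layer to a JSON Schema validator.

```python
import re
from datetime import date

IATA = re.compile(r"[A-Z]{3}")
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_schema(args: dict) -> list:
    """Layer 1: structural checks equivalent to the search_flights input schema."""
    errors = []
    for field in ("origin", "destination", "date"):
        if not isinstance(args.get(field), str):
            errors.append(f"{field}: required string")
    for field in ("origin", "destination"):
        if isinstance(args.get(field), str) and not IATA.fullmatch(args[field]):
            errors.append(f"{field}: must be three uppercase letters")
    if isinstance(args.get("date"), str) and not ISO_DATE.fullmatch(args["date"]):
        errors.append("date: must be YYYY-MM-DD")
    passengers = args.get("passengers", 1)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(passengers, bool) or not isinstance(passengers, int) or not 1 <= passengers <= 9:
        errors.append("passengers: integer between 1 and 9")
    return errors


def validate_business_rules(args: dict) -> list:
    """Layer 2: semantic checks that a schema cannot express."""
    errors = []
    if args["origin"] == args["destination"]:
        errors.append("destination must differ from origin")
    try:
        date.fromisoformat(args["date"])
    except ValueError:
        errors.append("date is not a real calendar date")
    return errors


def validate(args: dict) -> list:
    """Run schema checks first; apply business rules only to well-formed input."""
    errors = validate_schema(args)
    return errors if errors else validate_business_rules(args)
```

Running the business rules only after the schema layer passes keeps the second layer free of defensive type checks.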

MCP Tool Output Design

Tool outputs are returned to the MCP client as a content array of typed items such as text, images, or embedded resources (binary data is base64‑encoded). For text‑based tools (most agent interactions), return well‑organised text that the agent can interpret and present naturally.

Return Structured, Not Raw

{
  "content": [
    {
      "type": "text",
      "text": "Order #12345: Delivered on 2026-06-15. Total: $49.99."
    }
  ]
}

The text should be a complete, human‑readable answer that the agent can present directly. Avoid returning raw JSON that the agent would need to interpret and format.

Include Context for Follow‑up Tools

If the tool is part of a chain (e.g., search_flights followed by book_flight), include enough context in the output for the next tool. Use consistent identifiers across tools.

Distinguish Client Errors from Server Errors

Error Type	HTTP‑inspired Code	Retry?	Description
Bad request	`INVALID_PARAMS`	No	Missing required parameter, wrong type, out of range.
Not found	`NOT_FOUND`	No	Resource does not exist.
Unauthorized	`UNAUTHORIZED`	No (unless new credentials available).	Missing or invalid authentication.
Too many requests	`RATE_LIMITED`	Yes, with backoff	Rate limit exceeded.
Timeout	`TIMEOUT`	Yes	Tool execution exceeded timeout.
Internal error	`INTERNAL_ERROR`	Yes, limited	Unexpected server or backend failure.

Return isError: true and include an error object with a code and message. The agent can then decide whether to retry, ask the user, or choose a different tool.
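
One possible shape for such an error result is sketched below, using the codes and retry guidance from the table above. Serialising the error object as JSON inside a text content item, and the retryable field itself, are assumptions made for this example rather than something the protocol prescribes.

```python
import json

# Retry policy per error code, mirroring the table above.
RETRYABLE = {
    "INVALID_PARAMS": False,
    "NOT_FOUND": False,
    "UNAUTHORIZED": False,
    "RATE_LIMITED": True,
    "TIMEOUT": True,
    "INTERNAL_ERROR": True,
}


def error_result(code: str, message: str) -> dict:
    """Build a tool result with isError set and a machine-readable error payload."""
    if code not in RETRYABLE:
        raise ValueError(f"unknown error code: {code}")
    payload = {"error": {"code": code, "message": message, "retryable": RETRYABLE[code]}}
    return {"content": [{"type": "text", "text": json.dumps(payload)}], "isError": True}


def should_retry(result: dict) -> bool:
    """Client-side helper: decide whether a failed call is worth retrying."""
    if not result.get("isError"):
        return False
    payload = json.loads(result["content"][0]["text"])
    return bool(payload["error"]["retryable"])
```

A client can then back off and retry on RATE_LIMITED or TIMEOUT, and surface INVALID_PARAMS or NOT_FOUND to the agent instead of looping.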

MCP Tool Categories

MCP tools span a wide range of domains. These categories reflect real production deployments, not theoretical possibilities.

Category	Description	Example Tools	Production Use Cases
API Tools	Call external REST, GraphQL, or SOAP APIs.	`get_order`, `create_ticket`, `send_slack`, `refund_transaction`	OIC MCP server exposing enterprise integrations; Grafana Assistant calling GitHub APIs to apply cloud cost optimizations.
Database Tools	Execute read or write queries against SQL, NoSQL, or vector databases.	`execute_sql`, `search_vectors`, `insert_record`, `run_migration`	Business Central MCP server querying ERP records; Document servers retrieving customer data.
Knowledge Tools	Search and retrieve information from knowledge bases.	`search_docs`, `get_article`, `find_similar`, `list_faqs`	Customer support agents accessing help centres; enterprise RAG workflows.
Productivity Tools	Interact with calendars, email, documents, task managers.	`book_calendar`, `send_email`, `create_doc`, `add_task`	Personal assistant agents; team coordination workflows.
Infrastructure Tools	Manage cloud resources, containers, or configuration.	`list_instances`, `scale_deployment`, `restart_service`, `get_metrics`	Cloud FinOps agents identifying and applying cost optimisation actions.
Development Tools	Interact with Git, CI/CD pipelines, code analysis.	`create_pr`, `run_tests`, `analyze_code`, `deploy_service`	GitHub MCP server enabling Claude Code to manage repositories.
Enterprise Systems	Connect to CRM, ERP, integration platforms.	`get_customer`, `search_invoices`, `trigger_workflow`, `approve_expense`	Invoice processing agents using OIC as an MCP tools provider; Oracle Integration exposing existing integrations as discoverable tools.

MCP Tools and Agent Planning

Planning determines which tool to call, when to call it, and how to sequence multiple tool calls. The MCP layer does not do planning — it executes the plan produced by the agent’s planner.

Take a request to book the cheapest flight on a route. The planner calls search_flights first because the flight must be found before it can be selected, and selected before it can be booked. This logical dependency is encoded in the plan, not in the MCP protocol.

MCP’s contribution is to make each of these steps reliable and observable. When the planner calls search_flights, the MCP layer validates parameters, executes the search, and returns a consistent, structured result. The workflow engine handles retries and fallbacks without the planner needing to know about timeouts or network blips.

MCP Tools and Agent Workflows

Workflows orchestrate sequences of tool calls, handling parallelism, conditionals, and error recovery. The MCP client is a participant in the workflow — it dispatches tool calls and returns results, but the workflow engine decides the order.

Sequential Tool Chaining

Each tool’s output becomes input to the next. The workflow engine must manage data flow: the output of search_flights must be available to select_cheapest, and the selected flight ID must be passed to book_flight.
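
A minimal sketch of that data flow follows. The three functions are local stand-ins for the MCP tools named above, and the flight data and booking reference format are invented for illustration; in a real workflow each call would be a tools/call round trip.

```python
def search_flights(origin: str, destination: str, date: str) -> list:
    # Stand-in for the MCP tool; returns fixed sample data.
    return [
        {"flight_id": "UA123", "price": 450},
        {"flight_id": "DL789", "price": 390},
    ]


def select_cheapest(flights: list) -> dict:
    return min(flights, key=lambda f: f["price"])


def book_flight(flight_id: str, passengers: int) -> dict:
    return {"booking_ref": f"BK-{flight_id}", "passengers": passengers}


def run_chain(origin: str, destination: str, date: str, passengers: int) -> dict:
    """Pass each step's output to the next, carrying the flight ID through the chain."""
    flights = search_flights(origin, destination, date)
    choice = select_cheapest(flights)
    return book_flight(choice["flight_id"], passengers)


booking = run_chain("JFK", "LHR", "2026-12-15", 2)
```

The consistent flight_id field is what lets the last step run without re-querying the search.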

Parallel Tool Execution

Parallel execution reduces total latency from the sum of tool durations to the maximum duration. The MCP client must support concurrent tools/call requests; the workflow engine must handle fan‑out and fan‑in without blocking.
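
The latency effect can be demonstrated with asyncio. In this sketch call_tool is a hypothetical stand-in for a tools/call round trip, the sleep simulates backend latency, and the tool names other than search_flights are invented.

```python
import asyncio
import time


async def call_tool(name: str, delay_s: float) -> dict:
    """Stand-in for an MCP tools/call round trip; the sleep simulates latency."""
    await asyncio.sleep(delay_s)
    return {"tool": name, "isError": False}


async def fan_out() -> list:
    # gather() runs the calls concurrently and returns results in argument order.
    return await asyncio.gather(
        call_tool("search_flights", 0.2),
        call_tool("search_hotels", 0.2),
        call_tool("get_weather", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(fan_out())
elapsed = time.perf_counter() - start
```

Three calls of 0.2 s each finish in roughly 0.2 s rather than 0.6 s, because the total is bounded by the slowest call.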

Conditional Execution

The workflow engine evaluates conditions (e.g., if delayed then refund) and routes execution accordingly. MCP tools are invoked only when the condition is met — the protocol does not support conditional execution directly.

MCP Tools and Security

Security is the most critical — and most frequently neglected — aspect of MCP tool development. As of early 2026, only 8.5% of MCP servers in the ecosystem use OAuth. The remaining 91.5% rely on static API keys, shared tokens, or no authentication at all.

The MCP authorization specification is built on OAuth 2.1 with PKCE for HTTP‑based transports. Treat it as mandatory for any remote server: reject anonymous connections at the transport layer, and do not accept static API keys as a substitute for a properly implemented OAuth flow.

Authentication Checklist

OAuth 2.1 with PKCE – Specified for HTTP‑based transports since the March 2025 specification update; treat it as required for every remote MCP server.
Per‑request token validation – Validate the token’s signature, issuer (iss), audience (aud), expiry (exp), and required scope on every tools/call. Do not rely on session‑level authentication only.
Short‑lived access tokens – Tokens must expire within minutes, not hours. Use refresh tokens with rotation.
Redirect URI whitelisting – The OAuth authorization server must maintain a strict allowlist of permitted redirect URIs to prevent interception attacks.
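
The per-request claim checks from the checklist might look like the sketch below. It assumes the token signature has already been verified by a JWT library and that the decoded claims arrive as a dictionary; validate_claims and TokenError are hypothetical names, and signature verification and scope checks are deliberately left out.

```python
import time
from typing import Optional


class TokenError(Exception):
    """Raised when a token's claims do not satisfy the server's policy."""


def validate_claims(claims: dict, issuer: str, audience: str, now: Optional[float] = None) -> None:
    """Check iss, aud, and exp on claims whose signature was already verified."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise TokenError("unexpected issuer")
    aud = claims.get("aud")
    # The aud claim may be a single string or a list of strings.
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise TokenError("token not issued for this server")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise TokenError("token expired")
```

Calling this on every tools/call, not only when the session is established, is what makes a revoked or expired token stop working mid-session.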

Authorization & Tool Permissions

Each tool should require a specific OAuth scope. Clients request the minimum set of scopes needed. The server validates that the token includes the required scope for that specific tool before execution.

Example scope design:

orders:read – tools that only read order data (get_order, list_orders)
orders:write – tools that update order state (update_status, cancel_order)
payments:write – high‑risk payment tools (refund, charge)

Per‑tool authorization (allowlist by scope) is a core security control. Without it, a client with a token for orders:read could call refund if the server does not check scope granularity.
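
A per-tool scope check can be sketched as a lookup followed by a membership test. The mapping uses the scopes and tool names from the example above; the authorize function and AuthorizationError are hypothetical, and the token's scopes are assumed to arrive as a space-delimited string.

```python
REQUIRED_SCOPE = {
    "get_order": "orders:read",
    "list_orders": "orders:read",
    "update_status": "orders:write",
    "cancel_order": "orders:write",
    "refund": "payments:write",
    "charge": "payments:write",
}


class AuthorizationError(Exception):
    """Raised when a token does not grant the scope a tool requires."""


def authorize(tool_name: str, token_scopes: str) -> None:
    """Raise unless the space-delimited scope string grants the tool's required scope."""
    required = REQUIRED_SCOPE.get(tool_name)
    if required is None:
        # Deny by default: a tool without a scope mapping is never callable.
        raise AuthorizationError(f"no scope mapping for tool {tool_name!r}")
    if required not in token_scopes.split():
        raise AuthorizationError(f"{tool_name} requires scope {required}")
```

Denying unmapped tools by default means a newly added tool cannot be called until someone has decided which scope protects it.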

Input Validation and Injection Defense

Validation failures are a leading cause of vulnerabilities. Among 2,614 MCP implementations surveyed, 82% use file operations vulnerable to path traversal, and more than a third are susceptible to command injection.

Validate all inputs twice: first against the JSON Schema (structure and type), then against business rules (allowed values, range limits, permissions). Never pass unvalidated parameters to shell commands, SQL queries, or system calls.
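
For file-handling tools, one common defence against path traversal is to resolve the requested path and confirm it still lies under an allowed base directory. The sketch below uses only pathlib; resolve_inside and the /srv/mcp-data directory are hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir and reject anything that escapes it."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # relative_to raises ValueError when candidate is not located under base.
    candidate.relative_to(base)
    return candidate


safe_path = resolve_inside("/srv/mcp-data", "reports/q1.txt")
```

Both ../ sequences and absolute paths are rejected, because the check runs on the fully resolved path rather than on the raw string.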

Secrets Management

Never hardcode secrets in the server binary or configuration files. Use environment variables and a secret store (HashiCorp Vault, AWS Secrets Manager). Inject secrets at deployment time, not at build time. Never log secrets; redact them from logs and error responses.

Audit Logging

Log every tool invocation with:

Timestamp
Client / user identifier (from token claims)
Tool name and parameters (redact sensitive fields such as passwords, API keys, or PII)
Outcome (success/error)
Duration
Request ID for correlation

Audit logs must be stored in a tamper‑evident system. They are your primary evidence for compliance and post‑incident review.
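
An audit entry with the fields listed above might be assembled as follows. The set of sensitive field names and the audit_entry signature are assumptions for illustration; which fields count as sensitive depends on the tools involved, and shipping the line to tamper-evident storage is out of scope here.

```python
import json
import time
import uuid

# Hypothetical list of parameter names whose values must never be logged.
SENSITIVE_FIELDS = {"password", "api_key", "token", "card_number", "email"}


def redact(params: dict) -> dict:
    """Replace values of sensitive keys, recursing into nested dicts."""
    clean = {}
    for key, value in params.items():
        if key.lower() in SENSITIVE_FIELDS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)
        else:
            clean[key] = value
    return clean


def audit_entry(user_id, tool_name, params, success, duration_ms, request_id=None) -> str:
    """Return one JSON log line for a tool invocation."""
    entry = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "user_id": user_id,
        "tool_name": tool_name,
        "parameters": redact(params),
        "success": success,
        "duration_ms": duration_ms,
        "request_id": request_id or str(uuid.uuid4()),
    }
    return json.dumps(entry, sort_keys=True)
```

Redacting before serialisation means the secret never reaches the log pipeline, which is safer than scrubbing log files afterwards.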

MCP Tool Observability

MCP servers have no built‑in observability. Without instrumentation, tool‑call latency, errors, and performance baselines remain invisible, and you are effectively gambling in production.

Observability improves reliability, identifies security risks (60% of incidents involve unexpected access), and helps allocate resources effectively — 20% of tools handle 80% of requests.

Three Observability Layers

Layer	What to Monitor	Why It Matters
Transport/Protocol	Handshake success rate (target > 99% for HTTP), session duration, JSON‑RPC error codes (-32601 method not found, -32603 internal error). Handshake failures can appear 15–30 minutes before full system outage.	Early warning of connectivity issues before they cause tool failures.
Tool Execution	Latency (p95 target < 500ms for typical tools), success/error rate (distinguish 4xx client errors from 5xx server errors), saturation (rate‑limited calls). Parameter validation errors should remain below 0.5%.	Direct performance and reliability of each tool.
Agentic Performance	Task Success Rate (TSR) — mature systems achieve 85–95%. Turns‑to‑Completion (TTC) — 2–5 turns per task. Tool hallucination rate — 2–8% in mature systems.	End‑to‑end agent effectiveness using your tools.

Metrics and Tools

Prometheus + Grafana – Export tool‑level metrics: tool_calls_total, tool_duration_seconds, tool_errors_total with labels for tool name and error type.
OpenTelemetry – Add spans around each tool handler. Platforms like Heimdall provide OpenTelemetry‑native observability for MCP servers.
Observability SDKs – mcp-observability and observatory-mcp add comprehensive analytics with 2–3 lines of code, capturing latency, success rates, protocol messages, and errors.
Elastic APM – Instrument MCP servers with OpenTelemetry and send the data to Elastic APM to analyze tool‑call traces and identify slow tools.
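
To show what tool-level instrumentation records, here is a standard-library sketch in which in-memory counters stand in for a metrics client. The metric names match those listed above; the instrumented decorator and the get_order handler are hypothetical, and a real deployment would export through a Prometheus or OpenTelemetry SDK instead.

```python
import functools
import time
from collections import Counter, defaultdict

# In-memory stand-ins for exported metrics.
tool_calls_total = Counter()
tool_errors_total = Counter()
tool_duration_seconds = defaultdict(list)


def instrumented(tool_name: str):
    """Decorator that records call count, error count by type, and duration per tool."""

    def wrap(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            tool_calls_total[tool_name] += 1
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                tool_errors_total[(tool_name, type(exc).__name__)] += 1
                raise
            finally:
                # Runs on both success and failure, so every call has a duration.
                tool_duration_seconds[tool_name].append(time.perf_counter() - start)

        return wrapper

    return wrap


@instrumented("get_order")
def get_order(order_id: str) -> dict:
    if not order_id:
        raise ValueError("order_id is required")
    return {"order_id": order_id, "status": "shipped"}
```

Wrapping handlers at registration time keeps the measurement uniform across tools and out of the business logic.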

MCP Tools in Popular Agent Frameworks

Every major agent framework now supports MCP tools through adapters or native integration. This table compares the key approaches.

Framework	Integration Method	MCP Support	Strengths	Limitations
LangGraph	`langchain-mcp-adapters`	Tools only (via conversion to LangChain tools)	Deep workflow orchestration, stateful graphs, checkpointing, parallelism. Multi‑server client supports stdio and HTTP.	Tool conversion adds slight overhead; requires adapter library.
CrewAI	`crewai-tools` library	Tools only	Simple role‑based tool assignment; agents defined with role/goal/backstory.	No native MCP discovery; tools must be pre‑registered.
AutoGen	`autogen-ext-mcp` package	Tools only	Conversational multi‑agent workflows; supports MCP tool discovery and invocation.	No resource or prompt support via MCP.
OpenAI Agents SDK	Native via `HostedMCPTool`	Tools only	Tight OpenAI integration; simple handoffs; built‑in tracing.	Vendor‑lock risk; limited to OpenAI stack.
Semantic Kernel	Plugin integration via MCP	Tools, resources (limited)	Enterprise .NET/Java; plugin ecosystem.	MCP support is less mature than Python frameworks.
LlamaIndex	MCP integration via `llama-index-tools-mcp`	Tools, resources	Strong RAG workflows; data‑first architecture.	Smaller MCP community adoption.

LangGraph + MCP Example (Python):

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent

server_params = StdioServerParameters(
    command="python",
    args=["/path/to/math_server.py"],
)

async def main():
    # async with is only valid inside a coroutine, so the session lives in main()
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake
            tools = await load_mcp_tools(session)  # MCP tools -> LangChain tools
            agent = create_react_agent("openai:gpt-4.1", tools)
            response = await agent.ainvoke({"messages": "what's (3 + 5) x 12?"})
            print(response["messages"][-1].content)

asyncio.run(main())

The adapter converts MCP tools into native LangChain tools, making them usable in LangGraph agents. The same pattern works for MultiServerMCPClient, which connects to multiple MCP servers simultaneously.

CrewAI + MCP Example (High‑Level):

Connect CrewAI agents to MCP servers using the crewai-tools library. The library converts MCP server capabilities into native CrewAI tools, so role‑based agents can invoke MCP‑exposed functions like any other tool.

MCP Tool Best Practices (Checklist)

Design‑Time

Each server exposes a single bounded context — one domain per server.
Each tool has single responsibility — one action, clearly named.
Input and output schemas use JSON Schema Draft 2020‑12 and are fully described.
Tool names follow a consistent convention — verb_noun (search_orders, create_ticket).
Descriptions answer: What does it do? When should it be used instead of another tool? What happens if it fails?

Implementation

Handlers are stateless — no side effects beyond the explicit tool operation.
Idempotency keys are accepted for all state‑changing tools.
Timeouts are set per tool (not globally).
Inputs are validated twice — schema + business logic.
Outputs are structured text, not raw JSON.

Security & Compliance

OAuth 2.1 with PKCE is implemented and mandatory for all remote connections.
Tokens are validated per request — signature, issuer, audience, expiry, scope.
Each tool has a minimum required scope — least privilege per tool.
Audit logging is enabled — every tool call recorded with user ID, timestamp, parameters (redacted), outcome.
Secrets are stored in a vault and injected at deployment time (never hardcoded in source code or configuration files).

Deployment & Operations

All tool calls are instrumented with OpenTelemetry or Prometheus metrics.
Observability data is routed to a monitoring dashboard (Grafana, Heimdall) for alerting and analysis.
Change notifications (notifications/tools/list_changed) are implemented and tested.
Rate limits are configured per tool and per user.
Health checks and readiness probes are defined for the MCP server.

Testing & Validation

Unit tests cover tool handler logic with mocked backends.
Integration tests validate tool discovery and invocation against a real MCP client.
Security tests include parameter injection attempts, malformed tokens, and path traversal.
Load tests verify rate limit thresholds and latency SLIs.

Common MCP Tool Mistakes

Mistake	Consequence	Fix
Exposing too many tools per server	Client overwhelmed; token cost for tool descriptions high.	Keep < 20 tools per server; split into bounded‑context servers.
Poor schema design (missing enums, vague descriptions)	LLM hallucinates invalid parameters; tool fails silently.	Use strict typing; provide examples in descriptions; validate before execution.
No authentication for remote servers	Server becomes open proxy to downstream systems.	Implement OAuth 2.1 + PKCE before production deployment.
Static API keys instead of OAuth	Keys leaked; no per‑request identity; difficult to rotate.	Replace with OAuth 2.1 + short‑lived tokens.
Session‑level authentication only	Revoked tokens still work for already‑authenticated sessions.	Validate token on every `tools/call`.
No tool‑level authorization	Client with a read token can call write tools.	Enforce per‑tool OAuth scopes.
Missing change notifications (`notifications/tools/list_changed`)	Clients use stale tool definitions after server update.	Implement and send notifications when tool catalogue changes.
No observability instrumentation	Cannot debug latency, failures, or security incidents.	Add OpenTelemetry traces and Prometheus metrics from day one.
No idempotency for state‑changing tools	Retry after timeout creates duplicate side effects (e.g., double refund).	Accept and enforce `idempotency_key`.
Overly complex nested parameters	LLM struggles to generate valid nested JSON; failures increase.	Flatten parameters; split tool if necessary.

Case Study: Enterprise Cloud Operations MCP Server

Scenario: A large enterprise deploys an MCP server that exposes cloud cost management and infrastructure tools to AI agents. The server integrates with AWS APIs, Kubernetes clusters, and an internal ticketing system. Agents are used by FinOps teams, on‑call engineers, and security analysts.

Tool Catalogue

Tool	Input Schema	Output	OAuth Scope	Risk Level
`list_instances`	`{ region: string, status: optional string }`	List of EC2 instance IDs and states	`cloud:read`	Low (read‑only)
`get_cost_forecast`	`{ start_date: date, end_date: date, service: optional string }`	Cost forecast by service	`billing:read`	Low
`rightsize_instance`	`{ instance_id: string, new_type: string, reason: string }`	Success / failure	`cloud:write`	High (state change)
`create_savings_plan`	`{ commitment: number, term: string, payment_option: string }`	Savings plan ID	`billing:write`	High (financial)
`deploy_terraform`	`{ config_path: string, auto_approve: boolean }`	Deployment ID	`infra:write`	Critical (infrastructure change)
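
As an illustration of how one catalogue row might translate into a tool definition, the sketch below writes `rightsize_instance` as a dictionary with `name`, `description`, and `inputSchema`, and checks arguments against it. The description strings and the `validate_arguments` helper are assumptions made for this example; the helper covers only the small schema subset used here, and a real server would normally use a full JSON Schema validator.

```python
from typing import Any, Dict, List

# Tool definition for one catalogue row. Descriptions are illustrative wording.
RIGHTSIZE_INSTANCE_TOOL: Dict[str, Any] = {
    "name": "rightsize_instance",
    "description": "Change the instance type of one EC2 instance. State-changing.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "instance_id": {"type": "string", "description": "ID of the instance to resize."},
            "new_type": {"type": "string", "description": "Target instance type."},
            "reason": {"type": "string", "description": "Why the change is requested."},
        },
        "required": ["instance_id", "new_type", "reason"],
        "additionalProperties": False,
    },
}

# Only the JSON types needed by this example are mapped.
_PYTHON_TYPES = {"string": str, "boolean": bool}


def validate_arguments(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the call may proceed."""
    schema = tool["inputSchema"]
    properties = schema["properties"]
    errors: List[str] = []
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        if name not in properties:
            # The schema sets additionalProperties to False, so reject extras.
            errors.append(f"unknown parameter: {name}")
            continue
        expected = _PYTHON_TYPES[properties[name]["type"]]
        if not isinstance(value, expected):
            errors.append(f"parameter {name} must be of type {properties[name]['type']}")
    return errors
```

Rejecting unknown parameters up front means a hallucinated field surfaces as an explicit error rather than being silently ignored.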

Tool Architecture

Security Model

OAuth 2.1 with PKCE — Tokens issued by corporate IdP (Okta). Every request validates signature, issuer, audience, expiry, and scope.
Per‑tool scopes — cloud:read for read operations, cloud:write for rightsizing, billing:write for savings plans, infra:write for Terraform.
Human‑in‑the‑loop for high‑risk tools — deploy_terraform and create_savings_plan require explicit human approval via a separate workflow before execution.
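
The per-tool scope and approval rules above could be enforced with a check along the following lines. It is a sketch under stated assumptions: the token's signature, issuer, audience, and expiry have already been validated elsewhere, and `approval_id` is a hypothetical reference produced by the separate approval workflow.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Scope required by each tool, taken from the tool catalogue.
TOOL_SCOPES = {
    "list_instances": "cloud:read",
    "get_cost_forecast": "billing:read",
    "rightsize_instance": "cloud:write",
    "create_savings_plan": "billing:write",
    "deploy_terraform": "infra:write",
}

# Tools that need explicit human approval before execution.
APPROVAL_REQUIRED = frozenset({"deploy_terraform", "create_savings_plan"})


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def authorize(
    tool_name: str,
    token_scopes: FrozenSet[str],
    approval_id: Optional[str] = None,  # hypothetical approval reference
) -> Decision:
    """Decide whether a tools/call request may run. Token validation is assumed done."""
    required = TOOL_SCOPES.get(tool_name)
    if required is None:
        return Decision(False, "unknown tool")
    if required not in token_scopes:
        return Decision(False, f"missing scope {required}")
    if tool_name in APPROVAL_REQUIRED and approval_id is None:
        return Decision(False, "human approval required")
    return Decision(True, "ok")
```

Running this check on every call, rather than once per session, is what lets a narrowed or revoked scope take effect immediately.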

Observability Implementation

Metrics — Prometheus exports tool_calls_total, tool_duration_seconds (p50/p95/p99), tool_errors_total labelled by tool name and error type. Dashboard in Grafana shows success rates, latency trends, and rate‑limited calls.
Traces — OpenTelemetry spans for every tool call, propagated from the client to the server.
Audit logs — JSON structured logs with {timestamp, user_id, tool_name, parameters(redacted), success, duration_ms}. Stored in a tamper‑evident S3 bucket.
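
A structured audit record with the fields listed above might be produced as follows. The set of sensitive keys and the `[REDACTED]` marker are assumptions for illustration; shipping the line to tamper-evident storage is out of scope here.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Hypothetical list of parameter names whose values must never be logged.
SENSITIVE_KEYS = frozenset({"password", "token", "api_key", "secret"})


def redact(parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the values of sensitive top-level parameters with a marker."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in parameters.items()
    }


def audit_record(
    user_id: str,
    tool_name: str,
    parameters: Dict[str, Any],
    success: bool,
    duration_ms: int,
    now: Optional[datetime] = None,  # injectable clock value for tests
) -> str:
    """Build one JSON log line for a completed tool call."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "timestamp": timestamp,
        "user_id": user_id,
        "tool_name": tool_name,
        "parameters": redact(parameters),
        "success": success,
        "duration_ms": duration_ms,
    }
    return json.dumps(record, sort_keys=True)
```

Redaction here is top-level only; nested parameters would need a recursive variant.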

Governance

Tool ownership — Each tool has a designated owner (FinOps team for cost tools, Platform team for infra tools). Ownership is documented in the tool metadata.
Change management — Tool schema changes require approval from tool owner and security team. Breaking changes introduce _v2 tools.
Lifecycle policy — Tools are reviewed quarterly; deprecated tools are marked with a warning in the description for one release cycle before removal.

Production Outcomes

After deployment, the Cloud Ops MCP server processed over 50,000 tool calls per month:

Tool success rate: 99.2% (target >99%)
P95 latency: list_instances 180ms; rightsize_instance 1.2s (includes change review)
Human‑in‑the‑loop rate: 15% of rightsize_instance calls, 100% of deploy_terraform
Estimated cost savings: $240,000 annually from rightsizing recommendations executed automatically

The server runs on Kubernetes with 3 replicas, handling up to 300 concurrent tool calls. No security incidents were recorded in the first 12 months of production.

FAQ

1. What is the difference between an MCP Tool and an MCP Resource?
A tool is an action the agent can execute — it has side effects, calls APIs, modifies state, and is model‑controlled. A resource is read‑only context the agent can fetch — no side effects, no state modification, and is application‑controlled. Use tools for “do something”; use resources for “know something”.

2. How many tools should a single MCP server expose?
Keep the tool catalogue focused — typically 5–15 tools per server. Too many tools overwhelm the LLM (selection accuracy drops), increase token cost for tool descriptions, and make authorization boundaries unclear. Split across bounded‑context servers if you need more.

3. What is the purpose of a tool description in MCP?
The description tells the LLM what the tool does, when to use it, and what parameters to pass. It is the primary signal for tool selection. Write descriptions that are clear, concrete, and safe — avoid language that could be interpreted as an instruction to bypass security.

4. Should MCP tools be stateless?
Yes — tools should not maintain session state across invocations. Stateless tools scale horizontally and survive crashes without losing context. If state is required (multi‑turn approval workflows), store it in the client or an external state store, not in the tool handler.

5. How do I handle tool failures in MCP?
Distinguish transient failures (retry with exponential backoff) from permanent failures (surface structured error to agent, which may choose an alternative tool or ask the user). Set per‑tool timeouts and circuit breakers for downstream dependencies.
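
The transient versus permanent split can be sketched as a small retry wrapper. The exception classes and the `call_with_retry` name are hypothetical; the handler is assumed to raise `TransientError` for conditions such as timeouts and `PermanentError` for everything that should not be retried.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Failure that may succeed on retry (timeout, HTTP 503, and similar)."""


class PermanentError(Exception):
    """Failure that will not succeed on retry (invalid input, not found)."""


def call_with_retry(
    action: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,  # injectable for tests
) -> T:
    """Run action, retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return action()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the error to the agent
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
        # PermanentError is deliberately not caught: it propagates at once.
    raise AssertionError("unreachable")
```

Production code would usually add random jitter to the delay so that many clients do not retry in lockstep.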

6. What is the most common MCP tool security vulnerability?
Lack of authentication — as of early 2026, only 8.5% of MCP servers use OAuth. The second most common is insufficient input validation, leading to path traversal (82% of file‑accessing servers) and command injection vulnerabilities.

7. Is OAuth 2.1 with PKCE mandatory for MCP tools?
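Not by the letter of the specification, but treat it as mandatory in production. The March 2025 specification revision introduced an OAuth 2.1‑based authorization framework for HTTP‑based transports: authorization itself is optional, but servers that support it are expected to follow that framework, and PKCE is required for clients. Local stdio servers sit outside this framework and should take credentials from their environment instead. Static API keys are not a substitute.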

8. How do I version an MCP tool?
For backward‑compatible changes (adding optional fields, expanding enums), update the schema in place. For breaking changes (renaming fields, changing types, removing parameters), create a new tool version (e.g., search_orders_v2). Keep both versions registered during migration.
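
One way to keep both versions registered during a migration is sketched below. The registry, the decorator, and the field rename from `customer` to `customer_id` are all hypothetical; the point is only that the old name stays callable and delegates to the new contract.

```python
from typing import Callable, Dict

# Hypothetical in-process registry mapping tool names to handlers.
TOOL_REGISTRY: Dict[str, Callable[..., dict]] = {}


def register(name: str) -> Callable[[Callable[..., dict]], Callable[..., dict]]:
    """Register a handler under a tool name, refusing accidental overwrites."""

    def decorator(handler: Callable[..., dict]) -> Callable[..., dict]:
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool already registered: {name}")
        TOOL_REGISTRY[name] = handler
        return handler

    return decorator


@register("search_orders_v2")
def search_orders_v2(customer_id: str) -> dict:
    """New contract: the parameter is named customer_id (a breaking rename)."""
    return {"customer_id": customer_id, "orders": []}  # placeholder result


@register("search_orders")
def search_orders(customer: str) -> dict:
    """Old contract, kept during migration; delegates to the v2 handler."""
    return search_orders_v2(customer_id=customer)
```

Once clients have moved to `search_orders_v2`, the old entry can be removed from the registry.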

9. How do I test MCP tools without a real backend?
Unit test tool handlers with mocked backend dependencies. Integration test the full MCP JSON‑RPC flow using a test client. Use langchain-mcp-adapters with a mock MCP server to test agent‑tool interactions.
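
A unit test with a mocked backend can be as small as the sketch below, which uses only the standard library. The handler and the backend's `describe` method are hypothetical stand-ins; a real handler would wrap the actual cloud SDK client.

```python
from typing import Any, Dict, Optional
from unittest.mock import Mock


def list_instances_handler(backend: Any, region: str, status: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical tool handler. backend is any object with a describe(region) method."""
    instances = backend.describe(region)
    if status is not None:
        instances = [item for item in instances if item["state"] == status]
    return {"instances": instances, "count": len(instances)}


def test_list_instances_filters_by_status() -> None:
    # The mock replaces the real backend, so no network call is made.
    backend = Mock()
    backend.describe.return_value = [
        {"id": "i-1", "state": "running"},
        {"id": "i-2", "state": "stopped"},
    ]

    result = list_instances_handler(backend, "eu-west-1", status="running")

    assert result == {"instances": [{"id": "i-1", "state": "running"}], "count": 1}
    backend.describe.assert_called_once_with("eu-west-1")
```

Because the backend is injected as an argument, the same handler can be exercised by a test runner such as pytest without any patching.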

10. What is the typical latency for an MCP tool call?
The MCP protocol adds minimal overhead (< 1ms). Most latency comes from network round trip (5–100ms), tool handler execution (database query, API call), and result serialization. Cached read‑only tools can achieve < 5ms p95 latency. Complex tools may take seconds — set appropriate timeouts.

11. How do LangGraph and CrewAI differ in their MCP tool support?
LangGraph (via langchain-mcp-adapters) offers deep workflow orchestration, stateful graphs, and parallel tool execution. CrewAI (via crewai-tools) focuses on role‑based assignment, with a simpler, less granular abstraction. LangGraph is better for complex, stateful workflows; CrewAI excels at linear, role‑based collaboration.

12. What is the purpose of notifications/tools/list_changed?
When the server’s tool catalogue changes (new tool added, tool removed, schema updated), the server sends this notification. The client invalidates its cache and re‑fetches the tool list. Without this, clients may operate with stale tool definitions, leading to method not found errors.
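
On the client side, the invalidate-and-refetch behaviour might look like this minimal sketch. `ToolListCache` and its `fetch_tools` callable are hypothetical; `fetch_tools` stands in for whatever performs the actual `tools/list` request.

```python
from typing import Any, Callable, Dict, List, Optional


class ToolListCache:
    """Caches the tool list and drops it when the server signals a change."""

    def __init__(self, fetch_tools: Callable[[], List[Dict[str, Any]]]) -> None:
        self._fetch_tools = fetch_tools  # performs the tools/list request
        self._tools: Optional[List[Dict[str, Any]]] = None

    def get(self) -> List[Dict[str, Any]]:
        # Fetch lazily: only hit the server when the cache is empty.
        if self._tools is None:
            self._tools = self._fetch_tools()
        return self._tools

    def handle_notification(self, message: Dict[str, Any]) -> None:
        # JSON-RPC notifications carry a method name and no id.
        if message.get("method") == "notifications/tools/list_changed":
            self._tools = None
```

The next call to `get()` after a notification re-fetches the list, so the agent never plans against a removed or reshaped tool for longer than one notification round trip.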

13. Can an MCP tool call another MCP tool?
Not directly. MCP tools are atomic operations — one tool call, one handler execution. If your logic requires multiple backend calls, handle them inside the tool handler. For tool orchestration across different domains, the agent’s workflow engine should sequence the calls, not the tool itself.

14. What is the difference between an MCP Tool and an A2A (Agent‑to‑Agent) tool?
MCP tools are defined locally on an MCP server. A2A tools are external agents registered as tools in a gateway — when the client calls the tool, the gateway forwards the request to the external agent, which may return a tool‑like result. A2A tools extend MCP to cross‑agent communication.

15. How do I monitor MCP tool performance in production?
Instrument every tool handler with OpenTelemetry spans and Prometheus metrics. Export to Grafana for dashboards and alerting. Track success rate (target >99%), p95 latency, error rate by tool, and saturation (rate‑limited calls). Platforms like Heimdall provide OpenTelemetry‑native observability for MCP servers.

16. What is the biggest mistake teams make when building MCP tools?
Designing tools that mirror internal API structures instead of agent‑oriented interfaces. An API with a perform_operation(mode="read") parameter is not a good tool — split it into read_data and write_data. Optimise for LLM comprehension, not backend convenience.

17. How do I handle rate limiting for MCP tools?
Implement rate limiting at the MCP server level — per user, per tool, and globally. Use a token bucket or sliding window algorithm. When a rate limit is exceeded, return a RATE_LIMITED error with a retry_after hint. The client may retry after the suggested interval.
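
A token bucket for a single user and tool pair could be sketched as follows. The shape of the error dictionary is an assumption; only the `RATE_LIMITED` code and the `retry_after` hint come from the answer above.

```python
import time
from typing import Callable, Dict, Optional, Union


class TokenBucket:
    """Token bucket: holds up to capacity tokens, refilled at a steady rate."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self._capacity = float(capacity)
        self._rate = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def try_acquire(self) -> float:
        """Take one token. Return 0.0 on success, else seconds until one is available."""
        now = self._clock()
        # Refill according to elapsed time, never above capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return 0.0
        return (1.0 - self._tokens) / self._rate


def check_rate_limit(bucket: TokenBucket) -> Optional[Dict[str, Union[str, float]]]:
    """Return None if the call may proceed, else a structured RATE_LIMITED error."""
    wait = bucket.try_acquire()
    if wait == 0.0:
        return None
    return {"code": "RATE_LIMITED", "retry_after": wait}
```

Per-user, per-tool, and global limits can then be composed by keeping one bucket per key and requiring all of them to admit the call.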

18. Can MCP tools be used in multi‑agent systems?
Yes — multiple agents can share the same MCP server. Each agent authenticates independently, and the server enforces per‑agent authorization and rate limits. Shared tools reduce duplication and ensure consistency across agents.

19. How do I debug “method not found” errors for MCP tools?
The client’s tool list is most likely stale: either the server sent a notifications/tools/list_changed notification that the client missed, or the client never fetched the list in the first place. Re‑fetch the tool list with tools/list and verify the tool name. In production, ensure both sides implement change notifications correctly.

20. What are the most production‑tested MCP tool frameworks?
LangGraph with langchain-mcp-adapters is the most battle‑tested for complex, stateful workflows. CrewAI excels for role‑based collaboration. Oracle Integration, Grafana, and Workato have demonstrated enterprise‑grade MCP tool deployments across cloud cost, invoice processing, and CRM use cases.

Continue Your Journey

Now that you understand how to engineer production‑grade MCP tools, explore the rest of the MCP ecosystem:

MCP Basics – MCP Introduction (protocol fundamentals and architecture)
MCP Servers – MCP Server (building and deploying capability providers)
MCP Clients – MCP Client (connecting agents to servers)
MCP Resources – MCP Resources (exposing read‑only context)
MCP Prompts – MCP Prompts (reusable templates for agents)
MCP Security – MCP Security (authentication, authorization, audit)
Tool Calling – Tool Calling (how agents select and invoke tools)
Agent Workflows – Agent Workflows (orchestrating multiple tools)

Or return to the Agent Learning Path to see where MCP Tools fit in your overall agent engineering roadmap.

This article is part of the AgentDevPro Production Agent Engineering Handbook. Updated for Q2 2026.
