Playbook

Building an MCP server takes 30 minutes. Building one that works reliably in production takes deliberate design: good tool descriptions, proper error handling, timeouts, and transport awareness.

Custom Server: TypeScript

The TypeScript SDK is the most mature. It requires Node.js 18+ and moduleResolution: NodeNext in tsconfig.json. The example below uses top-level await, so the project must also be built as ES modules.

import { McpServer, StdioServerTransport } from "@modelcontextprotocol/server";
import type { CallToolResult } from "@modelcontextprotocol/server";
import * as z from "zod/v4";
 
const server = new McpServer(
  { name: "app-metrics", version: "1.0.0" },
  {
    instructions: "Server for querying application metrics. Use when asked about performance data, error rates, or user analytics."
  }
);
 
server.registerTool(
  "get-error-rate",
  {
    title: "Error Rate",
    description: "Get the error rate for a service over a time range. Returns rate as a decimal, total request count, and error count. Use for monitoring dashboards and incident investigation.",
    inputSchema: z.object({
      service: z.string().describe("Service name, e.g. 'api-gateway'"),
      hours: z.number().default(24).describe("Lookback window in hours")
    }),
    outputSchema: z.object({
      rate: z.number(),
      total_requests: z.number(),
      error_count: z.number()
    })
  },
  async ({ service, hours }): Promise<CallToolResult> => {
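    // queryMetrics is assumed to be defined elsewhere and to return { errors, total }.
    const data = await queryMetrics(service, hours);
    const output = {
      // Guard against division by zero when the window has no traffic.
      rate: data.total > 0 ? data.errors / data.total : 0,
      total_requests: data.total,
      error_count: data.errors
    };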
    return {
      content: [{ type: "text", text: JSON.stringify(output) }],
      structuredContent: output
    };
  }
);
 
const transport = new StdioServerTransport();
await server.connect(transport);

Three patterns define production quality: descriptions that tell Claude when to use the tool (not just what it does), structured output schemas for machine-readable responses, and the instructions field on the server itself for cross-cutting context.

Custom Server: Python (FastMCP)

Python's FastMCP provides decorator-based registration. Docstrings become tool descriptions automatically.

from mcp.server.fastmcp import FastMCP
import json
 
mcp = FastMCP("app-metrics", json_response=True)
 
@mcp.tool()
def get_error_rate(service: str, hours: int = 24) -> dict:
    """Get the error rate for a service over a time range.
    Returns rate as decimal, total requests, and error count.
    Use for monitoring and incident investigation.
 
    Args:
        service: Service name, e.g. 'api-gateway'
        hours: Lookback window in hours (default 24)
    """
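    # query_metrics is assumed to be defined elsewhere and to return a dict.
    data = query_metrics(service, hours)
    total = data["total"]
    return {
        # Avoid ZeroDivisionError when the window has no traffic.
        "rate": data["errors"] / total if total else 0.0,
        "total_requests": total,
        "error_count": data["errors"]
    }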
 
@mcp.resource("services://all")
def list_services() -> str:
    """List all monitored services."""
    return json.dumps(get_service_list())
 
if __name__ == "__main__":
    mcp.run(transport="stdio")

A more complete Python server with multiple tools and a resource template:

from mcp.server.fastmcp import FastMCP
import json
import subprocess
 
mcp = FastMCP("ci-pipeline", json_response=True)
 
@mcp.tool()
def get_pipeline_status(repo: str, branch: str = "main") -> dict:
    """Get CI pipeline status for a branch.
    Returns latest run status, duration, and failing jobs.
    Use when investigating build failures or checking deploy readiness.
 
    Args:
        repo: Repository name, e.g. 'myorg/api-service'
        branch: Git branch name (default 'main')
    """
    result = subprocess.run(
        ["gh", "run", "list", "--repo", repo, "--branch", branch,
         "--limit", "1", "--json", "status,conclusion,name,updatedAt,databaseId"],
        capture_output=True, text=True, timeout=15
    )
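    # A non-zero exit (auth failure, unknown repo) leaves stdout empty.
    if result.returncode != 0:
        return {"error": result.stderr.strip() or "gh run list failed"}
    runs = json.loads(result.stdout)
    return runs[0] if runs else {"error": "No runs found"}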
 
@mcp.tool()
def get_failed_jobs(repo: str, run_id: int) -> list:
    """Get details of failed jobs in a CI run.
    Returns job names, step names, and error messages.
 
    Args:
        repo: Repository name
        run_id: The workflow run ID from get_pipeline_status
    """
    result = subprocess.run(
        ["gh", "run", "view", str(run_id), "--repo", repo,
         "--json", "jobs",
         # Wrap the filter in [...] so jq emits one JSON array, not a stream of objects.
         "--jq", '[.jobs[] | select(.conclusion == "failure")]'],
        capture_output=True, text=True, timeout=15
    )
    return json.loads(result.stdout) if result.stdout.strip() else []
 
@mcp.resource("repos://{repo}/workflows")
def list_workflows(repo: str) -> str:
    """List all CI workflows configured for a repository."""
    result = subprocess.run(
        ["gh", "workflow", "list", "--repo", repo, "--json", "name,state,id"],
        capture_output=True, text=True, timeout=10
    )
    return result.stdout
 
if __name__ == "__main__":
    mcp.run(transport="stdio")
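
The gh calls above can fail in several ways: a timeout, a missing binary, a non-zero exit, or malformed output. A minimal sketch of a shared helper that folds these into one exception type follows. The name run_json is hypothetical, and how the caller reports the error back to Claude is left open.

```python
import json
import subprocess
from typing import Any


def run_json(cmd: list[str], timeout: float = 15) -> Any:
    """Run a command and parse its stdout as JSON, raising RuntimeError on failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        raise RuntimeError(f"{cmd[0]} timed out after {timeout}s") from None
    except OSError as exc:  # e.g. the binary is not installed
        raise RuntimeError(f"could not start {cmd[0]}: {exc}") from None
    if result.returncode != 0:
        raise RuntimeError(f"{cmd[0]} exited {result.returncode}: {result.stderr.strip()}")
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"{cmd[0]} printed invalid JSON: {exc}") from None
```

Each tool body then reduces to one run_json call plus whatever shaping the tool needs, and every failure carries a message naming the command that failed.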

SDK version warning: Python SDK V1 to V2 is a breaking migration. Pin your version explicitly. FastMCP is the V1 class name; V2 renames it to MCPServer, so check your import paths when upgrading.
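
A pin in the dependency file can be backed by a startup check. The following is a minimal sketch, assuming the SDK is installed from the mcp package on PyPI. The helper names major_version and assert_supported_major are illustrative, not part of any SDK.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.12.3'."""
    return int(version.split(".", 1)[0])


def assert_supported_major(package: str, expected_major: int) -> None:
    """Raise RuntimeError if the installed package has a different major version."""
    installed = metadata.version(package)  # raises PackageNotFoundError if missing
    if major_version(installed) != expected_major:
        raise RuntimeError(
            f"{package} {installed} is installed, but this server targets "
            f"major version {expected_major}"
        )
```

Calling assert_supported_major("mcp", 1) before constructing the server turns a silent import-path mismatch into an explicit startup failure.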

Error Handling Pattern

Never throw from a tool handler. Return isError: true to communicate failures to Claude as structured data rather than crashing the server.

async ({ url }): Promise<CallToolResult> => {
  try {
    const res = await fetch(url);
    if (!res.ok) {
      return {
        content: [{ type: "text", text: `HTTP ${res.status}: ${res.statusText}` }],
        isError: true
      };
    }
    return { content: [{ type: "text", text: await res.text() }] };
  } catch (error) {
    return {
      content: [{
        type: "text",
        text: `Failed: ${error instanceof Error ? error.message : String(error)}`
      }],
      isError: true
    };
  }
}

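Claude interprets isError: true as a failed operation and adjusts its strategy -- retrying, trying an alternative tool, or reporting the failure to the user. An exception that escapes the handler gives up that control: depending on the SDK it surfaces as a generic error with no actionable message, and if nothing catches it, it can take down the server process.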
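
The control flow is the same in any language. The sketch below is plain Python, independent of any SDK: it only illustrates catching everything at the handler boundary and returning the content/isError shape used above. The names text_result, call_safely, and divide are hypothetical, and how a particular SDK maps a returned dictionary onto a tool result should be checked against that SDK's documentation.

```python
from typing import Any, Callable


def text_result(text: str, is_error: bool = False) -> dict[str, Any]:
    """Build a result dictionary in the content/isError shape."""
    result: dict[str, Any] = {"content": [{"type": "text", "text": text}]}
    if is_error:
        result["isError"] = True
    return result


def call_safely(handler: Callable[..., str], **kwargs: Any) -> dict[str, Any]:
    """Run a handler and convert any exception into an error result."""
    try:
        return text_result(handler(**kwargs))
    except Exception as exc:  # deliberately broad: nothing may escape the boundary
        return text_result(f"Failed: {type(exc).__name__}: {exc}", is_error=True)


def divide(a: float, b: float) -> str:
    """Hypothetical handler used only to exercise call_safely."""
    return str(a / b)
```

Including the exception type in the message gives the model something concrete to act on, for example distinguishing a bad argument from an unreachable backend.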

Timeout Pattern

Every tool handler needs a timeout. Claude blocks on tool responses, and a tool that hangs for 60 seconds freezes the entire conversation.

server.registerTool("slow-query", {
  description: "Execute an analytics query (30s timeout)",
  inputSchema: z.object({ query: z.string() })
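}, async ({ query }): Promise<CallToolResult> => {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("Query timed out after 30s")), 30000);
  });
  try {
    // executeQuery is assumed to be defined elsewhere.
    const result = await Promise.race([executeQuery(query), timeout]);
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  } catch (error) {
    // Covers both the timeout and failures inside executeQuery.
    const message = error instanceof Error ? error.message : String(error);
    return {
      content: [{ type: "text", text: `Query failed: ${message}` }],
      isError: true
    };
  } finally {
    clearTimeout(timer); // avoid leaving a pending timer after the race settles
  }
});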

30 seconds is a reasonable ceiling for most tools. For legitimately slow operations, split into "start" and "poll" tool pairs.
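
A start/poll pair can be sketched with an in-memory job table. This is a minimal illustration, not a production design: the names start_job and poll_job are hypothetical, jobs live only in process memory, and finished jobs are never evicted.

```python
import threading
import uuid
from typing import Any, Callable

_jobs: dict[str, dict[str, Any]] = {}
_lock = threading.Lock()


def start_job(work: Callable[[], Any]) -> str:
    """Start work in a background thread and return a job ID immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "running", "result": None, "error": None}

    def runner() -> None:
        try:
            outcome = {"status": "done", "result": work(), "error": None}
        except Exception as exc:
            outcome = {"status": "failed", "result": None, "error": str(exc)}
        with _lock:
            _jobs[job_id] = outcome

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def poll_job(job_id: str) -> dict[str, Any]:
    """Return the current state of a job without blocking."""
    with _lock:
        job = _jobs.get(job_id)
    if job is None:
        return {"status": "unknown", "result": None, "error": "No such job"}
    return dict(job)
```

The "start" tool would return the job ID and the "poll" tool would return poll_job(job_id), so each individual tool call completes well inside the timeout ceiling.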

Remote Transport: Streamable HTTP

For servers deployed to cloud infrastructure, use Streamable HTTP instead of stdio.

import { randomUUID } from "node:crypto";
import { NodeStreamableHTTPServerTransport } from "@modelcontextprotocol/node";
 
const transport = new NodeStreamableHTTPServerTransport({
  sessionIdGenerator: () => randomUUID()
});
await server.connect(transport);

Streamable HTTP handles bidirectional messaging over a single endpoint, with optional SSE for streaming. It survives proxy timeouts and load balancer quirks better than the deprecated SSE transport.

A complete Streamable HTTP server with Express, suitable for cloud deployment:

import express from "express";
import { randomUUID } from "node:crypto";
import { McpServer } from "@modelcontextprotocol/server";
import { NodeStreamableHTTPServerTransport } from "@modelcontextprotocol/node";
import * as z from "zod/v4";
 
const app = express();
app.use(express.json());
 
const server = new McpServer(
  { name: "deploy-status", version: "1.0.0" },
  { instructions: "Deployment status and health monitoring." }
);
 
server.registerTool("get-deploy-status", {
  title: "Deployment Status",
  description: "Get the current deployment status for a service. Returns version, health, and last deploy timestamp.",
  inputSchema: z.object({
    service: z.string().describe("Service name, e.g. 'api-gateway'"),
    environment: z.enum(["staging", "production"]).default("production")
  })
}, async ({ service, environment }) => {
  const status = await fetchDeployStatus(service, environment);
  return {
    content: [{ type: "text", text: JSON.stringify(status, null, 2) }]
  };
});
 
const transport = new NodeStreamableHTTPServerTransport({
  sessionIdGenerator: () => randomUUID()
});
 
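app.all("/mcp", async (req, res) => {
  // express.json() has already consumed the request stream, so pass the parsed body along.
  await transport.handleRequest(req, res, req.body);
});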
 
await server.connect(transport);
app.listen(3001, () => console.error("MCP server on :3001/mcp"));

Configure it in .mcp.json:

{
  "mcpServers": {
    "deploy-status": {
      "type": "http",
      "url": "http://localhost:3001/mcp"
    }
  }
}

Database Servers

Two production-grade options cover most relational databases.

DBHub (Bytebase) -- universal bridge for PostgreSQL, MySQL, MariaDB, SQL Server, SQLite:

claude mcp add --transport stdio db -- npx -y @bytebase/dbhub \
  --dsn "postgresql://readonly:pass@prod.db.com:5432/analytics"

Postgres MCP Pro -- PostgreSQL-specific with safe SQL parsing, execution limits, and connection pooling:

claude mcp add --transport stdio postgres-pro -- npx -y @crystaldba/postgres-mcp \
  --connection-string "postgresql://readonly:pass@prod.db.com:5432/analytics" \
  --read-only

Always use read-only credentials for production databases. Claude should never have write access to production data through MCP.
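
Read-only access can also be enforced at the connection level as a second layer. The sketch below uses SQLite from the Python standard library only because it runs without a database server. The events table and the helper names are hypothetical. For PostgreSQL or MySQL the equivalent control is a role that has been granted SELECT privileges only.

```python
import sqlite3


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite database so that any write attempt fails."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def try_write(conn: sqlite3.Connection) -> str:
    """Attempt an INSERT and report whether the connection allowed it."""
    try:
        conn.execute("INSERT INTO events (name) VALUES ('x')")
        return "write allowed"
    except sqlite3.OperationalError as exc:
        return f"write rejected: {exc}"
```

With this in place a mistaken or injected write statement fails at the database layer regardless of what the tool code or the model does.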

API Integration Servers

GraphQL -- auto-convert .graphql query files to MCP tools:

claude mcp add --transport stdio graphql -- npx -y graphql-mcp-server \
  --endpoint https://api.example.com/graphql \
  --queries ./queries/

REST API wrapper -- configure endpoints via YAML:

# api-config.yaml
apis:
  - name: get_users
    method: GET
    url: https://api.example.com/users
    description: "Fetch user list with optional status filter"
    params:
      - name: status
        type: string
        required: false
  - name: create_ticket
    method: POST
    url: https://api.example.com/tickets
    description: "Create a support ticket with title and priority"
    body:
      - name: title
        type: string
        required: true
      - name: priority
        type: string
        required: true

Monitoring Servers

Datadog (official, remote):

claude mcp add --transport http datadog https://mcp.datadoghq.com

Supports core metrics, alerting, APM, logs, traces, security findings, and synthetic tests. Provides live observability data directly in Claude Code sessions.

Grafana (local):

claude mcp add --transport stdio grafana -- npx -y @grafana/mcp-grafana

Exposes distributed tracing, dashboard management, and natural language queries for exploring services.

CI/CD Servers

GitHub (official, remote):

claude mcp add --transport http github https://api.githubcopilot.com/mcp/

Repository management, issue/PR operations, Actions workflow monitoring, build failure analysis, and release management.

Jenkins:

claude mcp add --transport stdio jenkins -- npx -y mcp-jenkins \
  --url https://jenkins.example.com \
  --user admin \
  --token "$JENKINS_TOKEN"

37 tools for job monitoring, build control, queue management, and pipeline configuration.

Configuration: .mcp.json

The project-scoped configuration file supports environment variable expansion with defaults.

{
  "mcpServers": {
    "db": {
      "command": "npx",
      "args": ["-y", "@bytebase/dbhub", "--dsn", "${DATABASE_URL:-postgresql://localhost:5432/dev}"],
      "env": {
        "DATABASE_URL": "${DATABASE_URL:-postgresql://localhost:5432/dev}"
      }
    },
    "api": {
      "type": "http",
      "url": "${API_BASE_URL:-https://api.example.com}/mcp",
      "headers": {
        "Authorization": "Bearer ${API_KEY}"
      }
    },
    "internal-api": {
      "type": "http",
      "url": "https://mcp.internal.example.com",
      "headersHelper": "/opt/bin/get-mcp-auth-headers.sh"
    }
  }
}

Environment variable syntax:

${VAR} -- expands to the value of VAR
${VAR:-default} -- uses VAR if set, otherwise falls back to the default
Expansion works in command, args, env, url, and headers fields
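
The two forms can be modeled in a few lines. This is a sketch of the substitution rule as described above, not the client's actual implementation. Raising on an unset variable with no default is an assumption made for the sketch, and the function name expand is illustrative.

```python
import re

# Matches ${NAME} and ${NAME:-default}; the default may be empty.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand(text: str, env: dict[str, str]) -> str:
    """Expand ${VAR} and ${VAR:-default} in text using the given environment."""

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"{name} is not set and has no default")

    return _PATTERN.sub(replace, text)
```

For example, expand("${API_BASE_URL:-https://api.example.com}/mcp", {}) falls back to the default host, while the same call with API_BASE_URL present in env uses that value instead.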

Dynamic headers with headersHelper: Runs a shell command that outputs a JSON object of headers to stdout. Executes fresh on each connection (no caching). 10-second timeout. Receives CLAUDE_CODE_MCP_SERVER_NAME and CLAUDE_CODE_MCP_SERVER_URL as environment variables.
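
A helper only has to print one JSON object to stdout. The sketch below shows the shape of such a script in Python. INTERNAL_API_TOKEN is a hypothetical variable standing in for whatever secret source is actually used (a vault CLI, a keychain lookup), and build_headers and main are illustrative names.

```python
import json
import os
import sys


def build_headers(env: dict[str, str]) -> dict[str, str]:
    """Build the header object for the server named in the environment."""
    server = env.get("CLAUDE_CODE_MCP_SERVER_NAME", "unknown")
    # INTERNAL_API_TOKEN is a hypothetical variable; swap in your secret source.
    token = env.get("INTERNAL_API_TOKEN", "")
    if not token:
        raise RuntimeError(f"No token available for MCP server '{server}'")
    return {"Authorization": f"Bearer {token}"}


def main() -> int:
    """Print headers as a single JSON object on stdout; diagnostics go to stderr."""
    try:
        print(json.dumps(build_headers(dict(os.environ))))
        return 0
    except RuntimeError as exc:
        print(exc, file=sys.stderr)
        return 1
```

To use it as a script, call sys.exit(main()) under the usual __main__ guard. Keeping stdout limited to the JSON object matters because anything else printed there would corrupt the output the client parses.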

OAuth Authentication

For servers requiring OAuth:

# Dynamic client registration
claude mcp add --transport http my-server https://mcp.example.com/mcp
 
# Pre-configured credentials with fixed callback port
claude mcp add --transport http \
  --client-id your-client-id --client-secret --callback-port 8080 \
  my-server https://mcp.example.com/mcp
 
# CI/non-interactive: pass secret via environment
MCP_CLIENT_SECRET=your-secret claude mcp add --transport http \
  --client-id your-client-id --client-secret --callback-port 8080 \
  my-server https://mcp.example.com/mcp

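The equivalent .mcp.json configuration for OAuth-authenticated servers nests the client settings under an oauth key. The client secret is not written to the file; supply it through --client-secret or MCP_CLIENT_SECRET as shown above.

{
  "mcpServers": {
    "my-server": {
      "type": "http",
      "url": "https://mcp.example.com/mcp",
      "oauth": {
        "clientId": "your-client-id",
        "callbackPort": 8080
      }
    }
  }
}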

Windows Quirk

On native Windows (not WSL), npx requires a cmd /c wrapper:

claude mcp add --transport stdio my-server -- cmd /c npx -y @some/package

Without this wrapper, the server process fails to start. This is the most common platform-specific MCP issue.