How to Build an MCP Server in Python (Step-by-Step)

MCP (Model Context Protocol) is how AI assistants like Claude connect to external tools and data. Think of it as a USB-C port for AI: a standard interface that lets any model talk to any tool. If you want the conceptual overview first, start with "What If AI Agents Had a Universal Connector? Meet MCP."

In this tutorial, we’ll build a notes MCP server from scratch using FastMCP 2.0. Not a toy calculator — something you’d actually use. By the end, you’ll have a working server with tools, resources, prompts, SQLite persistence, tests, and deployment basics.

Prerequisites: Python 3.10+, basic Python knowledge, and pip. The full source code is on GitHub.

1. Project Setup

Create a virtual environment and install FastMCP:

python -m venv .venv
source .venv/bin/activate
pip install fastmcp

Create a project directory with a server file and a pyproject.toml:

notes-mcp/
├── server.py
└── pyproject.toml

Initialize your server in server.py:

from fastmcp import FastMCP

mcp = FastMCP("Notes")

if __name__ == "__main__":
    mcp.run()

That’s it. A handful of lines and you have a valid MCP server. FastMCP handles the protocol negotiation, message framing, and transport setup.

Run it with the dev inspector to verify it works:

fastmcp dev server.py

This opens a browser-based inspector where you can see your server’s capabilities and test them interactively. Right now it’s empty — let’s fix that.

2. Your First Tool

Tools are functions the AI can call to perform actions. Let’s add one that creates a note.

We’ll start with an in-memory dictionary for storage (we’ll upgrade to SQLite later):

from fastmcp import FastMCP
from typing import Annotated
from uuid import uuid4

mcp = FastMCP("Notes")

notes: dict[str, dict] = {}

@mcp.tool
def add_note(
    title: Annotated[str, "Title of the note"],
    content: Annotated[str, "Note body text"],
) -> dict:
    """Create a new note and return it."""
    note_id = uuid4().hex[:8]
    notes[note_id] = {"id": note_id, "title": title, "content": content}
    return notes[note_id]

That @mcp.tool decorator does a lot of heavy lifting. FastMCP:

Uses the function name (add_note) as the tool name
Pulls the description from the docstring
Generates the JSON schema from type hints automatically
Exposes the Annotated strings as parameter descriptions the AI sees
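
To make those four steps concrete, here is a rough standard-library sketch of how a decorator could derive a tool description from a function. This is not FastMCP’s implementation. The describe_tool helper and the JSON_TYPES mapping are hypothetical names invented for illustration, and the real schema FastMCP generates may contain more fields.

```python
import inspect
from typing import Annotated, get_args, get_origin, get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func):
    """Build a tool description dict from a function's signature and docstring."""
    hints = get_type_hints(func, include_extras=True)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        hint = hints.get(name, str)
        prop = {}
        if get_origin(hint) is Annotated:
            base, *extras = get_args(hint)
            # Treat the first string in the Annotated metadata as the description.
            text = next((e for e in extras if isinstance(e, str)), None)
            if text:
                prop["description"] = text
        else:
            base = hint
        prop["type"] = JSON_TYPES.get(base, "object")
        properties[name] = prop
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def add_note(
    title: Annotated[str, "Title of the note"],
    content: Annotated[str, "Note body text"],
) -> dict:
    """Create a new note and return it."""
    return {"title": title, "content": content}


tool = describe_tool(add_note)
print(tool["name"], tool["inputSchema"]["required"])
```

The takeaway is that everything the AI needs to call the tool is already in the function signature, so keeping type hints and docstrings accurate is what keeps the schema accurate.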

Now add a search tool:

@mcp.tool
def search_notes(
    query: Annotated[str, "Text to search for in titles and content"],
) -> list[dict]:
    """Search notes by title or content."""
    query_lower = query.lower()
    return [
        note for note in notes.values()
        if query_lower in note["title"].lower()
        or query_lower in note["content"].lower()
    ]

And a delete tool:

from fastmcp.exceptions import ToolError

@mcp.tool
def delete_note(
    note_id: Annotated[str, "ID of the note to delete"],
) -> str:
    """Delete a note by ID."""
    if note_id not in notes:
        raise ToolError(f"Note '{note_id}' not found.")
    del notes[note_id]
    return f"Deleted note '{note_id}'."

Run fastmcp dev server.py again. You’ll see all three tools in the inspector. Try adding a note, searching for it, and deleting it.

To test with Claude Desktop, add this to your claude_desktop_config.json, replacing the path with the absolute path to your project:

{
  "mcpServers": {
    "notes": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/notes-mcp"
    }
  }
}
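
One thing that can trip people up: Claude Desktop launches the command from its own environment, so a bare python may resolve to an interpreter that doesn’t have fastmcp installed. A small helper script can print a config entry with absolute paths instead. This is a sketch under that assumption; claude_server_entry is a hypothetical helper name, and you would run it from inside your activated virtual environment.

```python
import json
import sys
from pathlib import Path


def claude_server_entry(server_file: str) -> dict:
    """Return an mcpServers entry that uses absolute paths."""
    return {
        # sys.executable is the interpreter running this script, so run the
        # script from inside the activated virtual environment.
        "command": sys.executable,
        "args": [str(Path(server_file).resolve())],
    }


config = {"mcpServers": {"notes": claude_server_entry("server.py")}}
print(json.dumps(config, indent=2))
```

Paste the printed JSON into the config file and restart Claude Desktop.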

3. Adding Resources

Tools perform actions. Resources expose data for the AI to read — think of them as GET endpoints.

Let’s expose each note as a resource the AI can read directly:

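from fastmcp.exceptions import ResourceError

@mcp.resource("note://{note_id}")
def get_note(note_id: str) -> dict:
    """Read a specific note by ID."""
    if note_id not in notes:
        # ResourceError is the resource-side counterpart of ToolError.
        raise ResourceError(f"Note '{note_id}' not found.")
    return notes[note_id]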

The {note_id} in the URI makes this a resource template. When the AI requests note://abc123, FastMCP extracts abc123 and passes it to your function.

Add a resource that lists all notes (useful for the AI to discover what’s available):

@mcp.resource("note://all")
def list_notes() -> list[dict]:
    """List all notes with their IDs and titles."""
    return [
        {"id": note["id"], "title": note["title"]}
        for note in notes.values()
    ]

Resources vs tools — when to use which:

Resources are for reading data without side effects. The AI can fetch them to build context.
Tools are for actions that change state — creating, updating, deleting.

4. Adding Prompts

Prompts are reusable templates that guide how the AI interacts with your server. They’re useful for defining common workflows.

@mcp.prompt
def summarize_notes(
    style: Annotated[str, "Summary style: 'brief' or 'detailed'"] = "brief",
) -> str:
    """Create a prompt to summarize all stored notes."""
    note_list = "\n".join(
        f"- **{n['title']}**: {n['content']}" for n in notes.values()
    )
    if not note_list:
        return "There are no notes to summarize."

    if style == "detailed":
        return (
            f"Here are all my notes:\n\n{note_list}\n\n"
            "Please provide a detailed summary of each note, "
            "highlighting key themes and connections between them."
        )
    return (
        f"Here are all my notes:\n\n{note_list}\n\n"
        "Please provide a brief, one-paragraph summary."
    )

The @mcp.prompt decorator works like @mcp.tool — FastMCP infers the name, description, and argument schema from your function.

When to use prompts vs tools:

Prompts set up a conversation. They give the AI a template with instructions and context.
Tools do work. They execute code and return results.

Use prompts for workflows like “summarize my data” or “help me draft something based on these notes.” Use tools for CRUD operations.

5. Persistence with SQLite

Our in-memory dictionary loses everything when the server restarts. Let’s add SQLite.

Install aiosqlite for async database operations:

pip install aiosqlite

The key pattern is using FastMCP’s dependency injection with an async context manager. This gives us proper startup/shutdown lifecycle management:

import aiosqlite
from contextlib import asynccontextmanager
from fastmcp import FastMCP
from fastmcp.dependencies import Depends
from typing import Annotated
from uuid import uuid4
from fastmcp.exceptions import ToolError

DB_PATH = "notes.db"

@asynccontextmanager
async def get_db():
    """Provide a database connection with auto-cleanup."""
    db = await aiosqlite.connect(DB_PATH)
    db.row_factory = aiosqlite.Row
    try:
        yield db
    finally:
        await db.close()

mcp = FastMCP("Notes")

@mcp.tool
async def add_note(
    title: Annotated[str, "Title of the note"],
    content: Annotated[str, "Note body text"],
    db=Depends(get_db),
) -> dict:
    """Create a new note and return it."""
    note_id = uuid4().hex[:8]
    await db.execute(
        "INSERT INTO notes (id, title, content) VALUES (?, ?, ?)",
        (note_id, title, content),
    )
    await db.commit()
    return {"id": note_id, "title": title, "content": content}

@mcp.tool
async def search_notes(
    query: Annotated[str, "Text to search for in titles and content"],
    db=Depends(get_db),
) -> list[dict]:
    """Search notes by title or content."""
    cursor = await db.execute(
        "SELECT id, title, content FROM notes "
        "WHERE title LIKE ? OR content LIKE ?",
        (f"%{query}%", f"%{query}%"),
    )
    rows = await cursor.fetchall()
    return [dict(row) for row in rows]

@mcp.tool
async def delete_note(
    note_id: Annotated[str, "ID of the note to delete"],
    db=Depends(get_db),
) -> str:
    """Delete a note by ID."""
    cursor = await db.execute(
        "DELETE FROM notes WHERE id = ?", (note_id,)
    )
    await db.commit()
    if cursor.rowcount == 0:
        raise ToolError(f"Note '{note_id}' not found.")
    return f"Deleted note '{note_id}'."

The Depends(get_db) parameter is hidden from the AI — it never appears in the tool’s schema. FastMCP injects the database connection automatically and handles cleanup when the tool finishes.
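
To see why the async context manager matters, here is a standard-library sketch of the lifecycle: open the resource, run the function, and close the resource even if the function fails. The call_with_dependency helper and the fake connection are hypothetical stand-ins, not FastMCP internals.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def get_resource():
    """Stand-in for get_db: set up, hand over the resource, then clean up."""
    events.append("open")
    try:
        yield {"name": "fake-connection"}
    finally:
        events.append("close")


async def call_with_dependency(func, dependency, **kwargs):
    """Hypothetical helper: enter the dependency, call func, always exit."""
    async with dependency() as resource:
        return await func(resource=resource, **kwargs)


async def add_note(title: str, resource=None) -> str:
    events.append(f"tool:{title}")
    return f"saved {title} via {resource['name']}"


result = asyncio.run(
    call_with_dependency(add_note, get_resource, title="Groceries")
)
print(result)  # saved Groceries via fake-connection
print(events)  # ['open', 'tool:Groceries', 'close']
```

The caller only supplies title. The resource argument is filled in by the wrapper, which is the same reason the db parameter never shows up in the tool’s schema.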

You’ll also need to create the table on first run. Add an initialization function:

async def init_db():
    async with aiosqlite.connect(DB_PATH) as db:
        await db.execute(
            """
            CREATE TABLE IF NOT EXISTS notes (
                id TEXT PRIMARY KEY,
                title TEXT NOT NULL,
                content TEXT NOT NULL
            )
            """
        )
        await db.commit()

Call it before running the server:

import asyncio

if __name__ == "__main__":
    asyncio.run(init_db())
    mcp.run()

Now update the resources and prompt from sections 3-4 to use the database too:

from fastmcp.exceptions import ResourceError

@mcp.resource("note://{note_id}")
async def get_note(note_id: str, db=Depends(get_db)) -> dict:
    """Read a specific note by ID."""
    cursor = await db.execute(
        "SELECT id, title, content FROM notes WHERE id = ?", (note_id,)
    )
    row = await cursor.fetchone()
    if row is None:
        # ResourceError is the resource-side counterpart of ToolError.
        raise ResourceError(f"Note '{note_id}' not found.")
    return dict(row)

@mcp.resource("note://all")
async def list_notes(db=Depends(get_db)) -> list[dict]:
    """List all notes with their IDs and titles."""
    cursor = await db.execute("SELECT id, title FROM notes")
    rows = await cursor.fetchall()
    return [dict(row) for row in rows]

@mcp.prompt
async def summarize_notes(
    style: Annotated[str, "Summary style: 'brief' or 'detailed'"] = "brief",
    db=Depends(get_db),
) -> str:
    """Create a prompt to summarize all stored notes."""
    cursor = await db.execute("SELECT title, content FROM notes")
    rows = await cursor.fetchall()

    if not rows:
        return "There are no notes to summarize."

    note_list = "\n".join(
        f"- **{row['title']}**: {row['content']}" for row in rows
    )

    if style == "detailed":
        return (
            f"Here are all my notes:\n\n{note_list}\n\n"
            "Please provide a detailed summary of each note, "
            "highlighting key themes and connections between them."
        )
    return (
        f"Here are all my notes:\n\n{note_list}\n\n"
        "Please provide a brief, one-paragraph summary."
    )

Same pattern — Depends(get_db) is injected automatically and hidden from the AI’s schema.
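
One detail worth knowing about the search_notes query: % and _ are wildcards in SQL LIKE, so a search for "0%" matches any note containing a zero. If you want literal matching, you can escape those characters and add an ESCAPE clause. The sketch below uses the synchronous sqlite3 module and an in-memory database so it runs on its own; like_pattern is a hypothetical helper, and the same SQL should work unchanged with aiosqlite.

```python
import sqlite3


def like_pattern(query: str) -> str:
    """Wrap query in % wildcards, escaping any LIKE wildcards it contains."""
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id TEXT PRIMARY KEY, title TEXT NOT NULL, content TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO notes (id, title, content) VALUES (?, ?, ?)",
    [("1", "Discount", "50% off"), ("2", "Budget", "500 units")],
)

# Naive pattern: the user's "%" acts as a wildcard, so "0%" matches any "0".
naive = "%" + "0%" + "%"
naive_ids = [row[0] for row in conn.execute(
    "SELECT id FROM notes WHERE title LIKE ? OR content LIKE ? ORDER BY id",
    (naive, naive),
)]

# Escaped pattern: the ESCAPE clause makes a backslash-percent a literal percent.
literal = like_pattern("0%")
literal_ids = [row[0] for row in conn.execute(
    "SELECT id FROM notes "
    "WHERE title LIKE ? ESCAPE '\\' OR content LIKE ? ESCAPE '\\' ORDER BY id",
    (literal, literal),
)]

print(naive_ids)    # ['1', '2']
print(literal_ids)  # ['1']
```

Whether you want wildcard or literal behavior is a design choice. For a notes app where the AI writes the query, literal matching is usually the less surprising default.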

6. Error Handling

MCP has three layers where things can go wrong:

Transport errors — connection drops, malformed messages. FastMCP handles these for you.
Protocol errors — invalid method calls, unknown tools. Also handled by FastMCP.
Application errors — your code. This is where you need to be deliberate.

The key rule: return errors through tool results, don’t let exceptions crash the server.

Use ToolError for expected failures:

from fastmcp.exceptions import ToolError

@mcp.tool
async def get_note_by_id(
    note_id: Annotated[str, "ID of the note"],
    db=Depends(get_db),
) -> dict:
    """Retrieve a single note."""
    cursor = await db.execute(
        "SELECT id, title, content FROM notes WHERE id = ?",
        (note_id,),
    )
    row = await cursor.fetchone()
    if row is None:
        raise ToolError(f"Note '{note_id}' not found.")
    return dict(row)

ToolError messages are sent to the AI as error responses, so the AI sees them and can react (e.g., “That note doesn’t exist, would you like to create one?”). Other exceptions are caught by FastMCP, logged, and also reported to the client, which can leak internal details. To hide those details, enable masking:

mcp = FastMCP("Notes", mask_error_details=True)

With masking enabled, only ToolError messages reach the client. Everything else becomes a generic “Internal error” — important if your server handles sensitive data.
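
As a mental model, the rule described above can be written as a small function. This is a sketch of the behavior, not FastMCP’s code: the ToolError class here is a local stand-in so the example runs without FastMCP installed, to_client_message is a hypothetical name, and the exact wording of the generic message may differ in the real library.

```python
class ToolError(Exception):
    """Stand-in for fastmcp.exceptions.ToolError so the sketch runs on its own."""


def to_client_message(exc: Exception, mask_error_details: bool) -> str:
    """Decide what text the client sees for a failed tool call."""
    if isinstance(exc, ToolError):
        # Deliberate, user-facing errors always pass through.
        return str(exc)
    if mask_error_details:
        # Unexpected errors are replaced with a generic message.
        return "Internal error"
    # Without masking, the raw exception text may leak internals.
    return f"{type(exc).__name__}: {exc}"


print(to_client_message(ToolError("Note 'abc' not found."), True))
print(to_client_message(ValueError("cannot open /srv/notes.db"), True))
print(to_client_message(ValueError("cannot open /srv/notes.db"), False))
```

In practice this means you should raise ToolError for anything the AI can act on, and let everything else fall through to the generic message.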

The stdout gotcha: If you’re running over STDIO transport (the default for Claude Desktop), print() statements corrupt the MCP message stream. Log through FastMCP’s Context object (ctx.info()) or write to stderr:

import sys
print("debug info", file=sys.stderr)  # safe
print("debug info")                   # breaks STDIO transport
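
If you’d rather not sprinkle file=sys.stderr everywhere, Python’s standard logging module can send everything to stderr in one place. A minimal sketch follows; make_logger and the "notes" logger name are assumptions for illustration.

```python
import logging
import sys


def make_logger(name: str = "notes") -> logging.Logger:
    """Return a logger that writes to stderr and never touches stdout."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    # Don't pass records to the root logger, which might be configured
    # elsewhere to write to stdout.
    logger.propagate = False
    return logger


log = make_logger()
log.debug("server starting")
```

Call log.debug() or log.info() wherever you would have reached for print().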

7. Testing Your Server

FastMCP’s Client can connect directly to your server in-process — no subprocess, no network. This makes tests fast and reliable.

Install test dependencies:

pip install pytest pytest-asyncio

Configure pytest in pyproject.toml:

[tool.pytest.ini_options]
asyncio_mode = "auto"

Write your tests in test_server.py:

# test_server.py
import pytest
from fastmcp.client import Client
from fastmcp.exceptions import ToolError
from server import mcp, init_db

@pytest.fixture(autouse=True)
async def setup_db(tmp_path, monkeypatch):
    """Use a temporary database for each test."""
    db_path = str(tmp_path / "test_notes.db")
    monkeypatch.setattr("server.DB_PATH", db_path)
    await init_db()

@pytest.fixture
async def client():
    async with Client(transport=mcp) as c:
        yield c

async def test_add_and_search(client):
    """Test creating a note and finding it via search."""
    result = await client.call_tool(
        "add_note",
        {"title": "Meeting Notes", "content": "Discuss Q1 roadmap"},
    )
    assert "Meeting Notes" in str(result)

    results = await client.call_tool(
        "search_notes",
        {"query": "roadmap"},
    )
    assert "Q1 roadmap" in str(results)

async def test_delete_nonexistent(client):
    """Deleting a missing note should return an error."""
    with pytest.raises(ToolError, match="not found"):
        await client.call_tool(
            "delete_note",
            {"note_id": "doesnotexist"},
        )

async def test_list_tools(client):
    """Verify all expected tools are registered."""
    tools = await client.list_tools()
    tool_names = {t.name for t in tools}
    assert "add_note" in tool_names
    assert "search_notes" in tool_names
    assert "delete_note" in tool_names

Run them:

pytest test_server.py -v

The Client(transport=mcp) pattern connects directly to your FastMCP server object — no subprocess spawning, no port binding. Tests run in milliseconds.

8. Running in Production

Local: STDIO (Claude Desktop, Cursor)

STDIO is the default transport. The AI client spawns your server as a subprocess and communicates over stdin/stdout. This is what Claude Desktop and most IDE integrations use.

{
  "mcpServers": {
    "notes": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/notes-mcp"
    }
  }
}

No configuration needed in your code — mcp.run() defaults to STDIO.

Remote: Streamable HTTP

For shared or remote deployments, use Streamable HTTP, the current standard HTTP transport (the older HTTP+SSE transport was deprecated in the March 2025 spec revision):

if __name__ == "__main__":
    asyncio.run(init_db())
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)

Docker

A minimal Dockerfile:

FROM python:3.12-slim

WORKDIR /app
# Install dependencies first so this layer is cached across code changes
RUN pip install --no-cache-dir fastmcp aiosqlite
COPY server.py .

EXPOSE 8000
CMD ["python", "server.py"]

For production, you’ll also want OAuth 2.1 authentication (now the standard for HTTP transports) and proper logging. Those topics each deserve their own posts.

9. Seeing It in Action

You’ve built the server. Now what does it look like when an AI actually uses it?

Here’s a real conversation with Claude Desktop after connecting our notes server. I’ll annotate what’s happening behind the scenes.

You: “Save a note about our API migration deadline — March 15th, need to deprecate v1 endpoints and notify partners.”

Claude: “I’ve created the note.”

→ Tool call: add_note
  title: "API Migration Deadline"
  content: "March 15th — deprecate v1 endpoints and notify partners"
← Result: {"id": "a3f7b2c1", "title": "API Migration Deadline", "content": "..."}

The AI chose to call add_note based on your intent. It extracted a clean title and content from your natural language — you didn’t need to specify parameters.

You: “What notes do I have about APIs?”

Claude: “I found one note about APIs — your API Migration Deadline note about deprecating v1 endpoints by March 15th.”

→ Tool call: search_notes
  query: "API"
← Result: [{"id": "a3f7b2c1", "title": "API Migration Deadline", "content": "..."}]

The AI mapped “notes about APIs” to a search_notes call with the query “API.” It then summarized the results in natural language instead of dumping raw JSON.

You: “Summarize all my notes.”

Claude: “You have one note focused on an upcoming deadline: the v1 API endpoints need to be deprecated by March 15th, with partner notifications required.”

→ Prompt: summarize_notes
  style: "brief"
← Template with all notes injected, asking for a one-paragraph summary

This time the AI used the summarize_notes prompt. The prompt fetched all notes from SQLite, assembled them into a template, and handed the AI a pre-structured request. The AI followed the template’s instruction for a brief summary.

You: “Delete the migration note, we’ve pushed the deadline.”

Claude: “Done — I’ve deleted the API Migration Deadline note.”

→ Tool call: delete_note
  note_id: "a3f7b2c1"
← Result: "Deleted note 'a3f7b2c1'."

The AI remembered the note ID from earlier in the conversation and used it directly. No extra lookup needed.

The pattern here: you never told the AI which tool to call. You spoke naturally, and the AI decided which MCP tool fit your intent. That’s the whole point of MCP — your server exposes capabilities, and the AI figures out when and how to use them.
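
Under the hood, each “Tool call” annotation above is a JSON-RPC 2.0 request with the method tools/call. Here is a sketch of what the client sends for the first exchange. The request id is arbitrary, tools_call_request is a hypothetical helper, and FastMCP builds and parses these messages for you.

```python
import json


def tools_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


request = tools_call_request(
    1,
    "add_note",
    {
        "title": "API Migration Deadline",
        "content": "March 15th: deprecate v1 endpoints and notify partners",
    },
)
print(request)
```

Seeing the raw message makes it clear why print() on stdout is a problem over STDIO: the client expects every line on that stream to be a message like this one.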

What We Built

A fully functional MCP server with:

Three tools — add, search, and delete notes
Resources — expose notes as readable data
A prompt template — for summarizing notes
SQLite persistence — data survives restarts
Error handling — proper error responses, not crashes
Tests — fast, in-process testing with FastMCP’s client
Two deployment options — local STDIO and remote HTTP

This is the same pattern behind real MCP servers. My pdf-mcp project uses the same FastMCP foundation with SQLite caching and tool decomposition — just applied to a different problem (letting Claude read large PDFs page by page instead of choking on the whole file).

Next steps:

Add more sophisticated search (full-text search with SQLite FTS5)
Add authentication for the HTTP transport

The full source code for this tutorial fits in a single file. MCP servers don’t have to be complex; they just have to be useful. Once you start connecting agents to real APIs, the design decisions get harder. I documented the key pitfalls in "I Gave My AI Agent Full API Access — It Was a Mistake."