pdf-mcp: How to Handle Large PDFs in Claude Code with MCP


Claude has a strict PDF limit: 30MB for file uploads, 100 pages for visual analysis. To handle large PDFs, you need a smarter tool.

This guide shows how pdf-mcp solves it.

pdf-mcp is an open-source MCP server with seven tools that let Claude Code read large PDFs incrementally, solving the “PDF too large to process” error. Here’s how it works and how to set it up.

Claude Code can write complex code and debug distributed systems.

But give it a 30MB PDF and it fails with:

“PDF too large to process.”

I hit this while trying to analyze a technical standard with Claude Code. The document wasn’t unusual. About 150 pages, 30MB. But Claude Code simply refused to read it.

If you work with specs, standards, contracts, or long technical PDFs, you’ve probably seen this too. It’s not just Claude. Most AI coding assistants struggle with large documents.

So I did what any developer does next: I went looking for an existing solution.

TL;DR: Claude Code fails on large PDFs because it loads the entire document into context at once. pdf-mcp is an open-source MCP server that fixes this with seven tools for incremental reading: inspect structure, search by keyword, read specific pages, and cache results across sessions. It handles local files and URLs without hitting context limits.


The Search for Existing Solutions

I found a few MCP servers claiming to handle PDFs.

  • Some were abandoned. Installation failed, dependencies were broken, issues sat unanswered for months.
  • Some worked, but poorly. They dumped the entire PDF into context in one shot. You still hit limits, just with extra steps. None of them cached results, so every new conversation re-extracted everything from scratch.
  • RAG felt like overkill. Vector databases, embeddings, and chunking strategies just to read a PDF? The complexity didn’t match the problem.

I’d already built a few MCP servers before: redmine-mcp-server for project management integration and qt4-doc-mcp-server for Qt documentation. I knew the protocol, and I knew what good tool design looked like.

That frustration, combined with that experience, became pdf-mcp.


Why Claude Code Says “PDF Too Large to Process”

Claude Code’s PDF handling has real constraints:

  • Practical limits on how much text it can load from a file at once
  • Practical issues with documents over ~100 pages
  • “PDF too large” errors on files as small as 5MB

It’s not just file size. It’s how much text gets extracted.

Typical problem documents:

  • Industry standards (ISO, IEEE): hundreds of pages, often 40–80MB
  • SDK and API docs: dense text, long appendices
  • Financial and research reports with tables and figures

When you hit the limit, it’s not a polite failure. The server can get blocked, forcing you to start a fresh conversation, all context gone.

Even when PDFs do load, dumping 100 pages into the context window is wasteful. You burn tokens on content you don’t need, leaving less room for actual reasoning.


The Lightbulb Moment

The architecture clicked immediately:

Instead of loading the entire PDF into context, what if Claude Code could access only the parts it actually needs?

That’s how humans read long documents. We:

  1. Check the table of contents
  2. Search for relevant sections
  3. Read specific pages
  4. Jump around as needed

AI shouldn’t be any different.


Designing for How AI Actually Works

The key insight was this: the problem isn’t PDF extraction. The problem is how the AI interacts with the document.

Building pdf-mcp meant designing for interaction patterns, not raw extraction.

Seven Tools, Not One

The existing MCP tools I found all made the same mistake: a single monolithic function that dumps everything at once.

Instead of exposing one large “read_pdf” function, I broke the interface into seven small tools that mirror how humans navigate documents:

  • pdf_info: Inspect metadata and page count
  • pdf_get_toc: Extract the table of contents
  • pdf_search: Locate relevant pages by keyword
  • pdf_read_pages: Read specific page ranges (images and tables included)
  • pdf_read_all: Read the full document (with safety limits)
  • pdf_cache_stats: Inspect cache performance
  • pdf_cache_clear: Clear the cache
This turns PDF reading from a single high-risk operation into a sequence of small, safe, reversible steps. The same principle applies to any MCP tool surface: as I learned when giving an AI agent full API access went wrong, a few focused tools always outperform a kitchen-sink approach. For patterns on making each step fail gracefully, see AI Agent Error Handling Patterns (circuit breakers, validation gates, structured fallbacks).
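The decomposition is easier to see in code. Below is a pure-Python sketch of the idea, with a plain dict of page texts standing in for a PyMuPDF-extracted document; the function names echo the real tools, but the signatures are illustrative, not pdf-mcp’s actual API:

```python
# Hypothetical tool surface: each function does one small, cheap thing,
# and only pdf_read_pages ever returns bulk text.

def pdf_info(pages: dict[int, str]) -> dict:
    """Metadata probe: answers "how big is this?" without reading it."""
    return {"page_count": len(pages),
            "chars": sum(len(t) for t in pages.values())}

def pdf_search(pages: dict[int, str], query: str) -> list[int]:
    """Locate candidate pages; returns page numbers, not page content."""
    q = query.lower()
    return [n for n, text in sorted(pages.items()) if q in text.lower()]

def pdf_read_pages(pages: dict[int, str], start: int, end: int) -> str:
    """Read only the requested range -- the sole bulk-text operation."""
    return "\n".join(pages[n] for n in range(start, end + 1) if n in pages)

doc = {1: "Overview", 2: "Revenue by region grew 12%", 3: "Appendix"}
hits = pdf_search(doc, "revenue")          # narrow first...
excerpt = pdf_read_pages(doc, hits[0], hits[0])  # ...then read narrowly
```

The point of the split is that the expensive operation (bulk text) is always the last step, taken only after the cheap probes have narrowed the target.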

A typical workflow looks like this:

  1. pdf_info → “This is a 150-page document”
  2. pdf_get_toc → “Section 4 covers revenue”
  3. pdf_search("revenue by region") → “Pages 45–52”
  4. pdf_read_pages(45, 52)

Instead of flooding the context with 150 pages, Claude Code works with just 8. That leaves room to actually reason.


Solving the Caching Problem

MCP servers using STDIO transport spawn a new process per conversation. No persistent state.

Without caching, every new conversation means re-extracting the entire PDF from scratch: slow and wasteful.

pdf-mcp uses SQLite caching:

  • Extracted text, metadata, and images persist to ~/.cache/pdf-mcp/cache.db
  • Cache invalidation via file modification time
  • 24-hour TTL (configurable with PDF_MCP_CACHE_TTL)

The first conversation performs extraction. Every conversation after that reads from cache and returns instantly.
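A minimal sketch of that caching scheme, assuming a simple (path, mtime, page) key and a created-at timestamp for TTL expiry; pdf-mcp’s real schema may differ:

```python
import os
import sqlite3
import time

# TTL in seconds; mirrors the PDF_MCP_CACHE_TTL override described above.
TTL = int(os.environ.get("PDF_MCP_CACHE_TTL", 24 * 3600))

def open_cache(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " path TEXT, mtime REAL, page INTEGER, text TEXT, created REAL,"
        " PRIMARY KEY (path, mtime, page))"
    )
    return conn

def get_cached(conn, path: str, page: int, mtime: float):
    row = conn.execute(
        "SELECT text, created FROM pages WHERE path=? AND mtime=? AND page=?",
        (path, mtime, page),
    ).fetchone()
    if row is None or time.time() - row[1] > TTL:
        return None  # miss: file changed, never seen, or entry expired
    return row[0]

def put_cached(conn, path: str, page: int, mtime: float, text: str):
    conn.execute("INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?, ?)",
                 (path, page and path and mtime, page, text, time.time())
                 if False else (path, mtime, page, text, time.time()))
```

Keying on the file’s mtime means a changed file simply misses the cache and gets re-extracted; no explicit invalidation logic is needed.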


Token Estimation (No More Guessing)

Context overflow is the silent killer of AI workflows.

pdf-mcp estimates token usage for extracted text. Before Claude Code requests 50 pages, it can check whether they’ll fit and narrow the request before breaking the session.

No more surprise truncations. No more dead conversations.
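A sketch of the idea, assuming the common rough heuristic of ~4 characters per token (pdf-mcp’s exact estimator may differ):

```python
# Assumption: ~4 characters per token, a common rough heuristic for
# English text. Good enough to catch "this won't fit" before it happens.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_in_budget(page_texts: list[str], budget: int) -> tuple[bool, int]:
    """Check a proposed page range against a token budget before reading it."""
    total = sum(estimate_tokens(t) for t in page_texts)
    return total <= budget, total

# Ten pages of ~400 characters each is ~1000 estimated tokens.
ok, total = fits_in_budget(["x" * 400] * 10, budget=500)
```

If the check fails, the client can narrow the page range or raise the budget, instead of discovering the overflow mid-conversation.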


Local Files and URLs

PDFs don’t always live on disk.

Research papers, shared links, cloud-hosted docs: pdf-mcp fetches HTTP/HTTPS PDFs, caches them locally, and processes them exactly like local files.

One interface. Same behavior.
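One way to get that behavior is to hash the URL into a stable cache filename, download on first use, and hand back a local path that the rest of the pipeline treats like any other file. A sketch under those assumptions (the cache layout is illustrative, not pdf-mcp’s actual one):

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical download-cache location, alongside the extraction cache.
CACHE_DIR = Path.home() / ".cache" / "pdf-mcp" / "downloads"

def cache_path_for(url: str) -> Path:
    """Stable local filename for a URL: same URL -> same path, every run."""
    digest = hashlib.sha256(url.encode()).hexdigest()[:16]
    return CACHE_DIR / f"{digest}.pdf"

def fetch_pdf(url: str) -> Path:
    path = cache_path_for(url)
    if not path.exists():  # download only on first use
        path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, path)
    return path
```

Because the returned value is just a local path, every downstream tool (info, search, read) works identically on remote and local documents.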


The Build

Two libraries made this straightforward:

  • FastMCP removed protocol plumbing, letting me focus on tool ergonomics
  • PyMuPDF (fitz) handled fast, reliable text and image extraction

The hardest part was testing. PDFs are chaotic: scans, broken encodings, corrupted files, password protection. I built a test corpus of pathological PDFs and made sure failures were graceful, not catastrophic. For a broader framework on testing non-deterministic AI systems, see Testing AI Agents in Production.
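One cheap layer of that graceful-failure handling can run before the PDF library is even invoked: structural sanity checks on the raw bytes, so broken inputs fail with a clear label instead of a crash deep inside the extractor. This is an illustrative sketch, not pdf-mcp’s actual validator, and the /Encrypt check is only a heuristic:

```python
# Classify obviously-bad inputs up front; anything "ok" still goes
# through the real extractor inside a try/except.

def classify_pdf_bytes(data: bytes) -> str:
    if not data.startswith(b"%PDF-"):
        return "not-a-pdf"           # wrong magic header
    if b"%%EOF" not in data[-1024:]:
        return "truncated"           # trailer missing: likely cut off
    if b"/Encrypt" in data:
        return "password-protected"  # heuristic: encryption dict present
    return "ok"
```

Turning each failure mode into a distinct label also makes the test corpus straightforward: one pathological file per label, and an assertion that the classifier names it correctly.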

pdf-mcp is open source on GitHub and published on PyPI. Releases are automated via GitHub Actions: tag it, test it, ship it. After shipping, I ran a full security audit and found eight vulnerabilities, including SSRF, prompt injection, and path traversal.


Try It Yourself

pdf-mcp is available on PyPI. Install and add to your Claude Code setup:

pip install pdf-mcp
claude mcp add pdf-mcp -- pdf-mcp

Or for Claude Desktop:

pip install pdf-mcp

Then add to your claude_desktop_config.json:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "pdf-mcp": {
      "command": "pdf-mcp"
    }
  }
}

Restart Claude Desktop after saving the config.

Then ask Claude to analyze any PDF, local or remote, 5 pages or 500.

Source: https://github.com/jztan/pdf-mcp


Since Launch

Since launching in January 2026, pdf-mcp has reached 4,800+ PyPI downloads across seven releases.

The tool decomposition pattern held up — and the server has evolved significantly. A few notable changes since this post was first published:

Search got smarter. pdf_search now uses a SQLite FTS5 index with BM25 relevance ranking and Porter stemming. Subsequent searches are O(log N) instead of a linear page scan. Results are ordered by relevance, not page number.
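For context, here is a minimal sketch of that approach, assuming SQLite’s FTS5 extension is available (it ships with most Python builds): a Porter-stemmed index queried with the bm25() auxiliary function, where lower scores mean higher relevance. The schema is illustrative, not pdf-mcp’s real one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Porter stemming at index time means "revenue" also matches "revenues".
conn.execute("CREATE VIRTUAL TABLE page_index USING fts5(text, tokenize='porter')")

pages = ["Introduction and scope",
         "Revenue grew across regions; revenues are detailed below",
         "Appendix: revenue table footnotes"]
conn.executemany("INSERT INTO page_index (rowid, text) VALUES (?, ?)",
                 list(enumerate(pages, start=1)))

def search(query: str) -> list[int]:
    # bm25() is lower-is-better, so ascending order = most relevant first.
    rows = conn.execute(
        "SELECT rowid FROM page_index WHERE page_index MATCH ? "
        "ORDER BY bm25(page_index)", (query,)).fetchall()
    return [r[0] for r in rows]
```

The rowid doubles as the page number, so results come back as relevance-ordered page numbers, ready to feed into a page-range read.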

Tables are now built in. pdf_read_pages automatically returns extracted table data alongside text and images — no extra tool call needed.

pdf_extract_images was removed in v1.4.0. Images are now returned per-page in pdf_read_pages responses. If you configured pdf-mcp before March 2026, update your workflows accordingly.

The core design principle hasn’t changed: inspect, search, read only what you need. The tools have just gotten faster and more capable.


If you want to build your own MCP server, see How to Build an MCP Server in Python (Step-by-Step). For the conceptual overview of how MCP works, start with What Is MCP?. If you’re choosing between MCP and native function calling, see MCP vs Function Calling. And once your server is live, Monitoring AI Agents in Production covers the four observability layers that catch failures before users do.


AI tools fail in surprisingly ordinary places. In this case, it wasn’t reasoning or coding. It was simply reading a document.

Sometimes the fix isn’t a bigger model or more context. It’s better tools.

Written by Kevin Tan

Cloud Solutions Architect and Engineering Leader based in Singapore. I write about AWS, distributed systems, and building reliable software at scale.
