pdf-mcp: How to Handle Large PDFs in Claude Code with MCP


Claude has a strict PDF limit: 30MB for file uploads, 100 pages for visual analysis. To handle large PDFs, you need a smarter tool.

This guide shows how pdf-mcp solves it.

pdf-mcp is an open-source MCP server with seven tools that let Claude Code read large PDFs incrementally, solving the “PDF too large to process” error. Here’s how it works and how to set it up.

Claude Code can write complex code and debug distributed systems.

But give it a 30MB PDF and it fails with:

“PDF too large to process.”

I hit this while trying to analyze a technical standard with Claude Code. The document wasn’t unusual. About 150 pages, 30MB. But Claude Code simply refused to read it.

If you work with specs, standards, contracts, or long technical PDFs, you’ve probably seen this too. It’s not just Claude. Most AI coding assistants struggle with large documents.

So I did what any developer does next: I went looking for an existing solution.

TL;DR: Claude Code fails on large PDFs because it loads the entire document into context at once. pdf-mcp is an open-source MCP server that fixes this with seven tools for incremental reading: inspect structure, search by keyword, read specific pages, and cache results across sessions. It handles local files and URLs without hitting context limits.


The Search for Existing Solutions

I found a few MCP servers claiming to handle PDFs.

  • Some were abandoned. Installation failed, dependencies were broken, issues sat unanswered for months.
  • Some worked, but poorly. They dumped the entire PDF into context in one shot. You still hit limits, just with extra steps. None of them cached results, so every new conversation re-extracted everything from scratch.
  • RAG felt like overkill. Vector databases, embeddings, and chunking strategies just to read a PDF? The complexity didn’t match the problem.

I’d already built a few MCP servers before: redmine-mcp-server for project management integration and qt4-doc-mcp-server for Qt documentation. I knew the protocol, and I knew what good tool design looked like.

That frustration, combined with that experience, became pdf-mcp.


Why Claude Code Says “PDF Too Large to Process”

Claude Code’s PDF handling has real constraints:

  • Practical limits on how much text it can load from a file at once
  • Practical issues with documents over ~100 pages
  • “PDF too large” errors on files as small as 5MB

It’s not just file size. It’s how much text gets extracted.

Typical problem documents:

  • Industry standards (ISO, IEEE): hundreds of pages, often 40–80MB
  • SDK and API docs: dense text, long appendices
  • Financial and research reports with tables and figures

When you hit the limit, it’s not a polite failure. The server can get blocked, forcing you to start a fresh conversation, all context gone.

Even when PDFs do load, dumping 100 pages into the context window is wasteful. You burn tokens on content you don’t need, leaving less room for actual reasoning.


The Lightbulb Moment

The architecture clicked immediately:

Instead of loading the entire PDF into context, what if Claude Code could access only the parts it actually needs?

That’s how humans read long documents. We:

  1. Check the table of contents
  2. Search for relevant sections
  3. Read specific pages
  4. Jump around as needed

AI shouldn’t be any different.


Designing for How AI Actually Works

The key insight was this: the problem isn’t PDF extraction. The problem is how the AI interacts with the document.

Building pdf-mcp meant designing for interaction patterns, not raw extraction.

Seven Tools, Not One

The existing MCP tools I found all made the same mistake: a single monolithic function that dumps everything at once.

Instead of exposing one large “read_pdf” function, I broke the interface into seven small tools that mirror how humans navigate documents:

  • pdf_info: Inspect metadata and page count
  • pdf_get_toc: Extract the table of contents
  • pdf_search: Locate relevant pages by keyword
  • pdf_read_pages: Read specific page ranges (images and tables included)
  • pdf_read_all: Read the full document (with safety limits)
  • pdf_cache_stats: Inspect cache performance
  • pdf_cache_clear: Clear the cache
This turns PDF reading from a single high-risk operation into a sequence of small, safe, reversible steps. The same principle applies to any MCP tool surface: as I learned when giving an AI agent full API access went wrong, a few focused tools always outperform a kitchen-sink approach. For patterns on making each step fail gracefully, see AI Agent Error Handling Patterns (circuit breakers, validation gates, structured fallbacks).
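The decomposition is easier to see in code. Below is a pure-Python sketch of the idea, with a plain dict of page texts standing in for a PyMuPDF-extracted document; the function names echo the real tools, but the signatures are illustrative, not pdf-mcp’s actual API:

```python
# Hypothetical tool surface: each function does one small, cheap thing,
# and only pdf_read_pages ever returns bulk text.

def pdf_info(pages: dict[int, str]) -> dict:
    """Metadata probe: answers "how big is this?" without reading it."""
    return {"page_count": len(pages),
            "chars": sum(len(t) for t in pages.values())}

def pdf_search(pages: dict[int, str], query: str) -> list[int]:
    """Locate candidate pages; returns page numbers, not page content."""
    q = query.lower()
    return [n for n, text in sorted(pages.items()) if q in text.lower()]

def pdf_read_pages(pages: dict[int, str], start: int, end: int) -> str:
    """Read only the requested range -- the sole bulk-text operation."""
    return "\n".join(pages[n] for n in range(start, end + 1) if n in pages)

doc = {1: "Overview", 2: "Revenue by region grew 12%", 3: "Appendix"}
hits = pdf_search(doc, "revenue")          # narrow first...
excerpt = pdf_read_pages(doc, hits[0], hits[0])  # ...then read narrowly
```

The point of the split is that the expensive operation (bulk text) is always the last step, taken only after the cheap probes have narrowed the target.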

A typical workflow looks like this:

  1. pdf_info → “This is a 150-page document”
  2. pdf_get_toc → “Section 4 covers revenue”
  3. pdf_search("revenue by region") → “Pages 45–52”
  4. pdf_read_pages(45, 52)

Instead of flooding the context with 150 pages, Claude Code works with just 8. That leaves room to actually reason.


Solving the Caching Problem

MCP servers using STDIO transport spawn a new process per conversation. No persistent state.

Without caching, every new conversation means re-extracting the entire PDF from scratch: slow and wasteful.

pdf-mcp uses SQLite caching:

  • Extracted text, metadata, and images persist to ~/.cache/pdf-mcp/cache.db
  • Cache invalidation via file modification time
  • 24-hour TTL (configurable with PDF_MCP_CACHE_TTL)

The first conversation performs extraction. Every conversation after that reads from cache and returns instantly.
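A minimal sketch of that caching scheme, assuming a simple (path, mtime, page) key and a created-at timestamp for TTL expiry; pdf-mcp’s real schema may differ:

```python
import os
import sqlite3
import time

# TTL in seconds; mirrors the PDF_MCP_CACHE_TTL override described above.
TTL = int(os.environ.get("PDF_MCP_CACHE_TTL", 24 * 3600))

def open_cache(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " path TEXT, mtime REAL, page INTEGER, text TEXT, created REAL,"
        " PRIMARY KEY (path, mtime, page))"
    )
    return conn

def get_cached(conn, path: str, page: int, mtime: float):
    row = conn.execute(
        "SELECT text, created FROM pages WHERE path=? AND mtime=? AND page=?",
        (path, mtime, page),
    ).fetchone()
    if row is None or time.time() - row[1] > TTL:
        return None  # miss: file changed, never seen, or entry expired
    return row[0]

def put_cached(conn, path: str, page: int, mtime: float, text: str):
    conn.execute("INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?, ?)",
                 (path, page and path and mtime, page, text, time.time())
                 if False else (path, mtime, page, text, time.time()))
```

Keying on the file’s mtime means a changed file simply misses the cache and gets re-extracted; no explicit invalidation logic is needed.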


Token Estimation (No More Guessing)

Context overflow is the silent killer of AI workflows.

pdf-mcp estimates token usage for extracted text. Before Claude Code requests 50 pages, it can check whether they’ll fit and narrow the request before breaking the session.

No more surprise truncations. No more dead conversations.
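A sketch of the idea, assuming the common rough heuristic of ~4 characters per token (pdf-mcp’s exact estimator may differ):

```python
# Assumption: ~4 characters per token, a common rough heuristic for
# English text. Good enough to catch "this won't fit" before it happens.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_in_budget(page_texts: list[str], budget: int) -> tuple[bool, int]:
    """Check a proposed page range against a token budget before reading it."""
    total = sum(estimate_tokens(t) for t in page_texts)
    return total <= budget, total

# Ten pages of ~400 characters each is ~1000 estimated tokens.
ok, total = fits_in_budget(["x" * 400] * 10, budget=500)
```

If the check fails, the client can narrow the page range or raise the budget, instead of discovering the overflow mid-conversation.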


Local Files and URLs

PDFs don’t always live on disk.

Research papers, shared links, cloud-hosted docs: pdf-mcp fetches HTTP/HTTPS PDFs, caches them locally, and processes them exactly like local files.

One interface. Same behavior.
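One way to get that behavior is to hash the URL into a stable cache filename, download on first use, and hand back a local path that the rest of the pipeline treats like any other file. A sketch under those assumptions (the cache layout is illustrative, not pdf-mcp’s actual one):

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical download-cache location, alongside the extraction cache.
CACHE_DIR = Path.home() / ".cache" / "pdf-mcp" / "downloads"

def cache_path_for(url: str) -> Path:
    """Stable local filename for a URL: same URL -> same path, every run."""
    digest = hashlib.sha256(url.encode()).hexdigest()[:16]
    return CACHE_DIR / f"{digest}.pdf"

def fetch_pdf(url: str) -> Path:
    path = cache_path_for(url)
    if not path.exists():  # download only on first use
        path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, path)
    return path
```

Because the returned value is just a local path, every downstream tool (info, search, read) works identically on remote and local documents.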


The Build

Two libraries made this straightforward:

  • FastMCP removed protocol plumbing, letting me focus on tool ergonomics
  • PyMuPDF (fitz) handled fast, reliable text and image extraction

The hardest part was testing. PDFs are chaotic: scans, broken encodings, corrupted files, password protection. I built a test corpus of pathological PDFs and made sure failures were graceful, not catastrophic. For a broader framework on testing non-deterministic AI systems, see Testing AI Agents in Production.
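One cheap layer of that graceful-failure handling can run before the PDF library is even invoked: structural sanity checks on the raw bytes, so broken inputs fail with a clear label instead of a crash deep inside the extractor. This is an illustrative sketch, not pdf-mcp’s actual validator, and the /Encrypt check is only a heuristic:

```python
# Classify obviously-bad inputs up front; anything "ok" still goes
# through the real extractor inside a try/except.

def classify_pdf_bytes(data: bytes) -> str:
    if not data.startswith(b"%PDF-"):
        return "not-a-pdf"           # wrong magic header
    if b"%%EOF" not in data[-1024:]:
        return "truncated"           # trailer missing: likely cut off
    if b"/Encrypt" in data:
        return "password-protected"  # heuristic: encryption dict present
    return "ok"
```

Turning each failure mode into a distinct label also makes the test corpus straightforward: one pathological file per label, and an assertion that the classifier names it correctly.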

pdf-mcp is open source on GitHub and published on PyPI. Releases are automated via GitHub Actions: tag it, test it, ship it. After shipping, I ran a full security audit and found eight vulnerabilities, including SSRF, prompt injection, and path traversal.


Try It Yourself

pdf-mcp is available on PyPI. Install and add to your Claude Code setup:

pip install pdf-mcp
claude mcp add pdf-mcp -- pdf-mcp

Or for Claude Desktop:

pip install pdf-mcp

Then add to your claude_desktop_config.json:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "pdf-mcp": {
      "command": "pdf-mcp"
    }
  }
}

Restart Claude Desktop after saving the config.

Then ask Claude to analyze any PDF, local or remote, 5 pages or 500.

Source: https://github.com/jztan/pdf-mcp


Since Launch

Since launching in January 2026, pdf-mcp has reached 4,800+ PyPI downloads across seven releases.

The tool decomposition pattern held up — and the server has evolved significantly. A few notable changes since this post was first published:

Search got smarter. pdf_search now uses a SQLite FTS5 index with BM25 relevance ranking and Porter stemming. Subsequent searches are O(log N) instead of a linear page scan. Results are ordered by relevance, not page number.
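For context, here is a minimal sketch of that approach, assuming SQLite’s FTS5 extension is available (it ships with most Python builds): a Porter-stemmed index queried with the bm25() auxiliary function, where lower scores mean higher relevance. The schema is illustrative, not pdf-mcp’s real one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Porter stemming at index time means "revenue" also matches "revenues".
conn.execute("CREATE VIRTUAL TABLE page_index USING fts5(text, tokenize='porter')")

pages = ["Introduction and scope",
         "Revenue grew across regions; revenues are detailed below",
         "Appendix: revenue table footnotes"]
conn.executemany("INSERT INTO page_index (rowid, text) VALUES (?, ?)",
                 list(enumerate(pages, start=1)))

def search(query: str) -> list[int]:
    # bm25() is lower-is-better, so ascending order = most relevant first.
    rows = conn.execute(
        "SELECT rowid FROM page_index WHERE page_index MATCH ? "
        "ORDER BY bm25(page_index)", (query,)).fetchall()
    return [r[0] for r in rows]
```

The rowid doubles as the page number, so results come back as relevance-ordered page numbers, ready to feed into a page-range read.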

Tables are now built in. pdf_read_pages automatically returns extracted table data alongside text and images — no extra tool call needed.

pdf_extract_images was removed in v1.4.0. Images are now returned per-page in pdf_read_pages responses. If you configured pdf-mcp before March 2026, update your workflows accordingly.

The core design principle hasn’t changed: inspect, search, read only what you need. The tools have just gotten faster and more capable.


If you want to build your own MCP server, see How to Build an MCP Server in Python (Step-by-Step). For the conceptual overview of how MCP works, start with What Is MCP?. If you’re choosing between MCP and native function calling, see MCP vs Function Calling. And once your server is live, Monitoring AI Agents in Production covers the four observability layers that catch failures before users do.


AI tools fail in surprisingly ordinary places. In this case, it wasn’t reasoning or coding. It was simply reading a document.

Sometimes the fix isn’t a bigger model or more context. It’s better tools.

Written by Kevin Tan

Cloud Solutions Architect and Engineering Leader based in Singapore. I write about AWS, distributed systems, and building reliable software at scale.
