
It started with a 30MB technical standard PDF.
I was working with Claude Code, trying to understand an industry specification document. Everything was going smoothly—until I hit that wall we’ve all seen:
“PDF too large to process.”
I tried again. Same error. Tried a different document. Same error.
The AI that could write complex code, debug intricate systems, and reason through multi-step problems couldn’t read a standard technical document.
If you work with specs, standards, contracts, or long technical PDFs, you’ve probably hit this too. It’s not just Claude—most LLMs and AI coding assistants struggle with large documents.
So I did what any developer does next: I went looking for an existing solution.
TL;DR
- Claude struggles with large PDFs due to token and page limits
- Existing MCP tools either break or waste context
- pdf-mcp lets Claude read PDFs the way humans do—incrementally and selectively
The Search for Existing Solutions
Surely someone had already solved this, right?
I checked the Claude Code GitHub issues—turns out this had been a pain point for a while. Others were hitting the same wall, asking for better PDF handling. No good solution had emerged.
I found a few MCP servers claiming to handle PDFs:
- Some weren’t maintained. Installation failed, dependencies were broken, issues sat unanswered for months.
- Some worked—but poorly. They dumped the entire PDF back to Claude in one shot. You still hit context limits, just with extra steps. Worse, none of them cached results, so every new conversation re-extracted everything from scratch.
- RAG felt like overkill. Vector databases, embeddings, chunking strategies—for simply reading a PDF? The complexity didn’t match the problem.
I’d already built a few MCP servers before—redmine-mcp-server for project management integration and qt4-doc-mcp-server for Qt documentation. I knew the protocol, and I knew what good tool design looked like.
That frustration, combined with that experience, became pdf-mcp.
Why Claude Struggles with Large PDFs
Claude’s PDF handling has real constraints:
- ~25,000 token limit per file read
- Practical issues with documents over ~100 pages
- “PDF too large” errors on files as small as 5MB
It’s not just file size—it’s how much text gets extracted.
Typical problem documents:
- Industry standards (ISO, IEEE): hundreds of pages, often 40–80MB
- SDK and API docs: dense text, long appendices
- Financial and research reports with tables and figures
When you hit the limit, it’s not a polite failure. The server can get blocked, forcing you to start a fresh conversation—all context gone.
Even when PDFs do load, dumping 100 pages into the context window is wasteful. You burn tokens on content you don’t need, leaving less room for actual reasoning.
The Lightbulb Moment
The architecture clicked immediately:
Instead of loading the entire PDF into context, what if Claude could access only the parts it actually needs?
That’s how humans read long documents. We:
- Check the table of contents
- Search for relevant sections
- Read specific pages
- Jump around as needed
AI shouldn’t be any different.
Designing for How AI Actually Works
Building pdf-mcp meant designing for interaction patterns, not raw extraction.
Eight Tools, Not One
The existing MCP tools I found all made the same mistake: a single monolithic function that dumps everything at once.
A monolithic tool forces the AI to decide everything upfront. Instead, I broke the interface into small, focused tools that mirror human document navigation:
| Tool | Purpose |
|---|---|
| `pdf_info` | Inspect metadata and page count |
| `pdf_get_toc` | Extract table of contents |
| `pdf_search` | Locate relevant pages |
| `pdf_read_pages` | Read specific page ranges |
| `pdf_read_all` | Full document read (with safety limits) |
| `pdf_extract_images` | Extract figures and diagrams |
| `pdf_cache_stats` | Inspect cache performance |
| `pdf_cache_clear` | Cache maintenance |
This turns PDF reading from a single high-risk operation into a sequence of small, safe, reversible steps. The same principle applies to any MCP tool surface — as I learned when giving an AI agent full API access went wrong, fewer focused tools always outperform a kitchen-sink approach.
A typical workflow looks like this:
1. `pdf_info` → “This is a 150-page document”
2. `pdf_get_toc` → “Section 4 covers revenue”
3. `pdf_search("revenue by region")` → “Pages 45–52”
4. `pdf_read_pages(45, 52)`
Instead of flooding the context with 150 pages, Claude works with 8—and has room left to think.
Solving the Caching Problem
With the STDIO transport, an MCP server is spawned as a fresh process for every conversation. No persistent state.
Without caching, every chat means re-extracting the entire PDF—slow and wasteful.
pdf-mcp uses SQLite caching:
- Extracted text, metadata, and images persist to `~/.cache/pdf-mcp/cache.db`
- Cache invalidation via file modification time
- 24-hour TTL (configurable with `PDF_MCP_CACHE_TTL`)
The first conversation does the extraction work. Every conversation after that is instant.
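The lookup logic is conceptually simple. Here is a minimal sketch of the idea; the table name, columns, and `get_cached_text` helper are illustrative stand-ins, not pdf-mcp's actual schema:

```python
import os
import sqlite3
import time

CACHE_TTL = int(os.environ.get("PDF_MCP_CACHE_TTL", 24 * 60 * 60))  # seconds

def get_cached_text(db: sqlite3.Connection, pdf_path: str) -> str | None:
    """Return cached extracted text, or None if absent, stale, or outdated."""
    row = db.execute(
        "SELECT text, mtime, created_at FROM documents WHERE path = ?",
        (pdf_path,),
    ).fetchone()
    if row is None:
        return None  # never extracted
    text, cached_mtime, created_at = row
    if os.path.getmtime(pdf_path) != cached_mtime:
        return None  # file changed since extraction: invalidate
    if time.time() - created_at > CACHE_TTL:
        return None  # entry older than the TTL: invalidate
    return text
```

A cache miss triggers extraction and an insert; a hit skips PDF parsing entirely, which is what makes the second and later conversations effectively free.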
Token Estimation (No More Guessing)
Context overflow is the silent killer of AI workflows.
pdf-mcp estimates token usage for extracted text. Before Claude requests 50 pages, it can check whether they’ll fit and narrow the request instead of breaking the session.
No more surprise truncations. No more dead conversations.
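The estimate doesn't need to be exact to be useful. A minimal sketch of the idea, assuming the common heuristic of roughly four characters per English token (both the heuristic and the 20,000-token budget are my illustrative numbers, not pdf-mcp's exact ones):

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: English text averages ~4 characters per token."""
    return len(text) // 4

def fits_context(text: str, budget_tokens: int = 20_000) -> bool:
    """Decide whether an extraction fits before handing it to the model."""
    return estimate_tokens(text) <= budget_tokens
```

If a requested range doesn't fit, the right answer is a smaller range, not a truncated blob.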
Local Files and URLs
PDFs don’t always live on disk.
Research papers, shared links, cloud-hosted docs—pdf-mcp fetches HTTP/HTTPS PDFs, caches them locally, and processes them exactly like local files.
One interface. Same behavior.
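Conceptually this is just a download-and-cache step in front of the same pipeline. A rough sketch; `fetch_pdf` and the downloads directory layout are hypothetical names of mine, not pdf-mcp's internals:

```python
import hashlib
import urllib.request
from pathlib import Path

DOWNLOAD_DIR = Path.home() / ".cache" / "pdf-mcp" / "downloads"

def fetch_pdf(source: str) -> Path:
    """Resolve `source` to a local path, downloading HTTP(S) URLs once."""
    if not source.startswith(("http://", "https://")):
        return Path(source)  # already a local file
    DOWNLOAD_DIR.mkdir(parents=True, exist_ok=True)
    # Derive a stable filename from the URL so repeat fetches hit the copy.
    name = hashlib.sha256(source.encode()).hexdigest() + ".pdf"
    local = DOWNLOAD_DIR / name
    if not local.exists():
        urllib.request.urlretrieve(source, local)
    return local
```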
The Build
Two libraries made this straightforward:
- FastMCP removed protocol plumbing, letting me focus on tool ergonomics
- PyMuPDF (fitz) handled fast, reliable text and image extraction
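To give a sense of how little glue these two need, here is a stripped-down sketch of a `pdf_read_pages`-style tool built on the `fastmcp` package and PyMuPDF. It omits caching and token checks, and it isn't pdf-mcp's actual code:

```python
import fitz  # PyMuPDF
from fastmcp import FastMCP

mcp = FastMCP("pdf-mcp-sketch")

@mcp.tool()
def pdf_read_pages(path: str, start: int, end: int) -> str:
    """Read a 1-based, inclusive page range from a PDF."""
    with fitz.open(path) as doc:
        start = max(start, 1)
        end = min(end, doc.page_count)
        return "\n\n".join(doc[i].get_text() for i in range(start - 1, end))

if __name__ == "__main__":
    mcp.run()  # STDIO transport by default
```

FastMCP derives the tool schema from the function signature and docstring, which is why the effort could go into tool ergonomics rather than protocol plumbing.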
The hardest part was testing. PDFs are chaotic: scans, broken encodings, corrupted files, password protection. I built a test corpus of pathological PDFs and made sure failures were graceful—not catastrophic.
pdf-mcp is open source on GitHub and published on PyPI. Releases are automated via GitHub Actions—tag it, test it, ship it.
Try It Yourself
pdf-mcp is available on PyPI. Install and add to your Claude setup:
```bash
pip install pdf-mcp
claude mcp add pdf-mcp -- pdf-mcp
```
Or for Claude Desktop:
```bash
pip install pdf-mcp
```
Then add to your `claude_desktop_config.json`:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "pdf-mcp": {
      "command": "pdf-mcp"
    }
  }
}
```
Restart Claude Desktop after saving the config.
Then ask Claude to analyze any PDF—local or remote, 5 pages or 500.
Source: https://github.com/jztan/pdf-mcp
Built out of frustration—and shipped in the hope it saves someone else the same headache.