I Gave My AI Agent Full API Access. It Was a Mistake

Originally published on Medium.

I gave my AI agent full access to a legacy Redmine API.

It didn’t make the agent smarter.

It made the agent hallucinate across hundreds of issues, burn tokens at an alarming rate, and surface data in ways that would never pass a security review.

Nothing crashed.
Nothing threw an exception.

Which made it worse.

That failure forced me to rebuild the boundary from scratch — not at the prompt layer, but at the MCP server.

What followed were three hard lessons about designing MCP servers as control planes for AI agents.

The Model Context Protocol (MCP) turns these lessons into a repeatable design pattern — the same boundaries apply whether the agent is Claude, GPT, or something else.

TL;DR

Exposing legacy APIs directly to AI agents often leads to context collapse, runaway token costs, and subtle security risks. To build production‑ready agents, the Model Context Protocol (MCP) server must act as a control plane, enforcing boundaries as a security membrane and Policy Enforcement Point (PEP).

Key Takeaways

  • Enforce Strict Pagination
    Pagination is a reasoning requirement and a cost‑control mechanism that prevents models from burning tokens and losing focus.
  • Shift to Intent‑Level Tools
    Avoid 1‑to‑1 API mappings. Design opinionated tools like triage_new_tickets to reduce agent cognitive load.
  • Isolate Legacy Resources
    Use the MCP server to proxy data, generate short‑lived UUID‑based URLs, and ensure agents never see raw credentials or direct system access.

1. Context Is a Finite Resource (Pagination Is Not Optional)

Legacy systems don’t return “a little data.”

They return:

  • hundreds of issues,
  • thousands of fields,
  • deeply nested structures.

If you let an agent pull everything, you don’t get better answers — you get context collapse.

That’s why pagination isn’t a performance optimization.
It’s a reasoning requirement.

It’s also a cost-control mechanism.

Models charge by the token. Letting an agent pull unbounded data doesn’t just break reasoning — it quietly burns money.
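A rough back-of-the-envelope makes the point. Every number here is an illustrative assumption, not a measurement; plug in your own model's pricing and issue sizes:

```python
# Illustrative cost math; every number here is an assumption.
issues = 500            # one unbounded "list all issues" call
tokens_per_issue = 400  # a full nested issue JSON, roughly
price_per_mtok = 3.00   # USD per million input tokens (illustrative rate)

cost = issues * tokens_per_issue * price_per_mtok / 1_000_000
print(f"${cost:.2f}")   # $0.60 per call
```

Sixty cents sounds trivial until you remember that an agent re-sends that context on every reasoning step, in every conversation, all day.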

Before: full issue dumps streamed straight into the model.
After: 20-item pages with summaries, enforced inside the MCP server.

Redmine’s REST API already supports pagination via limit and offset parameters; the MCP server’s job is to enforce those boundaries consistently for agents, as in the sketch below.
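Here is a minimal sketch of that enforcement, assuming the official Python MCP SDK's FastMCP interface and the requests library. summarize() and the environment variables are illustrative placeholders; the endpoint and parameters follow Redmine's public REST API:

```python
# Sketch: a hard pagination boundary inside the MCP server.
# Assumes the Python MCP SDK (FastMCP) and `requests`; summarize()
# and the env vars are illustrative placeholders.
import os

import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("redmine-control-plane")

REDMINE_URL = os.environ["REDMINE_URL"]      # e.g. https://redmine.example.com
REDMINE_KEY = os.environ["REDMINE_API_KEY"]  # held server-side, never shown to the agent

PAGE_SIZE = 20  # hard cap, no matter what the agent asks for

def summarize(issue: dict) -> dict:
    """Collapse a full Redmine issue into the few fields the agent needs."""
    return {
        "id": issue["id"],
        "subject": issue["subject"],
        "status": issue["status"]["name"],
        "updated_on": issue["updated_on"],
    }

@mcp.tool()
def list_my_issues_paginated(page: int = 1) -> dict:
    """Return one 20-item page of issue summaries, never the raw dump."""
    resp = requests.get(
        f"{REDMINE_URL}/issues.json",
        params={
            "limit": PAGE_SIZE,
            "offset": (max(page, 1) - 1) * PAGE_SIZE,
            "assigned_to_id": "me",
        },
        headers={"X-Redmine-API-Key": REDMINE_KEY},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    return {
        "page": page,
        "total": data["total_count"],
        "issues": [summarize(i) for i in data["issues"]],
    }
```

The agent can ask for page after page, but it can never widen the window.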

Once I enforced strict pagination and summaries:

  • agent responses became faster
  • reasoning chains became shorter
  • token usage dropped sharply
  • failure modes became predictable

The model didn’t change.
The shape of information did.

2. Exposing Every API as a Tool Is a Trap

My early instinct was simple:

If the API exists, expose it as a tool.

That was a mistake.

Every tool:

  • adds tokens
  • adds decision branches
  • increases the agent’s cognitive load

A 1-to-1 mapping between APIs and tools doesn’t make agents more powerful — it makes them indecisive.

So I flipped the model.

I stopped designing tools around APIs and started designing them around intent.

Here’s the difference:

API-centric tools (the trap):

  • GET /issues
  • POST /notes
  • PUT /status

These maximize surface area — and cognitive load.

Intent-centric tools (the fix):

  • triage_new_tickets
  • list_my_issues_paginated
  • generate_weekly_summary

These collapse multiple decisions into one bounded action.
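To make the contrast concrete, here is a sketch of one intent-level tool under the same FastMCP setup as above. fetch_new_issues(), add_note(), and set_status() are hypothetical wrappers around the raw endpoints, and the "crash" heuristic is purely illustrative:

```python
# Sketch: one intent-level tool in place of three raw API tools.
# fetch_new_issues(), add_note() and set_status() are hypothetical
# wrappers around GET /issues, POST /notes and PUT /status.
@mcp.tool()
def triage_new_tickets(limit: int = 20) -> list[dict]:
    """Triage untouched tickets: flag urgent ones, acknowledge the rest."""
    triaged = []
    for issue in fetch_new_issues(limit=min(limit, 20)):  # pagination still applies
        if "crash" in issue["subject"].lower():           # illustrative rule
            set_status(issue["id"], "urgent")
            add_note(issue["id"], "Auto-triaged: flagged as urgent.")
        else:
            add_note(issue["id"], "Acknowledged and queued for review.")
        triaged.append({"id": issue["id"], "subject": issue["subject"]})
    return triaged
```

The agent makes one decision (call the tool or not) instead of choreographing three endpoints in the right order.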

Cognitive load is a finite resource for agents — just like context.

By exposing fewer, opinionated tools through the MCP server, the agent stopped hesitating and started completing tasks reliably.

Fewer tools.
Stronger guarantees.
Better outcomes.

3. Legacy Resources Must Be Isolated Behind the MCP Boundary

This was the most subtle — and most important — lesson.

In a traditional system, returning a raw Redmine attachment URL is normal.

In an agentic system, it’s dangerous.

And this isn’t specific to Redmine — it applies to any legacy system.

Why?

Because:

  • URLs may expose access paths the agent should never reason about
  • access may bypass audit logs
  • permissions may outlive the agent’s intent
  • agents don’t understand trust boundaries — they only follow affordances

So I stopped returning Redmine URLs entirely.

Instead, the MCP server:

  • fetches the attachment on behalf of the agent
  • stores it in a controlled, local store
  • generates a temporary, UUID-based download URL
  • logs every access, then expires and deletes the file automatically
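A sketch of that flow, building on the same server as above. STORE_DIR, the TTL, log_access(), and the download host are all illustrative choices; only the Redmine metadata endpoint is real:

```python
# Sketch: attachment proxying behind the MCP boundary.
# STORE_DIR, TTL_SECONDS, log_access() and the download host are
# illustrative; only the Redmine metadata endpoint is real.
import time
import uuid
from pathlib import Path

STORE_DIR = Path("/var/lib/mcp-attachments")
STORE_DIR.mkdir(parents=True, exist_ok=True)
TTL_SECONDS = 300  # links die after five minutes
_links: dict[str, tuple[Path, float]] = {}  # token -> (file, expiry)

@mcp.tool()
def get_attachment(attachment_id: int) -> str:
    """Fetch an attachment server-side and return a short-lived link."""
    # 1. Resolve metadata via Redmine's API (GET /attachments/:id.json).
    meta = requests.get(
        f"{REDMINE_URL}/attachments/{attachment_id}.json",
        headers={"X-Redmine-API-Key": REDMINE_KEY},
        timeout=10,
    ).json()["attachment"]

    # 2. Download the file server-side; the agent never sees this URL.
    blob = requests.get(
        meta["content_url"],
        headers={"X-Redmine-API-Key": REDMINE_KEY},
        timeout=30,
    )
    blob.raise_for_status()

    # 3. Store locally and mint an opaque, expiring handle.
    token = uuid.uuid4().hex
    path = STORE_DIR / token
    path.write_bytes(blob.content)
    _links[token] = (path, time.time() + TTL_SECONDS)
    log_access("attachment_fetched", attachment_id, token)  # audit trail

    # A background sweep deletes expired files and tokens (not shown).
    return f"https://mcp.internal.example/files/{token}"
```

The only thing that crosses the boundary is an opaque token with a five-minute lifespan.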

This turns the MCP server into more than a proxy.

It becomes a Policy Enforcement Point (PEP).

The MCP server is where:

  • access is scoped
  • credentials are never exposed
  • URLs are short-lived
  • actions are logged and auditable

The agent never talks to Redmine directly.
It never sees Redmine credentials.
It never escapes the boundary.

In practice, the MCP server acts as a security membrane — enforcing policy, not just translating requests.

The Counterintuitive Result

The agent didn’t lose capability.

It gained safety.
It gained predictability.
It gained trustworthiness.

Each constraint mapped directly to a failure mode:

  • pagination prevented context collapse
  • intent-level tools reduced indecision
  • resource isolation closed security gaps

By constraining context, tools, and access, the system stopped behaving like a demo — and started behaving like infrastructure.

That’s when it became usable in production.

#ModelContextProtocol #AIAgents #AgentArchitecture #AIInfrastructure #PlatformEngineering