AI Agent API Access: Why Full Permissions Are a Security Risk

I gave my AI agent full access to a legacy Redmine API. It didn’t make the agent smarter. It made the agent hallucinate across hundreds of issues, burn tokens at an alarming rate, and surface data in ways that would never pass a security review.

Nothing crashed. Nothing threw an exception. Which made it worse.

That failure forced me to rebuild the boundary from scratch, not at the prompt layer, but at the MCP server. This was the same server I originally built to connect a legacy Redmine system to an AI agent. What followed were three hard lessons about designing MCP servers as control planes for AI agents.



Pagination Is a Reasoning Requirement

Legacy systems don’t return “a little data.” They return hundreds of issues, thousands of fields, and deeply nested structures. If you let an agent pull everything, you don’t get better answers. You get context collapse.

Pagination isn’t a performance optimization. It’s a reasoning requirement, and a cost-control mechanism. Model providers charge by the token. Letting an agent pull unbounded data doesn’t just break reasoning. It quietly burns money.

Before: full issue dumps streamed straight into the model. After: 20-item pages with summaries, enforced inside the MCP server. Redmine’s API already enforces pagination via limit and offset, and the MCP server’s job is to respect and enforce those boundaries consistently for agents.
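A minimal sketch of what that enforcement looks like inside the MCP server. The function and field names here are illustrative, and `fetch_redmine` is a stand-in for the real authenticated HTTP call; Redmine's API does accept `limit` and `offset` query parameters, and the point is that the server clamps them regardless of what the agent requests.

```python
MAX_PAGE_SIZE = 20  # hard cap enforced by the server, never by the agent

def fetch_redmine(path, params):
    # Stand-in for the real API call; returns a canned 100-issue project.
    issues = [{"id": n, "subject": f"Issue {n}", "status": {"name": "New"}}
              for n in range(100)]
    lo = params["offset"]
    hi = lo + params["limit"]
    return {"issues": issues[lo:hi], "total_count": len(issues)}

def list_issues(offset=0, limit=MAX_PAGE_SIZE):
    """Return one bounded, summarized page, whatever page size the agent asks for."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # clamp the requested page size
    raw = fetch_redmine("/issues.json", {"offset": offset, "limit": limit})
    # Summarize: keep only reasoning-relevant fields so nested journals,
    # relations, and custom fields never reach the model's context window.
    page = [{"id": i["id"], "subject": i["subject"], "status": i["status"]["name"]}
            for i in raw["issues"]]
    return {"issues": page, "offset": offset, "total": raw["total_count"],
            "next_offset": offset + limit if offset + limit < raw["total_count"] else None}
```

The `next_offset` field matters: it gives the agent exactly one affordance for "get more", instead of leaving it to reason about offsets on its own.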

Once I enforced strict pagination and summaries, agent responses became faster, reasoning chains became shorter, token usage dropped sharply, and failure modes became predictable. The model didn’t change. The shape of information did.


Intent-Level Tools Beat API Mappings

My early instinct was simple: if the API exists, expose it as a tool. That was a mistake. Every tool adds tokens, adds decision branches, and increases the agent’s cognitive load. A 1-to-1 mapping between APIs and tools doesn’t make agents more powerful. It makes them indecisive.

So I flipped the model. I stopped designing tools around APIs and started designing them around intent.

API-centric tools (the trap): GET /issues, POST /notes, PUT /status. These maximize surface area and cognitive load.

Intent-centric tools (the fix): triage_new_tickets, list_my_issues_paginated, generate_weekly_summary. These collapse multiple decisions into one bounded action.
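The contrast above can be sketched as a single intent-level tool that wraps three API calls the agent would otherwise have to sequence itself. The `api_get`/`api_post`/`api_put` helpers are stand-ins for real Redmine calls, and the triage rule is deliberately trivial; the shape of the tool is the point.

```python
def api_get(path):
    # Stand-in: returns two untriaged issues instead of calling Redmine.
    return [{"id": 101, "subject": "Login fails on Safari"},
            {"id": 102, "subject": "Typo in footer"}]

AUDIT = []  # every write the tool performs is recorded here

def api_post(path, body):
    AUDIT.append(("POST", path, body))

def api_put(path, body):
    AUDIT.append(("PUT", path, body))

def triage_new_tickets():
    """One bounded action: classify, annotate, and re-status new issues.
    The agent invokes a single tool instead of making three API decisions."""
    triaged = []
    for issue in api_get("/issues.json?status=new"):
        priority = "high" if "fail" in issue["subject"].lower() else "low"
        api_post(f"/issues/{issue['id']}/notes",
                 {"text": f"Auto-triaged as {priority}"})
        api_put(f"/issues/{issue['id']}", {"status": "triaged"})
        triaged.append({"id": issue["id"], "priority": priority})
    return triaged
```

From the agent's side there is exactly one decision: call `triage_new_tickets` or don't. The branching, ordering, and error surface all live behind the tool.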

Cognitive load is a finite resource for agents, just like context. By exposing fewer, opinionated tools through the MCP server, the agent stopped hesitating and started completing tasks reliably. Fewer tools, stronger guarantees, better outcomes. I applied the same principle when building pdf-mcp: eight focused tools for PDF navigation instead of one monolithic “dump the whole document” approach.


Resource Isolation Behind the MCP Boundary

This was the most important lesson and the most subtle.

In a traditional system, returning a raw Redmine attachment URL is normal. In an agentic system, it’s dangerous. And this isn’t specific to Redmine. It applies to any legacy system. URLs may expose access paths the agent should never reason about, access may bypass audit logs, permissions may outlive the agent’s intent, and agents don’t understand trust boundaries. They only follow affordances.

So I stopped returning Redmine URLs entirely. Instead, the MCP server fetches the attachment on behalf of the agent, stores it in a controlled local store, generates a temporary UUID-based download URL, and logs access with automatic expiration.

This turns the MCP server into more than a proxy. It becomes a Policy Enforcement Point (PEP), where access is scoped, credentials are never exposed, URLs are short-lived, and actions are logged and auditable. The agent never talks to Redmine directly, never sees credentials, and never escapes the boundary. In practice, the MCP server acts as a security membrane, enforcing policy rather than just translating requests.


Constraints Increased Capability

The agent didn’t lose capability. It gained safety, predictability, and trustworthiness. Each constraint mapped directly to a failure mode: pagination prevented context collapse, intent-level tools reduced indecision, and resource isolation closed security gaps.

By constraining context, tools, and access, the system stopped behaving like a demo and started behaving like infrastructure. That’s when it became usable in production. Reliability comes from constraints, not intelligence. These lessons mirror the broader patterns I documented in why AI agents fail in production.

Written by Kevin Tan

Cloud Solutions Architect and Engineering Leader based in Singapore. I write about AWS, distributed systems, and building reliable software at scale.
