AWS AgentCore Runtime for MCP: Why Local Was Faster for IDE Workflows


In a previous post, I built a local MCP server that gave AI agents structured access to Redmine. It worked well, but it ran on localhost. The obvious next question: could a managed cloud runtime do this better?

Amazon Bedrock AgentCore Runtime looked purpose-built for the job. Managed container execution, built-in observability, IAM-native auth. I set out to test it as a host for the same MCP server that had been running locally since Part 1.

Six hours later, I wasn’t debugging my MCP server. I was debugging authentication flows. That’s when it became clear: AgentCore is built for enterprise agent orchestration, not IDE/CLI dev workflows.


The Premise: From Local to Cloud

The local setup was simple. A FastAPI + FastMCP server running in Docker, with Redmine as the backend. IDE clients (VS Code, Claude Code, Kiro) connected via http://localhost:8000/mcp. No auth, no TLS, no moving parts beyond the container itself.

Local setup:
  IDE → localhost:8000/mcp → FastMCP → Redmine

Expected AgentCore setup:
  IDE → AgentCore endpoint (HTTPS + JWT)
          → Managed container → Redmine

Cloud hosting was tempting for real reasons: team members could share the same MCP server, I’d get centralized logging and tracing out of the box, and it would be closer to a production deployment path. AgentCore was in preview at the time of writing (September 2025), but it already supported containerized workloads with MCP-compatible endpoints.


What AgentCore Expects

Deploying to AgentCore Runtime isn’t a simple push. The platform assumes a production-grade service, and the setup reflects that.

First, your MCP server must be containerized and pushed to ECR:

ECR=123456789012.dkr.ecr.us-east-1.amazonaws.com

aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin $ECR

docker build -t redmine-mcp-server .
docker tag redmine-mcp-server:latest $ECR/redmine-mcp:latest
docker push $ECR/redmine-mcp:latest

Then you configure the runtime: execution roles, VPC networking, CloudWatch log groups, and the runtime configuration itself. This is all reasonable for a production service. The friction starts when you look at what calling clients need to provide.

Here’s what a VS Code MCP config looks like for a local server:

{
  "mcpServers": {
    "redmine": {
      "url": "http://localhost:8000/mcp"
    }
  }
}

And here’s what AgentCore needs from the caller: an HTTPS endpoint with a valid Bearer token obtained through an OAuth/OIDC flow, using short-lived JWTs that expire and require refresh. The IDE config can’t express that natively.
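The closest a client config can get is a static header. Here’s a sketch, assuming the client supports a headers field; the endpoint and token are placeholders, not real values:

```json
{
  "mcpServers": {
    "redmine": {
      "url": "https://<agentcore-endpoint>/mcp",
      "headers": {
        "Authorization": "Bearer <short-lived-jwt>"
      }
    }
  }
}
```

A static token pasted in like this expires mid-session, which is exactly the failure mode the next section runs into.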


Authentication: The Core Mismatch

This is where the evaluation ended, practically speaking. AgentCore supports two inbound auth mechanisms: IAM SigV4 and OAuth/OIDC JWTs. Neither integrates smoothly with IDE MCP clients. SigV4 requires AWS credential signing on every request. JWT requires a valid Bearer token that’s short-lived (minutes to an hour) and needs a refresh mechanism.

The design makes sense for service-to-service communication. A backend calling another backend can handle token acquisition, caching, and refresh in its own auth middleware. But IDE MCP clients don’t work that way. VS Code, Claude Code, and Kiro all expect to point at a URL and start making calls. There’s no built-in mechanism for OIDC token acquisition or silent refresh.

The workarounds I tried all reintroduced local dependencies:

  • Local auth proxy. A small HTTP server running on localhost that handles token acquisition, injects the Bearer header, and forwards requests to AgentCore. This works, but now you’re running a local service to avoid running a local service.
  • Token helper script. A shell script that fetches a token and injects it into the MCP config. This works until the token expires mid-session.
  • Manual token paste. Copy a token from the AWS console, paste it into config. Expires in minutes. Not a workflow.

Contrast this with Part 1’s setup: http://localhost:8000/mcp. No auth, no tokens, no proxy. The IDE connects and tools are immediately available.

Managed runtimes are optimized for service-to-service security models. IDE agents are still optimized for localhost ergonomics. The tooling hasn’t caught up to cloud-native auth assumptions yet, and until it does, that gap falls on the developer to bridge.


Latency: When Milliseconds Compound

I want to be upfront about methodology: these are observational measurements from my testing environment, not formal benchmarks with controlled variables. The numbers reflect what I experienced, not what every deployment would produce.

| Path | Observed latency | Notes |
| --- | --- | --- |
| Local stdio | < 10 ms | Direct process communication |
| Local HTTP | < 50 ms | Localhost, no TLS |
| AgentCore | 100-300 ms | HTTPS + JWT validation + network |

The gap comes from multiple layers: network round-trip to the AWS region, TLS handshake, JWT token validation on every request, and occasional cold starts when the container hadn’t received traffic recently.

On paper, 200 ms doesn’t sound disruptive. In practice, MCP calls compound during agent reasoning chains. A single agent turn might make three to five tool calls in sequence. At 200 ms each, that’s a full second of added latency per turn, on top of whatever the LLM inference takes. The delay becomes perceptible as sluggishness in completions and agent responses.
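The back-of-envelope arithmetic is easy to check, using the observed numbers above:

```python
def added_latency_ms(tool_calls: int, per_call_overhead_ms: int) -> int:
    """Sequential tool calls stack their transport overhead linearly."""
    return tool_calls * per_call_overhead_ms

# Five sequential tool calls in one agent turn.
local = added_latency_ms(5, 50)    # local HTTP: 250 ms per turn
hosted = added_latency_ms(5, 200)  # AgentCore: 1000 ms per turn
extra = hosted - local             # 750 ms of added wait, before any LLM inference
```

Three-quarters of a second of pure transport overhead per turn, repeated every turn, is what turns a managed runtime from unnoticeable into sluggish.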

For more on how latency accumulates in agent toolchains, see I Profiled the Copilot SDK: 33% of Latency Was Avoidable. Similar overhead patterns apply to any hosted runtime sitting between the IDE and the tool backend.


Cost: Enterprise Pricing for a Dev Workflow

The economics are straightforward. A local MCP server costs nothing beyond the machine already running the IDE. AgentCore adds managed container runtime costs, data transfer charges, and the ECR storage for the image.

For a team of developers sharing a production MCP server, those costs are noise. For a single developer running lightweight tool calls during coding sessions, the paradox is simple: you’re paying for compliance boundaries and enterprise controls while actively bypassing them with local proxies just to make the IDE work.

AgentCore’s pricing makes sense when you need centralized access control or audit logging for production agents serving external users. It doesn’t make sense as a dev tool host.


Local vs. AgentCore at a Glance

| Factor | Local MCP server | AgentCore Runtime |
| --- | --- | --- |
| Setup complexity | Docker + one config line | ECR + IAM + VPC + runtime config |
| Authentication | None (localhost) | OAuth/OIDC JWTs required |
| Latency | < 50 ms | 100-300 ms |
| Cost | Free | Container runtime + data transfer |
| Process isolation | Container-level | Managed, platform-level |
| Observability | Manual (logs, stdout) | Built-in (CloudWatch, tracing) |
| Team access | Local only | Shared endpoint |
| Best for | IDE/CLI dev workflows | Enterprise production agents |

AgentCore wins on everything that matters in production: isolation, observability, team access, compliance. Local wins on everything that matters during development: speed, simplicity, zero configuration.


What I’d Change

Wait for IDE-native OIDC support. The auth friction is a tooling gap, not a fundamental flaw. If VS Code or Claude Code added native OIDC token management for MCP connections, the biggest barrier would disappear. This seems likely given the direction of MCP adoption, but it wasn’t available during my testing.

Dual-mode architecture. The practical path forward is running local for development and AgentCore (or a similar managed runtime) for production. The MCP server itself doesn’t need to change. The same container runs in both environments. Only the client configuration differs.
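Under that dual-mode approach, only the client config changes between environments. A sketch, with the cloud endpoint and token handling left as placeholders since both depend on your auth setup:

```json
{
  "mcpServers": {
    "redmine-dev": {
      "url": "http://localhost:8000/mcp"
    },
    "redmine-prod": {
      "url": "https://<agentcore-endpoint>/mcp",
      "headers": {
        "Authorization": "Bearer <token-from-your-auth-flow>"
      }
    }
  }
}
```

The same container image backs both entries; the delta is entirely in transport and auth.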

More rigorous profiling. My latency observations were informal. A proper evaluation would use structured traces across multiple regions, measure cold start frequency, and compare against other managed options. I didn’t invest that time because the auth friction alone was enough to send me back to local.

The local server from Part 1 remains my daily driver. It starts in seconds, needs no auth, and responds before I notice the round-trip. For IDE/CLI workflows, that’s hard to beat. AgentCore will likely earn its place when the MCP server needs to serve a team or face external traffic.

The interesting lesson wasn’t that “cloud is slower.” It’s that developer tooling and cloud-native security models are evolving at different speeds. Until those converge, localhost will remain the fastest way to build.

For more on MCP, see What Is MCP? for the conceptual foundation, How to Build an MCP Server in Python for a hands-on tutorial, or MCP vs Function Calling to decide when each approach fits.

Written by Kevin Tan

Cloud Solutions Architect and Engineering Leader based in Singapore. I write about AWS, distributed systems, and building reliable software at scale.
