Classical security tools find nothing. Here's a 5-phase framework for auditing MCP servers that handle financial operations — and why the attack surface is semantic, not syntactic.
When you audit a smart contract, you know what you're looking for: reentrancy, integer overflow, access control, price manipulation. The vulnerability classes are well-understood. Slither, semgrep, and Mythril catch a large portion of the low-hanging fruit automatically.
When you audit an AI agent MCP server, the tools give you nothing.
I've run semgrep with 200+ rules on multiple MCP server codebases in the past month. The results: zero findings, every time. Not because the code is perfect — but because the vulnerabilities don't look like code bugs. They look like design decisions.
This article explains what to look for.
Model Context Protocol (MCP) is Anthropic's standard for giving language models structured access to external tools. An MCP server exposes a set of typed functions — tools — that an AI model can call during a conversation.
{
name: "execute_transfer",
description: "Send tokens to a recipient address",
inputSchema: {
amount: { type: "string" },
recipient: { type: "string" }, // <-- This is where it gets interesting
chain: { type: "string" }
}
}
The AI model reads the tool descriptions and decides when and how to call them based on user instructions. The model is the decision-maker. The MCP server is the executor.
This creates a fundamentally different trust model from traditional software.
In classical software security, we assume:
In AI agent security, the model is partially:
The attack surface shifts from what the code does to what the model decides to do.
Start by enumerating every tool the MCP server exposes and categorize each parameter by who controls it:
Specifically look for: recipient addresses, authority addresses, amount fields, URL parameters embedded in links users click, fee recipients.
Prompt injection is the core attack vector. An attacker embeds instructions in any content the agent reads, and those instructions manipulate the agent's subsequent tool calls.
Common injection sources in DeFi/crypto MCP servers:
For each injection source, trace the path:
Injection Source → Model Context → Tool Call Parameter → Financial Action
If you can draw a complete path, you have a finding.
For each tool that has financial impact, ask:
1. Can the recipient be changed without the user knowing?
If an injected instruction changes the recipient to an attacker's address, will the user notice before signing?
2. Is the pre-flight confirmation UI adequate?
If the recipient address is buried in a query parameter, users often don't check it. This is especially true on mobile.
3. Does the server validate parameters against a trusted source?
The server knows the user's wallet (from initialization or session). It should validate recipient addresses or at minimum display a prominent warning when they differ from the user's wallet.
4. Are API responses sanitized before being returned to the model?
If an external API returns data that gets passed directly to the model, it can be a secondary injection source.
MCP servers typically have a session setup phase where the client sends configuration. Look for initialization code that does NOT:
Many MCP servers expose a get_instructions tool that returns system-level guidance injected into the model's context. If this instruction file is loaded from disk, fetched from a remote URL, or configurable by the user — it's a potential attack vector for persistent instruction manipulation.
A supply chain attack on the npm package could inject arbitrary instructions into every AI agent that uses the MCP server.
The vulnerability classes map like this:
| Classical | AI Agent Equivalent | Detection |
|---|---|---|
| SQL injection | Prompt injection | Manual trace |
| Missing auth check | Missing parameter validation | Manual audit |
| Unchecked return value | Unsanitized API response | Manual trace |
| Privilege escalation | Trust model exploitation | Architecture review |
| CSRF | Cross-context manipulation | Threat modeling |
The tools were built for a world where the attack surface is syntactic. In AI agent security, the attack surface is semantic — it's about what the model understands and decides, not what the code explicitly allows.
MCP servers for financial operations are six months old. The security tooling doesn't exist yet. The vulnerability patterns are just being discovered. Bug bounty programs haven't caught up.
This is an opportunity for security researchers who can think in terms of trust models and agent behavior rather than just code patterns.
The next frontier isn't finding buffer overflows — it's understanding when an AI agent can be made to act against its user's interests. That's the attack surface for the next decade.