The Priority Checklist at a Glance
Work top to bottom. The ordering reflects exploitability and blast radius, not alphabetical neatness. If you only fix five things, fix the first five.
| # | Control | Stops | Effort |
|---|---|---|---|
| 1 | Authenticate every request (OAuth 2.1, audience-bound tokens) | Unauthenticated tool invocation, token replay | Medium |
| 2 | Least-privilege tool scoping & human-in-the-loop for destructive actions | Over-broad agent capability, lateral movement | Medium |
| 3 | Strict input validation & output schema enforcement | Injection into downstream systems, type confusion | Low |
| 4 | Prompt-injection defenses (data/instruction separation, content provenance) | Tool hijacking via untrusted content | High |
| 5 | Secrets isolation (server-side vault, never in tool descriptions) | Credential exfiltration | Low |
| 6 | Structured audit logging of every tool call | Blind incident response, undetected abuse | Low |
| 7 | Network isolation & egress filtering | SSRF, data exfiltration, C2 | Medium |
| 8 | Supply-chain verification (pinned, signed, reviewed servers) | Malicious or trojaned MCP servers | Medium |
You can baseline your server against most of this automatically with the free open-source mcp-security-scanner, which flags missing auth, over-broad scopes, unvalidated inputs, and leaked secrets before they reach production.
1. Authenticate Every Request with OAuth 2.1
The single most common finding in our assessments is an MCP server that trusts the transport. A server reachable over the network with no per-request authentication is an open RPC endpoint. The 2025 MCP authorization spec aligned on OAuth 2.1, and you should implement it properly rather than bolt on a static API key.
What "properly" means
- Authorization Code + PKCE for interactive clients. PKCE is mandatory in OAuth 2.1; it defeats authorization-code interception.
- Audience-restricted tokens. The access token must be bound to your MCP server as the resource (the aud claim). A token minted for another service must be rejected. This is what stops the confused-deputy and token-passthrough problems where an upstream token is replayed against your server.
- Short-lived access tokens + refresh tokens. Minutes, not days. Rotate refresh tokens on use.
- Validate the token on every call - signature, issuer, audience, expiry, and scopes. Do not cache an "authenticated" flag for the lifetime of a session.
For server-to-server MCP deployments without a human, use the OAuth 2.0 Client Credentials grant with mTLS or private-key JWT client authentication rather than a shared secret in an environment variable.
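```python
import jwt  # PyJWT
from jwt import PyJWKClient

ISSUER = "https://auth.example.com"  # assumption: your authorization server
jwks_client = PyJWKClient(f"{ISSUER}/.well-known/jwks.json")  # assumed JWKS location

def validate_token(token, request):
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    claims = jwt.decode(
        token, signing_key.key,
        audience="https://mcp.example.com",  # reject tokens for other resources
        issuer=ISSUER,                       # reject tokens from other issuers
        algorithms=["RS256", "ES256"],       # never "none", never HS with a public key
        options={"require": ["exp", "aud", "iss"]},
    )
    # scopes_to_tools and Forbidden are application-defined helpers
    if request.tool not in scopes_to_tools(claims["scope"]):
        raise Forbidden("token not scoped for this tool")
    return claims
```
Do not invent your own crypto and do not accept the "none" algorithm. Reject any token whose algorithm you did not explicitly allow - algorithm-confusion attacks remain effective against naive verifiers.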
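For the PKCE requirement listed earlier in this section, here is a minimal sketch of how a client might derive an S256 code challenge and how an authorization server could check it, following RFC 7636. The names make_pkce_pair and verify_pkce are illustrative and not part of any MCP SDK; in practice your OAuth library will normally do this for you.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Client side: return (code_verifier, code_challenge) for the S256 method."""
    verifier = _b64url(secrets.token_bytes(32))  # 32 random bytes -> 43 characters
    challenge = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def verify_pkce(code_verifier: str, stored_challenge: str) -> bool:
    """Server side: recompute the challenge and compare in constant time."""
    expected = _b64url(hashlib.sha256(code_verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, stored_challenge)
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.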
2. Least-Privilege Tool Scoping and Human-in-the-Loop
An agent inherits the union of every capability you expose. If your MCP server offers a generic run_sql tool with the application's full database role, the model - and anyone who can influence the model - has DBA. Scope tools the way you scope service accounts.
- One narrow tool beats one flexible tool. Prefer get_open_invoices(customer_id) over query(sql). Each tool should map to a single, auditable operation with a typed signature.
- Bind tools to OAuth scopes. A token without invoices:read cannot reach the invoice tool, enforced server-side regardless of what the model decides to call.
- Separate read from write, and gate destructive actions behind explicit human confirmation (the MCP elicitation / approval flow). Deleting records, sending money, or mailing customers should require a confirmation the user actually sees - not a confirmation the model can auto-approve.
- Run the server with the minimum OS and cloud IAM privileges it needs. The process identity, not just the token, is part of the blast radius.
The mental model: assume the calling model is fully compromised by an attacker and ask what damage a single tool call can do. If the answer is "a lot," the tool is too broad.
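A minimal sketch of what server-side enforcement of scopes and human approval could look like. The policy table, the delete_invoice tool, and the authorize function are hypothetical names chosen for illustration; how the human_approved flag gets set depends on your client's confirmation flow and is assumed here.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when a tool call is not permitted."""


@dataclass(frozen=True)
class ToolPolicy:
    required_scope: str
    destructive: bool = False


# Hypothetical policy table: one entry per exposed tool.
POLICIES = {
    "get_open_invoices": ToolPolicy("invoices:read"),
    "delete_invoice": ToolPolicy("invoices:write", destructive=True),
}


def authorize(tool: str, token_scopes: set, human_approved: bool = False) -> ToolPolicy:
    """Deny by default; allow only known tools with the right scope."""
    policy = POLICIES.get(tool)
    if policy is None:
        raise Forbidden(f"unknown tool: {tool}")
    if policy.required_scope not in token_scopes:
        raise Forbidden(f"missing scope: {policy.required_scope}")
    # The approval flag must come from the user-facing flow, never from the model.
    if policy.destructive and not human_approved:
        raise Forbidden("human confirmation required")
    return policy
```

Because the check runs in the server, it holds no matter which tool the model was persuaded to call.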
3. Validate Inputs and Enforce Output Schemas
MCP tool arguments arrive as JSON generated by a language model from text that may be attacker-controlled. Treat every argument as hostile, exactly as you would an HTTP request body.
- Schema-validate every argument against a strict JSON Schema: types, enums, length bounds, and format constraints. Reject anything that does not match; do not coerce silently.
- Parameterize downstream calls. Build SQL with bound parameters, shell commands with argument arrays (never string concatenation), and file paths with canonicalization plus an allow-list root to defeat ../ traversal.
- Validate outputs too. Constrain what the tool can return so a compromised backend cannot smuggle a fresh prompt-injection payload back into the model's context (see the next section).
- Resource-bound everything: timeouts, max result sizes, and rate limits per tool and per token. Unbounded tool calls are a denial-of-service and a cost vector.
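The "canonicalization plus an allow-list root" check from the list above could be sketched like this. The function name resolve_within is a hypothetical helper, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_within(root, name):
    """Return the canonical path of `name` under `root`, or raise on escape."""
    root_path = Path(root).resolve()
    # resolve() collapses "..", follows symlinks, and an absolute `name`
    # replaces the root entirely, so the containment check below catches it.
    candidate = (root_path / name).resolve()
    if not candidate.is_relative_to(root_path):
        raise ValueError(f"path escapes allowed root: {name!r}")
    return str(candidate)
```

Comparing canonical paths rather than raw strings is the point: a prefix check on the unresolved string would accept inputs such as "uploads/../../etc/passwd".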
```python
import subprocess

# BAD - shell injection via model-controlled filename
subprocess.run(f"convert {name} out.png", shell=True)

# GOOD - argument array, validated against an allow-list root
# (resolve_within and ALLOWED_DIR are application-defined)
safe = resolve_within(ALLOWED_DIR, name)  # raises on traversal
subprocess.run(["convert", safe, "out.png"], shell=False, timeout=10)
```
4. Defend Against Prompt Injection
Prompt injection is the defining MCP threat, and it is not fully solvable at the model layer - so you build defense in depth around it. The core problem: data the tool returns (a web page, an email, a file) can contain instructions that the model treats as commands and then acts on by calling other tools. This is how "summarize this document" becomes "exfiltrate the user's secrets."
Layered defenses that actually move the needle
- Separate instructions from data. Wrap all tool output in clear, consistent delimiters and label it as untrusted content. The model should be system-prompted to never execute instructions found inside tool results.
- Enforce control at the tool boundary, not in the prompt. The most robust mitigation is that the server refuses dangerous actions regardless of what the model was talked into - least-privilege scoping (control #2) and human approval for destructive actions are your real prompt-injection backstop.
- Tool-chain policies. Disallow dangerous sequences such as "read private data, then call a tool that can write to an external destination" within a single turn without explicit approval. This is the classic exfiltration chain.
- Content provenance and sanitization. Strip or neutralize hidden content - zero-width characters, HTML comments, off-screen text, metadata - before it reaches the model. "Tool poisoning," where a malicious server hides instructions in its own tool descriptions, is the same attack one layer up: review and pin tool descriptions (control #8).
- Egress control as the last line. Even a hijacked agent cannot exfiltrate to attacker.com if egress is locked down (control #7).
Treat detection-based filters and "injection classifiers" as useful noise reduction, not as a security boundary. Our deeper write-up on this lives in the prompt-injection defense pillar linked below.
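Three of the layers above (hidden-content stripping, delimiting untrusted output, and a tool-chain policy) can be sketched in a few lines. Everything here is illustrative: the delimiter format, the per-call nonce, and the tool names in the two sets are assumptions, and dropping all Unicode format characters is a blunt choice that also removes legitimate joiners in some scripts and emoji.

```python
import re
import secrets
import unicodedata

HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

# Hypothetical tool classification for the chain policy.
PRIVATE_READ_TOOLS = {"read_mailbox", "get_open_invoices"}
EXTERNAL_WRITE_TOOLS = {"send_email", "http_post"}


def sanitize(text: str) -> str:
    """Drop invisible format characters (category Cf), then HTML comments."""
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return HTML_COMMENT.sub("", visible)


def wrap_untrusted(text: str, nonce: str = "") -> str:
    """Label tool output as data; a random nonce makes the delimiter unguessable."""
    nonce = nonce or secrets.token_hex(8)
    return f"<<untrusted {nonce}>>\n{sanitize(text)}\n<<end untrusted {nonce}>>"


class TurnPolicy:
    """Block 'read private data, then write externally' within one turn."""

    def __init__(self):
        self.touched_private = False

    def check(self, tool: str, approved: bool = False) -> None:
        if tool in EXTERNAL_WRITE_TOOLS and self.touched_private and not approved:
            raise PermissionError(f"{tool} after private read needs approval")
        if tool in PRIVATE_READ_TOOLS:
            self.touched_private = True
```

Only TurnPolicy is an actual control; the sanitizing and wrapping steps reduce risk but rely on the model honoring the label.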
5. Handle Secrets Server-Side and Out of the Model's Reach
Secrets should never transit the model's context window. The model does not need your database password; the server uses it on the model's behalf. Two rules cover most of the failure modes.
- Never put secrets in tool descriptions, tool names, parameters, or return values. Anything the model can see, an injection can ask it to repeat. We regularly find API keys baked into tool metadata "for convenience."
- Resolve secrets server-side from a real secret store - a vault, cloud secret manager, or workload identity - at call time, scoped to the operation. Inject them into the downstream request inside the server process; strip them from anything returned to the model.
| Anti-pattern | Do instead |
|---|---|
| API key in .env read into tool description | Fetch from secret manager per-call, never expose to model |
| Long-lived static credential in image | Short-lived credential via workload identity / OAuth |
| Returning raw upstream error containing a token | Sanitize errors before returning to the model |
Rotate credentials on a schedule and immediately on any suspicion, and make sure your logs (next section) redact secrets so audit trails do not become the new leak.
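As an illustration of sanitizing errors before they reach the model, the sketch below redacts known secret values plus two common leak shapes. The patterns and the sanitize_error name are assumptions for this example; a regex list like this is a backstop and will not catch every credential format.

```python
import re

BEARER = re.compile(r"(?i)(authorization:\s*bearer\s+)[A-Za-z0-9._~+/=-]+")
KEY_VALUE = re.compile(r"(?i)\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)[^\s,;&]+")


def sanitize_error(message: str, known_secrets=()) -> str:
    """Return `message` with credentials replaced by a placeholder."""
    # Exact values the server fetched from the secret store for this call.
    for value in known_secrets:
        if value:
            message = message.replace(value, "[REDACTED]")
    # Generic shapes: bearer headers and key=value / key: value pairs.
    message = BEARER.sub(r"\1[REDACTED]", message)
    message = KEY_VALUE.sub(r"\1\2[REDACTED]", message)
    return message
```

Passing the exact secrets used for the call as known_secrets is the more reliable half; the patterns only cover values the server did not know about.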
6. Log Every Tool Call as Structured, Tamper-Evident Audit Data
When an incident happens - and with agentic systems, assume it will - your ability to answer "what did the agent do, on whose authority, with what data" determines whether you have a contained event or a guess. Most MCP servers we review log almost nothing useful.
Emit a structured event for every tool invocation containing at least: timestamp, authenticated principal and token ID (not the token), tool name, validated arguments (with secrets redacted), decision (allowed/denied and why), outcome, and a correlation ID tying it to the session and upstream request.
```json
{
  "ts": "2026-05-29T11:04:22Z",
  "principal": "user_8842",
  "token_id": "tok_9f...",
  "tool": "get_open_invoices",
  "args": {"customer_id": 5512},
  "decision": "allow",
  "scope_checked": "invoices:read",
  "session": "s_31a",
  "result": "ok",
  "latency_ms": 88
}
```
- Ship logs off-host to an append-only or write-once store so a compromised server cannot rewrite history.
- Alert on the patterns that matter: denied-scope spikes, unusual tool-call sequences (read-private then write-external), volume anomalies, and repeated validation failures.
- Keep retention aligned with your compliance obligations - the audit log is also your evidence trail for frameworks covered in our compliance pillar.
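One possible way to make such events tamper-evident, in addition to shipping them off-host, is to chain each event to the hash of the previous one. This is a sketch of that idea, not a method the checklist prescribes; the field subset, the SENSITIVE_KEYS list, and the function names are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64
SENSITIVE_KEYS = {"password", "token", "api_key", "secret"}


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding so key order cannot change the result."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def audit_event(prev_hash, principal, token_id, tool, args, decision, result):
    """Build one audit record linked to the previous record's hash."""
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
    event = {
        "ts": ts.replace("+00:00", "Z"),
        "principal": principal,
        "token_id": token_id,  # an identifier, never the token itself
        "tool": tool,
        "args": {k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else v
                 for k, v in args.items()},
        "decision": decision,
        "result": result,
        "prev": prev_hash,
    }
    event["hash"] = _digest(event)  # digest is computed before "hash" is added
    return event


def verify_chain(events, genesis=GENESIS) -> bool:
    """Recompute every hash and link; any edit or deletion breaks the chain."""
    prev = genesis
    for event in events:
        body = {k: v for k, v in event.items() if k != "hash"}
        if body.get("prev") != prev or _digest(body) != event["hash"]:
            return False
        prev = event["hash"]
    return True
```

A chain only proves tampering if the latest hash is also stored somewhere the server cannot write to, which is why the off-host store still matters.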
7. Isolate the Network and Filter Egress
MCP servers frequently sit close to internal systems, which makes them a prime SSRF and pivot target. An agent that can be steered to fetch http://169.254.169.254/ (cloud metadata) or an internal admin panel is a serious problem.
- Default-deny egress. Allow-list the exact destinations a tool legitimately needs. This single control neutralizes most exfiltration and command-and-control even after a successful prompt injection.
- Block link-local and private ranges for any tool that fetches URLs, and resolve-then-validate to defeat DNS rebinding. Do not trust the URL string alone.
- Segment the server into its own network zone with explicit ingress/egress rules, separate from your databases and internal services. The server should reach only what its tools require.
- Run each server with sandboxing - containers with dropped capabilities, read-only root filesystems, seccomp profiles - so a compromised tool process cannot escalate.
Network controls are the safety net that holds when the model-layer defenses fail, which is exactly why they belong in the priority list rather than the appendix.
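A minimal sketch of the resolve-then-validate step for a URL-fetching tool, using only the standard library. The allow-list contents and the function name are hypothetical, and the resolver is injectable purely so the logic can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allow-list


def resolve_and_validate(url, allowed_hosts=ALLOWED_HOSTS, resolver=socket.getaddrinfo):
    """Return the vetted IP addresses for `url`, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        raise ValueError(f"host not on allow-list: {host!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses:
        raise ValueError("host did not resolve")
    for addr in addresses:
        # is_global is False for loopback, link-local, and private ranges.
        if not ipaddress.ip_address(addr).is_global:
            raise ValueError(f"{host} resolves to non-public address {addr}")
    return addresses
```

To actually defeat DNS rebinding, the fetch should then connect to one of the returned addresses rather than resolving the hostname a second time.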
8. Verify the MCP Supply Chain
Installing a third-party MCP server is granting code execution and a set of tools into your agent's trust boundary. The "tool poisoning" and "rug pull" classes of attack - where a server ships benign and later updates to include malicious tool descriptions or behavior - make supply-chain hygiene non-optional.
- Pin versions and verify integrity. Pin to a specific release with a hash or signature; do not auto-update servers into production. Prefer servers published with signed provenance (e.g. Sigstore-style attestations).
- Review tool definitions before and after updates. Diff tool descriptions across versions - a description change is a behavioral change to your agent.
- Vendor and self-host critical servers rather than pulling from a remote registry at runtime, so a registry compromise cannot reach you.
- Maintain an inventory (SBOM-style) of every MCP server, its source, version, and the scopes it holds. You cannot defend an attack surface you have not enumerated.
Run the mcp-security-scanner in CI against any server you adopt, and treat a new server with the same scrutiny you would a new dependency with shell access - because that is what it is. For a structured way to model these risks across your stack, see our MCP threat matrix.
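Diffing tool definitions across versions can be as simple as pinning a hash per tool and comparing on every update. The sketch below assumes tool definitions are available as JSON-like dictionaries with a name field; the function names and the pinned-manifest shape are illustrative.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Stable SHA-256 over the full tool definition (name, description, schema)."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_tools(pinned: dict, current_tools: list) -> dict:
    """Compare reviewed fingerprints (name -> hash) with what the server now offers."""
    current = {tool["name"]: fingerprint(tool) for tool in current_tools}
    return {
        "added": sorted(set(current) - set(pinned)),
        "removed": sorted(set(pinned) - set(current)),
        "changed": sorted(n for n in set(current) & set(pinned) if current[n] != pinned[n]),
    }
```

Any non-empty result is a reason to stop the rollout and re-review, since a changed description changes what the agent is told the tool does.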
Related reading
- MCP prompt-injection defense: full playbook for the hardest MCP threat
- MCP server hardening checklist: step-by-step configuration guide
- The MCP threat matrix: map risks across your agent stack
- MCP compliance: aligning controls with security frameworks
- Hardening sprint: get your MCP servers production-safe fast
Secure your MCP deployment
MCP Defense runs attack-surface assessments, hardening sprints, and 24/7 incident response for Model Context Protocol and AI-agent infrastructure.