aws/agent-toolkit-for-aws

Official, AWS-supported MCP servers, skills, and plugins to help AI agents build on AWS


Start here

1. E00_Introduction_The_Agent_Trust_Problem

The Agent Trust Problem

A year of agents on AWS taught the cloud one lesson: the question was never how to give them more power. It was who they would answer to.

In February 2025, a backend engineer at a mid-sized fintech typed npx @awslabs/mcp into a terminal and got an AI coding agent wired straight into her AWS account. The agent could describe instances, list S3 buckets, call PutItem on DynamoDB, and even update Lambda functions, all from a one-line npx command. What it could not do was tell anyone it had done so. No CloudTrail record distinguished her agent's actions from those of her own IAM role. No CloudWatch metric separated the human's describe_instances call from the agent's. The developer was inside the blast radius, the security team was outside the audit trail, and the AWS Labs MCP server in the middle was, by design, a thin pipe.

A year later, that pipe is still there. It is just no longer the product. The product is the door around it.

AWS's agent-toolkit-for-aws repository, GA-tagged and Apache-2.0, is the formal answer to a question AWS Labs' MCP servers could not answer: how do you let an AI coding agent act on AWS without losing observability and without throttling the developer? The answer is not a better MCP server. There is still one managed AWS MCP Server, accessed through a pinned mcp-proxy-for-aws@1.6.3 package, fronted by a regional HTTPS endpoint at https://aws-mcp.<region>.api.aws/mcp. The answer, instead, is everything *around* that pipe: a curated library of 107 skills, three cross-host manifest shims (Claude Code, Codex, Cursor), four plugin bundles, one regional MCP endpoint, and — the part I want you to remember when the series ends — one PreToolUse hook that refuses to let a secret leak.

That sentence sounds modest. It is not. If you have shipped an AI agent into a regulated AWS environment, or if you are a platform engineer who has had to write the IAM policy that decides what your agent can and cannot touch, you already know that the hard part was never "does the agent reach the API." The hard part is the boundary. And the boundary, in this toolkit, is the hook.

I started this analysis treating the Agent Toolkit for AWS as just another MCP project — a rebranding exercise, a marketplace entry, a re-skin of what AWS Labs had already shipped. Reading the README's third paragraph changed that. The paragraph that names the three differentiators — IAM condition keys that distinguish agent actions from human actions, CloudWatch metrics and CloudTrail audit logging for every request, and skills that have been end-to-end evaluated — is not a feature list. It is a thesis. The toolkit exists *because* the AWS Labs version could not provide those three things, and the team that built it decided that the gap was large enough to justify a new distribution.

The thesis is auditable. Every claim in the README's third paragraph corresponds to a concrete artifact in the repository, and that correspondence is what makes the toolkit interesting to a senior engineer. The IAM condition key claim maps to the aws:CalledVia context key, which appears in the policy guidance for the aws-core and aws-agents plugins. The CloudWatch and CloudTrail claim maps to the metadata bus of mcp-proxy-for-aws@1.6.3, which tags every request with INSTALL_SOURCE=agent-toolkit so AWS's backend can meter it. The end-to-end evaluation claim maps to the fact that the toolkit's 107-skill library is bounded at 107 — the number is a budget, not an accident, because the team can only end-to-end-evaluate so many


2. E01_The_Plugin_Surface

The Plugin Surface

The toolkit is four plugins. It is also 107 skills. Those two numbers are not the same kind of thing — and confusing them is the single most common mistake readers make on first contact.

Key Takeaways

The repository ships four installable plugins and 107 SKILL.md files — those are different kinds of artifact. Plugins are packaging; skills are the work.
Every plugin ships three manifest files (Claude Code, Codex, Cursor) plus a fourth marketplace shape under .agents/, because the agent hosts have not converged on a single schema.
The 107 skills split into 14 core-skills + 50 specialized-skills at the top level, plus 15+7+8+13 bundled inside the four plugins — duplicated deliberately, not by accident.
The validator at tools/validate.py enforces kebab-case skill names, directory-name matching, description length, and JSON manifest schema; every CI run gates on it.
The plugin boundaries map to four kinds of work: operate (aws-core), build agents (aws-agents), analyze data (aws-data-analytics), investigate incidents (aws-agents-for-devsecops) — pick by primary use case.
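The checks attributed to tools/validate.py in the takeaways above can be pictured with a small sketch. This is not the repository's validator. The function name `check_skill`, the `name` and `description` metadata fields, and the 1024-character limit are all assumptions chosen for illustration, since the text does not state the real bound or schema.

```python
import re

# Assumed rule: lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
# Hypothetical limit; the source does not state the real bound.
MAX_DESCRIPTION_CHARS = 1024


def check_skill(dir_name: str, metadata: dict) -> list[str]:
    """Return a list of human-readable problems for one skill (empty if clean)."""
    problems = []
    name = metadata.get("name", "")
    description = metadata.get("description", "")
    if not KEBAB_CASE.match(name):
        problems.append(f"name {name!r} is not kebab-case")
    if name != dir_name:
        problems.append(f"name {name!r} does not match directory {dir_name!r}")
    if not description:
        problems.append("description is missing")
    elif len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description is too long")
    return problems
```

Returning a list of problems rather than raising on the first one lets a CI job report every violation in a single run, which is the behavior a gate over 107 files would plausibly want.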

There are 107 SKILL.md files in the aws/agent-toolkit-for-aws repository. I counted them: 14 in skills/core-skills/, 50 in skills/specialized-skills/ spread across 10 service categories, 15 inside the aws-core plugin, 7 inside aws-agents, 8 inside aws-data-analytics, and 13 inside aws-agents-for-devsecops. I also counted something the README does not advertise: there are exactly four installable plugins. The ratio of skills to plugins is roughly 27 to 1, and that ratio is the most important fact in the entire codebase, because it tells you what the project is for.
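The arithmetic behind those figures is easy to check. The sketch below only re-adds the counts quoted in the paragraph above; it does not walk the repository, and the dictionary keys are labels chosen here for readability.

```python
# Counts as reported in the text; they are not re-derived from the repository.
SKILL_COUNTS = {
    "skills/core-skills": 14,
    "skills/specialized-skills": 50,
    "aws-core": 15,
    "aws-agents": 7,
    "aws-data-analytics": 8,
    "aws-agents-for-devsecops": 13,
}
PLUGINS = ["aws-core", "aws-agents", "aws-data-analytics", "aws-agents-for-devsecops"]

total_skills = sum(SKILL_COUNTS.values())
bundled = sum(SKILL_COUNTS[p] for p in PLUGINS)   # skills shipped inside plugins
top_level = total_skills - bundled                # skills under skills/
ratio = total_skills / len(PLUGINS)

print(total_skills, top_level, bundled, ratio)  # 107 64 43 26.75
```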

The plugins are packaging. The skills are the work.

I assumed, walking into the repository, that the plugins were the product. They have version numbers (aws-core is 1.1.0; the other three are 1.0.0). They have manifest files. They have README.md documents with marketing language ("Start here." "Investigate incidents, review code and execute UAT for release readiness, scan code for vulnerabilities, and run penetration tests with AWS DevOps Agent and AWS Security Agent."). They look like the kind of artifact that gets shipped. Counting the skills changed my mind. Plugins are what an install command grabs; skills are what the agent actually reads when a developer asks it to do something. The repository's center of gravity is the 107 Markdown files, not the four plugin folders.

This chapter maps the surface so you can see the same thing I saw.

The four plugins

The top-level marketplace.json files (one for Claude Code at .claude-plugin/marketplace.json, one for Codex at .agents/plugins/marketplace.json, one for Cursor at .cursor-plugin/marketplace.json) all enumerate the same four plugins, in the same order:

1. aws-core (v1.1.0): "Build, deploy, and operate applications on AWS." The general-purpose bundle. Service selection, CDK/CloudFormation, serverless, containers, storage, observability, billing, SDK usage, deployment. The README is unambiguous: start here.
2. aws-agents (v1.0.0): "Build, deploy, and operate AI agents on AWS." Skills for scaffolding agents with Amazon Bedrock AgentCore (Strands, LangGraph), connecting tools via Gateway and MCP, multi-agent and A2A orchestration, memory, Cedar policies, evaluation, observability, and production hardening.
3. aws-data-analytics (v1.0.0): "Data lake, analytics, and ETL workflows with S3 Tables, AWS Glue, and Athena." Managed Iceberg tables on S3 Tables, ingestion from JDBC databases, Amazon Redshift, Snowflake, BigQuery, federated Athena queries, vector storage on S3 Vectors, Amazon OpenSearch Service.
4. aws-agents-for-devsecops (v1.0.0): "Investigate incidents, review code and execute UAT for release readiness, scan code for vulnerabilities, and run penetration tests with AWS DevOps Agent and AWS Security Agent." Note: this is the only plugin that ships a commands/ directory (9 slash commands) and an examples/ directory. The other three are pure skill bundles.
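The claim that all three marketplace files list the same four plugins in the same order is the kind of invariant a short script can check. The sketch below assumes each marketplace document has a top-level "plugins" array of objects with a "name" field; that shape is a guess, and the real schemas may differ per host. The sample documents are stand-ins built in memory, not the repository's files.

```python
import json

EXPECTED_ORDER = [
    "aws-core",
    "aws-agents",
    "aws-data-analytics",
    "aws-agents-for-devsecops",
]


def plugin_names(marketplace_text: str) -> list[str]:
    """Extract plugin names in file order from one marketplace document.

    Assumes a top-level "plugins" array of objects with a "name" field.
    """
    document = json.loads(marketplace_text)
    return [entry["name"] for entry in document.get("plugins", [])]


def same_listing(marketplace_texts: dict[str, str]) -> bool:
    """True when every host's marketplace lists the expected plugins in order."""
    return all(plugin_names(text) == EXPECTED_ORDER for text in marketplace_texts.values())


# Stand-in documents; in practice these would be read from the three paths.
sample = json.dumps({"plugins": [{"name": n} for n in EXPECTED_ORDER]})
hosts = {
    ".claude-plugin/marketplace.json": sample,
    ".agents/plugins/marketplace.json": sample,
    ".cursor-plugin/marketplace.json": sample,
}
print(same_listing(hosts))  # True
```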

graph TB
  M[Agent Toolkit for AWS<br/>marketplace.json]
  M --> P1[aws-core v1.1.0]
  M --> P2[aws-agents v1.0.0]
  M --> P3[aws-data-analytics v1.0.0]
  M --> P4[aws-agents-for-devsecops v1.0.0]
  P1 --> S1[15 skills]
  P1 --> S2[hooks/secret-safety.py]
  P1 --> S3[.mcp.json: aws-mcp proxy]
  P2 --> S4[7 lifecycle skills]
  P2 --> S5[.mcp.json: knowledge endpoint]
  P3 --> S6[8 analytics skills]
  P3 --> S7[.mcp.json: aws-mcp proxy]
  P4 --> S8[13 devsecops skills]
  P4 --> S9[9 slash commands]
  P4 --> S10[examples/multi-space-walkthrough]
  P4 --> S11[.mcp.json: DevOps Agent endpoint]
  R1[Top-level skills/]
  R1 --> R2[14 core-skills]
  R1 --> R3[50 specialized-skills<br/>across 10 categories]


3. E02_One_Server_Four_Configurations

One Server, Four Configurations

If the AWS MCP Server is the same in every plugin, why are the four .mcp.json files different? That difference is the most under-read evidence in the repository — and it tells you exactly what AWS thinks its agent is for.

Key Takeaways

There is one managed AWS MCP Server, accessed through mcp-proxy-for-aws@1.6.3 — a Python proxy invoked by uvx against https://aws-mcp.<region>.api.aws/mcp. The four .mcp.json files are not four servers; they are four postures.
The proxy is a client, a credential broker, and a metadata bus in one binary. The agent never sees the developer's AWS credentials; the proxy injects them at transport time.
The version pin (@1.6.3, exact) is itself a security decision — pinned to make the supply chain auditable and the install reproducible.
The four configurations are: aws-core (general operator, full IAM), aws-agents (documentation only, no auth), aws-data-analytics (same as aws-core, domain-specific skills), aws-agents-for-devsecops (region-routed, Bearer-token, async jobs).
Pin your proxy. Always. A floating version of mcp-proxy-for-aws is a floating trust boundary.

Open plugins/aws-core/.mcp.json. The whole file is this:

{
  "mcpServers": {
    "aws-mcp": {
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@1.6.3",
        "https://aws-mcp.us-east-1.api.aws/mcp",
        "--skip-auth",
        "--metadata",
        "INSTALL_SOURCE=agent-toolkit"
      ]
    }
  }
}

That is the AWS MCP Server, for the aws-core plugin, in fourteen lines of JSON. It is a uvx invocation (uvx is the tool runner that ships with uv, Astral's Python package manager) of mcp-proxy-for-aws@1.6.3, a proxy that wraps a remote HTTPS endpoint at aws-mcp.us-east-1.api.aws. Two flags: --skip-auth (the proxy will sign requests on the developer's behalf using the local AWS credential chain) and --metadata (a key-value pair the proxy attaches to every request, here tagging it with INSTALL_SOURCE=agent-toolkit so AWS's servers can see which distribution the developer came from). No secrets in the file. No tokens. No embedded credentials. The local aws CLI credential chain does the heavy lifting at call time.
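To make the anatomy of that file concrete, here is a minimal sketch that reads the same JSON and pulls out the three things the paragraph describes: the pinned package, the endpoint, and the metadata pair. The helper name `summarize` and the exact-pin regular expression are illustrative assumptions, not part of the toolkit or the proxy.

```python
import json
import re

MCP_JSON = """
{
  "mcpServers": {
    "aws-mcp": {
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@1.6.3",
        "https://aws-mcp.us-east-1.api.aws/mcp",
        "--skip-auth",
        "--metadata",
        "INSTALL_SOURCE=agent-toolkit"
      ]
    }
  }
}
"""

# An exact pin looks like name@MAJOR.MINOR.PATCH with nothing else attached.
EXACT_PIN = re.compile(r"^(?P<package>[A-Za-z0-9._-]+)@(?P<version>\d+\.\d+\.\d+)$")


def summarize(config_text: str, server: str = "aws-mcp") -> dict:
    """Pull the package pin, endpoint URL, and metadata pairs out of one server entry."""
    args = json.loads(config_text)["mcpServers"][server]["args"]
    pin = EXACT_PIN.match(args[0])
    metadata = {}
    # Pair each argument with its successor to find "--metadata KEY=VALUE".
    for flag, value in zip(args, args[1:]):
        if flag == "--metadata" and "=" in value:
            key, _, val = value.partition("=")
            metadata[key] = val
    return {
        "package": pin.group("package") if pin else None,
        "version": pin.group("version") if pin else None,
        "endpoint": next((a for a in args if a.startswith("https://")), None),
        "metadata": metadata,
    }


print(summarize(MCP_JSON))
```

A check like this could run in CI to fail the build whenever the first argument stops being an exact pin, which is one way to keep the trust boundary from floating.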

Now open plugins/aws-agents/.mcp.json. It is different in two ways: the endpoint changes, and the auth model changes. The endpoint is the AWS Knowledge MCP server, a regionless HTTPS service that does not require AWS credentials because it serves public documentation. The plugin uses an HTTP transport, not stdio, and it has no command to launch — the agent host connects directly to the URL.

Now plugins/aws-data-analytics/.mcp.json. It is identical in shape to aws-core: a uvx invocation of mcp-proxy-for-aws@1.6.3 against the same aws-mcp.us-east-1.api.aws endpoint. Same proxy, same version, same region. The metadata tag is the only thing that varies.

Now plugins/aws-agents-for-devsecops/.mcp.json. Different again: a Bearer-token auth model, a region-routed endpoint at https://connect.aidevops.${DEVOPS_AGENT_REGION:-us-east-1}.api.aws/mcp, and a separate SigV4 fallback. The varying input here is DEVOPS_AGENT_REGION, which is read from the environment and defaults to us-east-1.
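The `${DEVOPS_AGENT_REGION:-us-east-1}` placeholder follows the shell convention of "use the variable, or this default when it is unset or empty". Whether every agent host expands it with exactly those semantics is an assumption here. A minimal sketch of the same resolution, with `devops_agent_endpoint` as a hypothetical helper name:

```python
import os

DEFAULT_REGION = "us-east-1"


def devops_agent_endpoint(environ=None) -> str:
    """Mimic ${DEVOPS_AGENT_REGION:-us-east-1}: fall back when unset or empty."""
    env = os.environ if environ is None else environ
    # "or" covers both a missing key and an empty string, like the shell's ":-".
    region = env.get("DEVOPS_AGENT_REGION") or DEFAULT_REGION
    return f"https://connect.aidevops.{region}.api.aws/mcp"


print(devops_agent_endpoint({"DEVOPS_AGENT_REGION": "eu-west-1"}))
# https://connect.aidevops.eu-west-1.api.aws/mcp
```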

That is four configurations. The honest first reaction is that AWS has built a different MCP server for each plugin. That would be the obvious reading. It is also wrong. The four configurations are four *views* of the same architectural pattern, and the pattern is what matters.

The shape of the architecture

The proxy at mcp-proxy-for-aws@1.6.3 is doing three jobs simultaneously, and the four configurations show all three:

1. It is a client. The proxy speaks MCP to the local agent host (Claude Code, Cursor, Codex) and converts those calls into HTTPS requests to the AWS regional endpoint.
2. It is a credential broker. When the agent calls s3:ListBuckets, the proxy reads the local AWS credential chain (environment variables, ~/.aws/credentials, SSO, instance metadata) and signs the request with SigV4. The agent never sees the credentials; the proxy injects them at transport time.
3. It is a metadata bus. The --metadata flag attaches arbitrary key-value pairs to every request, which AWS's backend uses for routing, attribution, and audit. INSTALL_SOURCE=agent-toolkit tells AWS that this request came from the toolkit, not from AWS Labs MCP servers and not from a custom integration.
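The credential-broker role rests on SigV4, whose core is a chain of HMAC-SHA256 operations that turns a secret key into a short-lived, scope-bound signing key. The sketch below shows only that derivation step using the standard library. It is not the proxy's code, it omits canonical-request construction, and the service name "aws-mcp" and the secret are placeholders. Real clients would normally rely on botocore for signing.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a SigV4 signing key from a secret key and the credential scope."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Final step: hex-encoded HMAC of the string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder values only; "aws-mcp" as the service name is a guess.
key = derive_signing_key("EXAMPLE-SECRET", "20250201", "us-east-1", "aws-mcp")
print(len(key))  # 32
```

Because the date, region, and service are folded into the key, a signature produced for one scope is useless in another, which is part of why the agent host never needs to hold the raw secret.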

The version pin (@1.6.3, exact) is itself an architectural decision. Most Python tools are invoked wi


Premium chapters

4. E03_The_PreToolUse_Hook


5. E04_Skills_Migration_and_What_Comes_Next


6. README
