16 APR 2026·8 min read·#ai #systems #agents #mcp #protocols #security

The Autonomy Ladder

Google's A2A protocol and MCP are live under shared governance. The protocol layer is solved. The trust layer that determines whether agents can actually discover and rely on each other is not.

Google launched Agent2Agent at Cloud Next in April 2025, and eight months later, Google, Microsoft, AWS, Cloudflare, and Bloomberg joined Anthropic, Block, and OpenAI in founding the Linux Foundation's Agentic AI Foundation [1], which now governs both MCP and the surrounding protocol ecosystem [2]. What MCP did for the connection between an agent and a tool, A2A does for the connection between one agent and another, standardising how they find each other, advertise what they can do, and negotiate work without a human pre-wiring the connections.

I'll be clear that neither protocol delivers full autonomy yet. MCP hands an agent a menu of tools that still had to be written, and A2A's Agent Cards are still published by humans at known URLs, so the discovery is bounded rather than open-ended. But the direction of travel is unmistakable. Traditional programming was explicit instruction, where a developer specified which APIs to call in which order. MCP moved us toward declared capability, where tools expose themselves and the agent picks from the menu. A2A extends that toward autonomous discovery, where the menu itself becomes something the agent finds at runtime.

The destination is the one I wrote about in What OpenClaw Actually Proved: Nvidia's CEO ranked a community project alongside Linux because users had been telling us for years that what they actually want is AI with tool access and execution rights, not a better chat interface. The industry spent three years optimising for the wrong thing. MCP and A2A are infrastructures that catch up with what people were already doing by hand. Each step moves more composition from the human to the system, and each step expands what we mean by "autonomous" in practice, even if the word is still doing more work than the protocols technically support.

What's actually different about A2A

The mechanism worth understanding is the Agent Card. An A2A-compliant agent publishes a small JSON document at a well-known URL, typically /.well-known/agent.json, that describes who it is, what it can do, how to reach it, and what authentication it expects [3]. Another agent that wants to accomplish a task can fetch this card, parse the capabilities, and start negotiating work without a developer ever having written code that explicitly names the remote agent. That's the mechanically new thing.

That said, the pattern itself is not new. Well-known URLs have been a web standard since 2010 [4], OpenAPI documents have let humans describe HTTP APIs for a decade, and service discovery is a solved problem in every serious distributed system. What A2A does is collapse the abstraction level. Instead of a developer reading a capability document and writing integration code, an agent reads it and acts on it in the same session. The human layer between "what this service can do" and "let's use it" is gone.

This is a smaller change than the hype around it suggests, and a larger change than a sceptic reading the spec might assume. It's smaller because the protocols and patterns are mostly assembled from existing pieces. It's larger because the composition model it enables is genuinely different from what came before. A developer building a workflow with twenty integrations used to write twenty bits of integration code. In an A2A world, they describe the workflow's intent, and the agent composes the twenty integrations at runtime based on what it finds. The leverage is in the composition, not the connection.

The autonomy ladder

If the leverage is in composition, the question becomes how much composition the system can actually do on its own, and the honest answer has levels.

The first rung is explicit instruction, and it is the world's most software-intensive environment. A developer writes the integration code, specifies which service to call, maps the inputs and outputs, and the agent or application executes the call exactly as written. There is no choice involved on the agent's part, only compliance. This is how we built software for decades, and it works well when the problem space is known in advance, which is most of the time.

The second rung is declared capability with bounded choice, and this is where MCP and the current A2A specification sit together. The agent is no longer told exactly which service to call. Instead, it presents a set of tools or agents that have described their own capabilities, and selects from that set based on the task at hand. The selection is real, the reasoning is real, but the menu was still curated by a human. Someone decided which MCP servers to connect, someone published the Agent Cards at known URLs, and someone configured the trust boundaries. The agent has latitude within a space defined by a person. This is the same shift I wrote about in Queryable Thinking applied to services rather than content: the consumer of the information stops following a script and starts reasoning about its options, which changes what the producer has to offer.

The third rung is open discovery with goal-directed selection, and it is important to say plainly that this rung has not shipped anywhere yet. In this model, an agent forms a goal, searches the open web for other agents that might help accomplish it, evaluates their published capabilities against its own requirements, negotiates terms of engagement, and composes a multi-agent workflow from what it finds. The agent is no longer choosing from a menu. It is finding restaurants, reading their menus, and booking a table, all without a human having pointed it in the right direction.

We are at the second rung. The more interesting question is what the jump from the second to the third actually requires, because the gap is not primarily a protocol gap. The protocols assume the existence of trust infrastructure, reputation systems, payment rails, and capability verification mechanisms that do not yet exist. A2A gives you a way to describe what an agent can do. It does not give you a way to confirm that the description is accurate, that the agent will behave reliably, or that the entity behind it is who they claim to be.

The trust gap

Consider a concrete scenario: an agent is planning a multi-step workflow and discovers, via a well-known URL, an Agent Card advertising exactly the capability it needs. The card is well-formed, the authentication fields are populated, and the described skills match the task. The agent has no way to know whether the service behind that card is a competent implementation, a broken deployment that will silently corrupt data, or a prompt-injection honeypot designed to hijack the requesting agent's context and redirect its behaviour.

The web solved a version of this problem over two decades. DNS provided reliable name resolution, SSL and its successor TLS provided encrypted transport, and certificate authorities provided a chain of trust that lets a browser verify that the server it is talking to is actually controlled by the entity that owns the domain [5]. The system is imperfect and has been exploited many times, but it provides a baseline layer of identity verification that the entire economy now depends on.

A2A includes authentication fields in Agent Cards, allowing a requesting agent to verify that the serving agent controls the identity in question. That is the equivalent of checking a TLS certificate. What A2A does not have, and what no agent protocol currently provides, is a reputation or verification layer. You can confirm who an agent claims to be. You cannot confirm whether it is competent, whether it has been compromised, or whether its outputs are trustworthy. The identity layer exists, and the competence layer does not.

Prompt injection makes this gap especially dangerous in practice, because a malicious agent does not need to break the protocol to cause harm. It only needs to return content that, when processed by the requesting agent's language model, alters the model's behaviour in ways the requesting agent's operator did not intend [6]. The attack surface is the natural language interface itself, which is also the interface that makes agent-to-agent communication flexible and powerful in the first place.

Who builds the trust layer

The protocol layer now has a clear governance home. The Agentic AI Foundation, founded in December 2025 under the Linux Foundation, was co-founded by Anthropic, Block, and OpenAI, with founding support from Google, Microsoft, AWS, Cloudflare, and Bloomberg [2]. That is a striking alignment. The companies that would usually compete on proprietary standards have agreed that the protocol layer should be neutral, and they have put MCP, goose, and AGENTS.md under shared governance to prove it.

The trust layer has no equivalent. DNS, TLS, and certificate authorities each emerged from specific institutional arrangements: IETF standards, commercial CAs operating under WebTrust audits, browser vendors enforcing trust roots. Each of those mechanisms took years to build and decades to harden, and each was contested along the way. For agent protocols, no equivalents exist yet. There is no reputation registry for agents, no competence certification body, and no standard way to report that a particular Agent Card is misbehaving or compromised. The open question is whether this infrastructure gets built inside the Agentic AI Foundation, emerges from commercial vendors competing on reliability as a product, or comes from somewhere no one has started yet.

My honest view is that this is the most valuable unclaimed territory in the current AI infrastructure stack. Identity is solved. Protocols are solved. Trust and reputation are not, and the organisations that figure out how to build them in an agent-native way will find themselves in the same position as the certificate authorities occupied in the early commercial web.

The protocol wars everyone predicted have not happened. The industry's major competitors co-founded the foundation that governs the standards, and the interesting work has already moved elsewhere. The shift from instruction to discovery is real, the second rung of the autonomy ladder is live in production, and the third rung is gated by infrastructure nobody has built yet. The trust layer is where the next five years of useful work sit. If you are looking at the agent space and trying to work out where the leverage actually is, that is the answer.

References

Series:

Sources:

Google, "Announcing the Agent2Agent Protocol (A2A)," Google Cloud Blog, April 2025.
Linux Foundation, "Linux Foundation Announces the Formation of the Agentic AI Foundation (AAIF)," December 9, 2025.
Google, "Agent2Agent Protocol Specification: Agent Card," github.com/google/A2A.
IETF, "RFC 5785: Defining Well-Known Uniform Resource Identifiers," 2010.
IETF, "RFC 8446: The Transport Layer Security (TLS) Protocol Version 1.3," 2018.
Greshake et al., "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection," 2023.