Agent2Agent: A practical guide to building agents

Google introduced the Agent2Agent (A2A) Protocol, which is already making waves in the AI space. Real-world use cases, from travel planning to enterprise automation, are popping up fast.
Over 50 tech partners, including MongoDB, Atlassian, SAP, PayPal, and Cohere, have already adopted it, highlighting just how impactful A2A is becoming.
As an AI enthusiast, I spent the weekend exploring A2A, and in this article, I’ll break down what I’ve learned.
To keep things simple, I’ve split it into two parts:
- Theory – What A2A is, how it works, and why it matters.
- Practical – How to build with A2A and integrate MCP to seamlessly connect tools and data sources.
So, let’s begin.
What is A2A?

A2A stands for Agent-to-Agent protocol. It is designed to enable different, specialised AI agents to communicate directly with each other, delegate tasks, and work together as a team.
For example, it allows a primary agent (like a personal assistant) to act as a project manager, coordinating a team of specialist agents.
This solves the problem of current AI agents working in isolation and opens up new ways and possibilities to build complex multi-agent systems.
It is built on five key principles (from the docs themselves):
1. Simple: Reuses existing standards (HTTP, JSON-RPC, SSE, push notifications).
2. Enterprise Ready: Built-in support for auth, security, privacy, tracing, and monitoring.
3. Async First: Can handle (very) long-running tasks while providing meaningful updates.
4. Modality Agnostic: Supports a wide range of modalities: text, audio/video, forms, iframes, etc.
5. Opaque Execution: Agents do not have to share thoughts, plans, or tools.
Think of it as creating a standard way for AI agents to introduce themselves, share what they can do, and work together on tasks.
Now, let’s look at the components that make up A2A.
Components of A2A Protocol
Before understanding how the A2A works, it’s important to know the core components that power it.

These are the core components of A2A:
- Client-Server Model: A2A follows a client-server architecture, where the client (agent) asks for a task to be done and the server (a specialised agent or tool) does the task. Roles can change during the task flow, however.
- Agent Cards: An Agent Card is a JSON file that acts as an agent's profile. It lists the agent's ID, name, job, type, security details, MCP support, and much more. It is what clients use to discover agents.
- Task: A Task is the main unit of work and moves through clear stages: submitted, working, input-required, completed, failed, or cancelled. This helps to manage progress and workflow.
- Message Structure: Inside a task, agents talk using messages. A message contains parts that hold the actual (multimodal) content.
- Artefacts: The output of a task is delivered through artefacts: structured results that ensure the final output is consistent and easy to use.
💡Note: For simplicity, I have only covered the essentials. A detailed dive can be found [here](https://composio.dev/blog/mcp-vs-a2a-everything-you-need-to-know/)
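To make these components concrete, here is a rough sketch of what a single task, carrying one message and one artefact, might look like on the wire. The field names approximate the A2A spec's general shape; treat this as illustrative, not canonical:

```json
{
  "id": "task-42",
  "status": { "state": "working" },
  "messages": [
    {
      "role": "user",
      "parts": [{ "type": "text", "text": "Find affordable flights to Tokyo" }]
    }
  ],
  "artifacts": [
    {
      "name": "flight-options",
      "parts": [{ "type": "text", "text": "1. Airline A, $780..." }]
    }
  ]
}
```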
With the fundamentals of the core components under our belt, let's look at how it all ties together.
How A2A Protocol Works

When a task is assigned to an agent, it goes through the following stages:
Agent Discovery
- Each specialist agent publishes an Agent Card (like a resume).
- The card includes capabilities (e.g., "travel_planning", "budget_analysis").
- A requesting agent uses these cards to discover the right specialists.
Task Delegation
- The requesting agent assigns tasks to the selected specialists.
- Tasks are expressed in natural language, allowing flexibility. Example: "Find affordable flights and hotel options."
- Specialist agents interpret and act on these high-level requests using their own intelligence.
Task Processing (Multi-Turn)
- Tasks go through a lifecycle: submitted → working → intermediate updates → completed/failed.
- The requesting agent can receive acknowledgements, track progress, get partial results, and continuously monitor task updates.
Completion & Delivery
- Once all tasks are done, the requesting agent collects and merges the results (artefacts).
- The final output is a coherent, combined solution (e.g., a full vacation itinerary).
- The agent may also refine or post-process the gathered data for presentation or further use.
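If you prefer code to prose, here is a minimal sketch of those four stages from the requesting agent's side, using plain HTTP and JSON-RPC (the transports A2A builds on). The endpoint, the `tasks/send` method name, and the payload shape follow the general pattern of the A2A spec but are simplified here; treat them as illustrative:

```python
import requests

AGENT_URL = "http://localhost:8000"  # hypothetical specialist agent endpoint

# 1. Agent Discovery: fetch the specialist's Agent Card
card = requests.get(f"{AGENT_URL}/.well-known/agent.json").json()
print("Discovered skills:", [s["id"] for s in card.get("skills", [])])

# 2. Task Delegation: express the task in natural language
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "id": "task-1",
        "message": {
            "role": "user",
            "parts": [{"type": "text", "text": "Find affordable flights and hotel options."}],
        },
    },
}

# 3. Task Processing: the specialist works on the task and reports its state
task = requests.post(AGENT_URL, json=payload).json()["result"]
print("Task state:", task["status"]["state"])

# 4. Completion & Delivery: collect the artefacts (structured results)
for artifact in task.get("artifacts", []):
    print("Artifact:", artifact)
```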
With seamless collaboration between multiple agents, complex workflows are possible, but these multi-agent systems often get stuck on tool mismatches, context loss, and misaligned goals.
To tackle these problems, MCP comes to the rescue.
A2A + MCP – A Great Match
MCP (Model Context Protocol) is a new standard protocol that allows agents to connect with custom tools, APIs, and data sources.
By adding an MCP layer, you can keep everything in sync and ensure smooth collaboration between agents by handling:
- Context Loss: MCP shares key background info so all agents know what's happening.
- Tool Mismatch: MCP tells agents what tools are available and how to use them (see the wire-level sketch after this list).
- Misaligned Goals: MCP helps agents understand the task's purpose so they work toward the same goal.
- Data Confusion: MCP standardises how agents share and read data to avoid miscommunication.
- Handoffs: MCP ensures that the next agent picks up where the last one left off, with all the right information.
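As a taste of how MCP fixes the tool-mismatch problem, here is roughly what tool discovery looks like on the wire. The exchange uses MCP's JSON-RPC `tools/list` method; the flight-search tool itself is made up for this example. The client sends:

```json
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }
```

And receives an (abridged) response telling the agent exactly what it can call and with which arguments:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "search_flights",
        "description": "Search for flights between two cities",
        "inputSchema": {
          "type": "object",
          "properties": {
            "origin": { "type": "string" },
            "destination": { "type": "string" }
          }
        }
      }
    ]
  }
}
```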
Here is an example of what a trip-planning task flow looks like with MCP.
Example

The user assigns a trip-planning task, and the master agent looks through agent cards for specialised agents, say Travel, Budget, and Local Guide.
Each agent can then be given a set of MCP-managed tools to complete its task.
For example (with no intermediaries, as in the chart):
- A Travel Agent can use MCP to search flights and hotels using flight and hotel search tools.
- A Budget Agent might use an MCP-linked calculator tool.
- A Local Guide can access an activities database (a data source).
Finally, once all the agents have returned results, the master agent combines them and generates the final response in a structured format.
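To make that division of labour concrete, here is a toy orchestration sketch. Everything in it is hypothetical (the endpoints and the `delegate` helper exist only for this example); it simply shows A2A deciding who does what while MCP-backed tools do the actual doing:

```python
# Hypothetical sketch: A2A routes subtasks; MCP-backed tools do the work.
SPECIALISTS = {
    "travel_planning": "http://localhost:8001",  # Travel Agent (flight/hotel search tools)
    "budget_analysis": "http://localhost:8002",  # Budget Agent (calculator tool)
    "local_guide":     "http://localhost:8003",  # Local Guide (activities database)
}

def delegate(skill: str, request: str) -> dict:
    """Send a natural-language subtask to the specialist advertising `skill`."""
    url = SPECIALISTS[skill]
    # In a real system this would be an A2A tasks/send call, as sketched earlier.
    return {"skill": skill, "result": f"(response from {url} for: {request})"}

subtasks = {
    "travel_planning": "Find affordable flights and hotels for Tokyo, May 10-17.",
    "budget_analysis": "Estimate the total trip cost and flag anything over $2000.",
    "local_guide": "Suggest five activities near Shibuya.",
}

# The master agent fans the subtasks out, then merges the returned artefacts.
itinerary = [delegate(skill, task) for skill, task in subtasks.items()]
print(itinerary)
```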
In this setting, A2A coordinates who does what, while MCP ensures each agent can actually do it using the right tools and stays on track—a key difference between A2A and MCP.
I hope the context is clear. Now, let's move on to the second part of the blog: building with the A2A protocol.
Building with A2A Protocol using Composio, Gemini & Anthropic
In my experience, building tools and MCP servers is time-consuming and often error-prone (especially the setup part). If you followed my earlier blog on Building MCP Server From Scratch, you might know it too.
So, to save time, I will use Composio and Anthropic, which host a directory of more than 200 predefined MCP servers with tool integrations to speed up the process.
Now let’s get building
Problem Statement
For this demo, let's build a browser automation agent that leverages the Puppeteer MCP server to browse the web and perform actions autonomously.
Workflow
Sample Project Workflow

If this feels complicated, things will clarify as we move along.
Let’s set up the workspace.
Workspace Setup
Open a terminal and run the given commands one by one:

```bash
python -m venv a2a
a2a\Scripts\activate
pip install google-adk
code .
```

These commands create the environment `a2a`, activate it (on Windows; use `source a2a/bin/activate` on macOS/Linux), install Google's Agent Development Kit (ADK), and open the workspace in VS Code (optional).
The Google Agent Development Kit package will be used to connect A2A agents to MCP servers.
Inside the workspace, create a new file `.env` and populate it with the following values:

```
GOOGLE_API_KEY=your-api-key
GEMINI_API_KEY=your-api-key
```
You can get your API key from Google AI Studio, though you might need to log in first!
Time to set up MCP Servers.
Set Up MCP Server (stdio/SSE)
1. Head to the MCP Server Repository and open the Puppeteer MCP Server page. From here, we will use the installer shortcut. I will use npx; feel free to use Docker if you like.
2. In the page readme, scroll down and find the section titled "Usage with VS Code". Click on one-click install and grant the required permission. This installs the Puppeteer MCP server; you can check `settings.json` for verification.
3. For more granular control over server installation, refer to the readme.
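After the one-click install, the Puppeteer entry in `settings.json` should look roughly like this (the exact key layout can vary with your VS Code version, so treat it as approximate):

```json
{
  "mcp": {
    "servers": {
      "puppeteer": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
      }
    }
  }
}
```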
With this, the MCP setup is done. Time to write the actual code.
App Code
Create a file called `automation_agent.py` in the root.
First, import the necessary modules and load the environment variables defined in the `.env` file:
```python
# automation_agent.py
import asyncio
from dotenv import load_dotenv
from google.genai import types
from google.adk.agents.llm_agent import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService  # Optional
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseServerParams, StdioServerParameters

# Load environment variables from the .env file in the current directory
load_dotenv('.env')
```
Next, configure the MCP server (stdio-based) and fetch its tools:
```python
# Import tools from the MCP Server - Puppeteer
async def get_tools_async():
    """Gets tools from the Puppeteer MCP Server."""
    print("Attempting to connect to Puppeteer MCP server...")
    tools, exit_stack = await MCPToolset.from_server(
        connection_params=StdioServerParameters(
            command='npx',
            args=["-y", "@modelcontextprotocol/server-puppeteer"],
        )
    )
    print("MCP Toolset created successfully.")
    return tools, exit_stack
```
Focus on the `connection_params`: it's the same configuration as in `settings.json`, and it is crucial for fetching the tools.
For an SSE-based server, use this code 👇
```python
async def get_tools_async():
    """Gets tools from the MCP Server."""
    print("Attempting to connect to MCP server...")
    tools, exit_stack = await MCPToolset.from_server(
        connection_params=SseServerParams(url="mcp-sse-server-url")
    )
    print("MCP Toolset created successfully.")
    return tools, exit_stack
```
Make sure to replace `mcp-sse-server-url` with an actual SSE URL, e.g. https://mcp.composio.dev/gmail/tinkling-faint-car-f6g1zk (a Composio Gmail MCP server over SSE).
Next, create an agent:

```python
async def get_agent_async():
    """Creates an ADK Agent equipped with tools from the MCP Server."""
    tools, exit_stack = await get_tools_async()
    print(f"Fetched {len(tools)} tools from MCP server.")
    root_agent = LlmAgent(
        model='gemini-2.0-flash',  # Adjust if needed
        name='web_search_assistant',
        instruction=(
            "You are a web automation assistant. Use browser tools to navigate pages, click, and extract data. "
            "Always try to solve user queries by using the tools instead of answering directly. "
            "If the user asks for news or information from a website, open that site and extract the relevant parts."
        ),
        tools=tools,
    )
    # Uncomment to see the list of tools - for verifiability
    # for tool in tools:
    #     print(f"Tool: {tool.name} - {tool.description}")
    return root_agent, exit_stack
```
Ensure the `instruction` prompt is detailed and highlights the use case: the better the input, the better the tool navigation.
Finally, add the main logic:
# Define the main function to run the agent
async def async_main():
session_service = InMemorySessionService()
artifacts_service = InMemoryArtifactService() # Optional
session = session_service.create_session(
state={}, app_name='mcp_web_search_app', user_id='web_search'
)
# Define reliable query for best results
query = (
"Go to https://news.google.com and extract the top 5 headlines from the homepage. "
"Use your browser automation tools to navigate and extract the text."
)
print(f"User Query: '{query}'")
content = types.Content(role='user', parts=[types.Part(text=query)])
root_agent, exit_stack = await get_agent_async()
runner = Runner(
app_name='mcp_web_search_app',
agent=root_agent,
artifact_service=artifacts_service, # Optional
session_service=session_service,
)
print("Running agent...")
events_async = runner.run_async(
session_id=session.id, user_id=session.user_id, new_message=content
)
async for event in events_async:
print(f"Event received: {event}")
print("Closing MCP server connection...")
await exit_stack.aclose()
print("Cleanup complete.")
if __name__ == '__main__':
try:
asyncio.run(async_main())
except Exception as e:
print(f"An error occurred: {e}")
The above script runs an AI agent that takes a query to "extract the top 5 Google News headlines", processes it asynchronously, prints the agent's responses, and then cleans up the session. The `__main__` block handles errors and runs the program.
Make sure to state your exact requirements in the `query` parameter (or take user input, as shown below) for better results.
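If you would rather take the query from the user at runtime instead of hard-coding it, a one-line swap inside `async_main` does the trick:

```python
# Optional tweak: read the task from the user instead of hard-coding it
query = input("What should the agent do? ").strip()
```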
But how does this all get executed? Agent Cards!
In the Google A2A ecosystem, an Agent Card is like a profile or manifest for the agent. It tells the ADK runner about the agent's capabilities: skills, endpoint URL, and other relevant information.
All agent cards are accessible at `url/.well-known/agent.json`.
So any developer who wants to expose an agent's functionality needs to follow the Agent Card format.
Luckily for us, Google ADK does this in the background through LlmAgent. So, in this case, the agent card might look like:
```json
{
  "name": "WebSearchAgent",
  "description": "An agent that performs web searches and extracts information.",
  "url": "http://localhost:8000",
  "capabilities": {
    "streaming": true,
    "pushNotifications": false
  },
  "defaultInputModes": ["text"],
  "defaultOutputModes": ["text"],
  "skills": [
    {
      "id": "browser",
      "name": "browser automation",
      "description": "Performs web searches to retrieve information."
    }
  ]
}
```
The card holds the following information:
- Name/Description → The agent name and the `instruction` prompt.
- URL → Placeholder `localhost:8000`; adjust based on your hosting setup.
- Provider/Version → Project/org details.
- Capabilities → SSE-based streaming with no push notifications.
- Input/Output Modes → Text-based.
- Skills → One main skill: navigating websites and extracting information via Puppeteer tools.
This agent card is then fetched by the `Runner` to identify the agent to delegate the task to, and to wait for the response.
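Since every card lives at a well-known path, you can inspect one yourself with a few lines of Python (the URL below assumes an agent serving on localhost:8000, as in the sample card):

```python
import requests

# Agent Cards live at a well-known path on the agent's base URL
card = requests.get("http://localhost:8000/.well-known/agent.json").json()

print(card["name"], "-", card["description"])
print("Skills:", [skill["name"] for skill in card["skills"]])
```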
Internally:

The diagram represents the following flow: (WA – web automation agent, MA – master / main agent)
- The user sends a query to the Main Agent (MA).
- The MA fetches the Agent Card of the Web Automation Agent (WA) to understand its capabilities.
- The Agent Card provides metadata about the WA.
- The MA delegates the task to the WA.
- The WA utilises tools defined in the Agent Card, such as the Puppeteer MCP Server.
- The Puppeteer MCP Server performs web automation tasks.
- The results are returned to the WA.
- The WA sends the results back to the MA.
I hope the internal details are clear; if not, drop a comment and I'll be happy to explain.
Now it's time to test the agent.
Results Time
- Run the MCP server using: `CTRL+SHIFT+P` → List MCP Servers → Puppeteer → Start Server.
- Run the script using: `python automation_agent.py`
The Puppeteer server opens a browser, heads to the Google News page, fetches the output, and displays it in the terminal as artefacts, using A2A and MCP under the hood.
Here is a quick demo of me using it:
I left out the results-saving part for demonstration purposes; you are free to implement it if you like.
Congrats on building your first A2A + MCP agent. I hope you had fun. I'm really excited to see what you all come up with.
Here are some final notes from building this project before I end this article.
Conclusion
The line between MCP and A2A often gets blurry, and the two are sometimes used interchangeably, but they are different and complement each other. Always keep in mind:
- MCP helps agents talk to tools that connect with outside apps.
- Agent2Agent helps agents collaborate.
Both are steps toward making agent development more standardised, easier, and more automated.
It’ll be interesting to see how they shape agents’ future together!