Agent2Agent: A practical guide to build agents

Google agent2agent

Google introduced the Agent2Agent (A2A) Protocol, which is already making waves in the AI space. Real-world use cases, from travel planning to enterprise automation, are popping up fast.

Over 50 tech partners, including MongoDB, Atlassian, SAP, PayPal, and Cohere, have already adopted it, highlighting just how impactful A2A is becoming.

As an AI enthusiast, I spent the weekend exploring A2A, and in this article, I’ll break down what I’ve learned.

To keep things simple, I’ve split it into two parts:

  • Theory – What A2A is, how it works, and why it matters.
  • Practical – How to build with A2A and integrate MCP to seamlessly connect tools and data sources.

So, let’s begin.

What is A2A

A2A Image
Source: Github

A2A stands for Agent-to-Agent protocol. It is designed to enable different, specialised AI agents to communicate directly with each other, delegate tasks, and work together as a team.

For example, it allows a primary agent (like a personal assistant) to act as a project manager, coordinating a team of specialist agents.

This solves the problem of current AI agents working in isolation and opens up new ways and possibilities to build complex multi-agent systems.

It is built on 5 key principles (from the docs itself):

  1. Simple: Reuses existing standards (HTTP, JSON-RPC, SSE, push notifications).
  2. Enterprise Ready: Built-in auth, security, privacy, tracing, and monitoring support.
  3. Async First: Can handle (very) long-running tasks while providing meaningful updates.
  4. Modality Agnostic: Supports a wide range of modalities: text, audio/video, forms, iframes, etc.
  5. Opaque Execution: Agents do not have to share thoughts, plans, or tools.

Think of it as creating a standard way for AI agents to introduce themselves, share what they can do, and work together on tasks.

Now let’s look at the components that make up A2A

Components of A2A Protocol

Before understanding how A2A works, it’s important to know the core components that power it.

Components Of A2A- Harsh Mishra

These are the core components of A2A:

  • Client-Server Model: A2A works on a client-server architecture, where the client (agent) asks for a task to be done and the server (a specialised agent/tool) does the task. However, roles can keep changing during the task flow.
  • Agent Cards: Agent Cards are JSON files that act as an agent’s profile. A card lists the agent’s ID, name, job type, security details, MCP support, and much more. It enables agent discovery by the client.
  • Task: A Task is the main unit of work and moves through clear stages —submitted, working, input-required, completed, failed, or cancelled. This helps to manage progress and workflow.
  • Message Structure: Inside the task, agents talk using messages. A message contains parts that consist of the actual content (multimodal).
  • Artefacts: The output of the task is delivered through artefacts. Artefacts are structured results to ensure the final output is consistent and easy to use.
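To make these components concrete, here is a minimal Python sketch of how a task, its messages, and its artefacts fit together. Note that every class and field name below is an illustrative assumption for explanation, not the actual A2A SDK types.

```python
from dataclasses import dataclass, field
from enum import Enum

# Illustrative sketch only: these names are assumptions, not real A2A SDK types.
class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"

@dataclass
class Part:
    """One piece of (possibly multimodal) content inside a message."""
    type: str      # e.g. "text", "file", "data"
    content: str

@dataclass
class Message:
    role: str                       # "user" or "agent"
    parts: list = field(default_factory=list)

@dataclass
class Task:
    task_id: str
    state: TaskState = TaskState.SUBMITTED
    messages: list = field(default_factory=list)
    artifacts: list = field(default_factory=list)  # structured outputs

# A task moves through its lifecycle as agents exchange messages:
task = Task(task_id="trip-001")
task.messages.append(Message(role="user", parts=[Part("text", "Plan a 3-day trip")]))
task.state = TaskState.WORKING
task.artifacts.append({"name": "itinerary", "text": "Day 1: ..."})
task.state = TaskState.COMPLETED
```

The point to notice is the clean separation: messages carry the conversation, the state field tracks progress, and artefacts hold the structured final output.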

💡Note: For simplicity, I have only covered the essentials. A detailed dive can be found [here](https://composio.dev/blog/mcp-vs-a2a-everything-you-need-to-know/)

With the fundamentals of the core components under our belt, let’s look at how it all ties together.

How A2A Protocol Works

A2A Workflow - Mermaid Diagram

When a task is assigned to an agent, it goes through the following stages:

Agent Discovery

  • Each specialist agent publishes an Agent Card (like a resume).
  • The card includes capabilities (e.g., “travel_planning”, “budget_analysis”).
  • A requesting agent uses these cards to discover the right specialists.

Task Delegation

  • The requesting agent assigns tasks to the selected specialists.
  • Tasks are expressed in natural language, allowing flexibility.
  • Example: “Find affordable flights and hotel options.”
  • Specialist agents interpret and act on these high-level requests using their own intelligence.

Task Processing (Multi-Turn)

  • Tasks go through a lifecycle: pending → running → intermediate updates → completed/failed.
  • The requesting agent can receive acknowledgements, track progress, get partial results, and continuously monitor task updates.

Completion & Delivery

  • Once all tasks are done, the requesting agent collects and merges results (artifacts).
  • The final output is a coherent, combined solution (e.g., full vacation itinerary).
  • The agent may also refine or post-process the gathered data for presentation or further use.

With seamless collaboration between multiple agents, complex workflows are possible, but often these multi-agent systems get stuck with tool mismatch, context loss, and misaligned goals.

To tackle these problems, MCP comes to the rescue.

A2A + MCP – A Great Match

MCP (Model Context Protocol) is a new standard protocol that allows agents to connect with custom tools, APIs, and data sources.

By adding an MCP layer, you can keep everything in sync and ensure smooth collaboration between agents by handling:

  • Context Loss: MCP shares key background info so all agents know what’s happening.
  • Tool Mismatch: MCP tells agents what tools are available and how to use them.
  • Misaligned Goals: MCP helps agents understand the task’s purpose so they work toward the same goal.
  • Data Confusion: MCP standardises how agents share and read data to avoid miscommunication.
  • Handoffs: The MCP ensures that the next agent picks up where the last one left off with all the right information.

Here is an example of how the trip-planning task flow looks with MCP.

Example

A2A & MCP Based Trip Planner Agent Workflow

The user assigns a trip-planning task, and the master agent looks up agent cards for specialised agents, say Travel, Budget & Local Guide.

Each agent can then be given a set of tools, managed by MCP, to complete its task.

For example (with no intermediaries, as in the chart):

  • A Travel Agent can use MCP to search flights and hotels using flight and hotel search tools.
  • A Budget Agent might use an MCP-linked calculator tool.
  • A Local Guide can access an activities database (data sources).

Finally, once all the agents have returned their results, the master agent combines them and generates the final response in a structured format.
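As a rough illustration of that last step, the master agent’s merge can be as simple as concatenating each specialist’s artefact under its own heading. The function and the sample data below are hypothetical, just to show the shape of the operation:

```python
# Hypothetical sketch: how a master agent might merge the artefacts returned
# by the Travel, Budget, and Local Guide agents into one structured response.
def merge_artifacts(results):
    """Concatenate each agent's artefact under its own section heading."""
    sections = [f"## {agent}\n{artifact}" for agent, artifact in results.items()]
    return "\n\n".join(sections)

itinerary = merge_artifacts({
    "Travel": "Flight AA12 outbound, Hotel Central (3 nights)",
    "Budget": "Estimated total: $1,450",
    "Local Guide": "Day 1: museum quarter; Day 2: food market tour",
})
print(itinerary)
```

In practice the master agent would usually hand the collected artefacts back to an LLM for synthesis rather than concatenating them, but the data flow is the same.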

In this setting, A2A coordinates who does what, while MCP ensures each agent can actually do it using the right tools and stays on track—a key difference between A2A and MCP.

I hope you understood the context. Now, let’s move on to the second part of the blog, building with the A2A protocol.

Building with A2A Protocol using Composio, Gemini & Anthropic

In my experience, building tools and MCP servers is time-consuming and often error-prone (especially the setup part). If you followed my earlier blog on Building MCP Server From Scratch, you might know it too.

So, to reduce the time, I will use Composio and Anthropic, which host a directory of 200+ predefined MCP servers with tool integrations to speed up the process.

Now let’s get building

Problem Statement

For this demo, let’s build a browser automation agent that leverages the Puppeteer MCP server to browse the web and perform actions autonomously.

Workflow

Sample Project Workflow

Browser Automation Using A2A & Puppeteer MCP Workflow

If this feels complicated, things will become clearer as we move along.

Let’s set up the workspace.

Workspace Setup

Open a terminal and run the following commands one by one:

> python -m venv a2a
> a2a\Scripts\activate
> pip install google-adk
> code .

This will create the environment a2a, activate it, install Google’s Agent Development Kit (ADK), and open the workspace in VS Code (optional).

The Google ADK package will be used to connect A2A agents to MCP servers.

Inside the workspace, create a new file .env and populate it with the following values:

GOOGLE_API_KEY=your-api-key
GEMINI_API_KEY=your-api-key

You can get your API key from Google AI Studio, though you might need to log in first!

Time to set up MCP Servers.

Set Up MCP Server (stdio/ sse)

  1. Head to the MCP Server Repository and open the Puppeteer MCP Server page. From here, we will use the installer shortcut. I will use npx; feel free to use docker if you like.
  2. In the page readme, scroll down to the section titled “Usage with VS Code”. Click on one-click install and grant the required permissions. This installs the Puppeteer MCP server. You can check settings.json for verification.
  3. For more granular control over server installation, refer to the readme.
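For reference, the one-click install adds an entry to settings.json roughly like the following (the exact keys may differ between VS Code versions, so treat this as a sketch rather than the definitive format):

```json
{
  "mcp": {
    "servers": {
      "puppeteer": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
      }
    }
  }
}
```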

With this, the MCP setup is done. Time to write the actual code.

App Code

Create a file called automation_agent.py in the root.

Import necessary modules and load the environment variables defined in .env file

# ./automation_agent.py
import asyncio
from dotenv import load_dotenv
from google.genai import types
from google.adk.agents.llm_agent import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.artifacts.in_memory_artifact_service import InMemoryArtifactService # Optional
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseServerParams, StdioServerParameters

# Load environment variables from the .env file
load_dotenv('.env')

Configure MCP Server (stdio based) & Fetch tools

# Import Tools from MCP Server - Puppeteer
async def get_tools_async():
  """Gets tools from the Puppeteer MCP Server."""
  print("Attempting to connect to Puppeteer MCP server...")
  tools, exit_stack = await MCPToolset.from_server(
      connection_params=StdioServerParameters(
          command='npx',
          args=["-y",    
                "@modelcontextprotocol/server-puppeteer"],
      )
  )
  print("MCP Toolset created successfully.")
  return tools, exit_stack

Focus on the connection_params: it’s the same configuration as in settings.json. This is crucial for fetching tools.

For an SSE-based server, use this code 👇

async def get_tools_async():
  """Gets tools from the MCP Server."""
  print("Attempting to connect to MCP server...")
  tools, exit_stack = await MCPToolset.from_server(
      connection_params=SseServerParams(url="mcp-sse-server-url")
  )
  print("MCP Toolset created successfully.")

  return tools, exit_stack

Make sure to replace mcp-sse-server-url with an actual SSE URL, e.g. the Composio Gmail MCP server (SSE): https://mcp.composio.dev/gmail/tinkling-faint-car-f6g1zk

Next, create an agent

async def get_agent_async():
    """Creates an ADK Agent equipped with tools from the MCP Server."""
    tools, exit_stack = await get_tools_async()
    print(f"Fetched {len(tools)} tools from MCP server.")
    root_agent = LlmAgent(
        model='gemini-2.0-flash',  # Adjust if needed
        name='web_search_assistant',
        instruction=(
            "You are a web automation assistant. Use browser tools to navigate pages, click, and extract data. "
            "Always try to solve user queries by using the tools instead of answering directly. "
            "If the user asks for news or information from a website, open that site and extract the relevant parts."),
        tools=tools
    )
    # uncomment to see list of tools - for verifiability
    # for tool in tools:
       # print(f"Tool: {tool.name} - {tool.description}")
    return root_agent, exit_stack

Ensure the instruction prompt is detailed and highlights the use case. The better the input, the better the tool navigation support.

Finally, add the main logic:

# Define the main function to run the agent
async def async_main():
    session_service = InMemorySessionService()
    artifacts_service = InMemoryArtifactService()  # Optional

    session = session_service.create_session(
        state={}, app_name='mcp_web_search_app', user_id='web_search'
    )

    # Define reliable query for best results
    query = (
    "Go to https://news.google.com and extract the top 5 headlines from the homepage. "
    "Use your browser automation tools to navigate and extract the text."
)
    print(f"User Query: '{query}'")
    content = types.Content(role='user', parts=[types.Part(text=query)])
    root_agent, exit_stack = await get_agent_async()

    runner = Runner(
        app_name='mcp_web_search_app',
        agent=root_agent,
        artifact_service=artifacts_service,  # Optional
        session_service=session_service,
    )

    print("Running agent...")
    events_async = runner.run_async(
        session_id=session.id, user_id=session.user_id, new_message=content
    )

    async for event in events_async:
        print(f"Event received: {event}")

    print("Closing MCP server connection...")
    await exit_stack.aclose()
    print("Cleanup complete.")


if __name__ == '__main__':
    try:
        asyncio.run(async_main())
    except Exception as e:
        print(f"An error occurred: {e}")

The above script runs an AI agent that takes a query to “extract the top 5 Google News headlines”, processes it asynchronously, prints the agent’s responses, and then cleans up the session. The entry point wraps the run in basic error handling.

Make sure to spell out the exact requirements in the query parameter (or keep it as user input) for better results.

But how does this all get executed? Agent Cards!

In the Google A2A ecosystem, an Agent Card is like a profile or manifest for the agent. It tells the ADK runner info about agent capabilities like skills, endpoint URL, and other relevant information.

All agent cards are accessible at url/.well-known/agent.json, so any developer needs to follow the Agent Card format to expose their agent’s functionality.

Lucky for us, Google ADK does this in the background through LlmAgent. So, in this case, the agent card might look like:

{
  "name": "WebSearchAgent",
  "description": "An agent that performs web searches and extracts information.",
  "url": "http://localhost:8000",
  "capabilities": {
    "streaming": true,
    "pushNotifications": false
  },
  "defaultInputModes": ["text"],
  "defaultOutputModes": ["text"],
  "skills": [
    {
      "id": "browser",
      "name": "browser automation",
      "description": "Performs web searches to retrieve information."
    }
  ]
}

The card holds the following information:

  • Name/Description → The agent name and the instruction prompt.
  • URL → Placeholder localhost:8000; adjust based on where the agent is hosted.
  • Provider/Version → Project/org details.
  • Capabilities → SSE-based streaming with no push notifications.
  • Input/Output Modes → Text-based.
  • Skills → One main skill: navigating websites and extracting information via Puppeteer tools.

This agent card is then fetched by the Runner to identify the agent, delegate the task, and wait for the response.
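To make discovery concrete, here is a small sketch of how a client could fetch and inspect an agent card. The fetch_agent_card and find_skill helpers are illustrative assumptions, not ADK or A2A SDK APIs; only the /.well-known/agent.json path comes from the protocol:

```python
import json
from typing import Optional
from urllib.request import urlopen

AGENT_CARD_PATH = "/.well-known/agent.json"  # standard A2A discovery path

def fetch_agent_card(base_url: str) -> dict:
    """Fetch a remote agent's card from its well-known discovery endpoint."""
    with urlopen(base_url.rstrip("/") + AGENT_CARD_PATH) as resp:
        return json.load(resp)

def find_skill(card: dict, skill_id: str) -> Optional[dict]:
    """Return the matching skill entry from an agent card, if any."""
    for skill in card.get("skills", []):
        if skill.get("id") == skill_id:
            return skill
    return None

# Offline check against a card shaped like the sample above:
sample_card = {
    "name": "WebSearchAgent",
    "capabilities": {"streaming": True, "pushNotifications": False},
    "skills": [{"id": "browser", "name": "browser automation",
                "description": "Performs web searches to retrieve information."}],
}
skill = find_skill(sample_card, "browser")
```

A requesting agent would run something like this at discovery time to decide whether a remote agent advertises the skill it needs before delegating a task.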

Internally:

Agent to Agent Flow

The diagram represents the following flow: (WA – web automation agent, MA – master / main agent)

  • The user sends a query to the Main Agent (MA).
  • The MA fetches the Agent Card of the Web Automation Agent (WA) to understand its capabilities.
  • The Agent Card provides metadata about the WA.
  • The MA delegates the task to the WA.
  • The WA utilizes tools defined in the Agent Card, such as the Puppeteer MCP Server.
  • The Puppeteer MCP Server performs web automation tasks.
  • The results are returned to the WA.
  • The WA sends the results back to the MA.

I hope you understood the internal details. If not, drop a comment and I will be happy to explain.

Now it’s time to test the agent.

Results Time

  • Run the MCP Server using: CTRL + SHIFT + P → List MCP Servers → Puppeteer → Start Server.
  • Run the script using: python automation_agent.py

The Puppeteer server opens a browser, heads to the Google News page, fetches the output, and displays it in the terminal as artefacts, with A2A and MCP working under the hood.

Here is a quick demo of me using it:

I left out the results-saving part to keep the demo simple. You are free to implement it if you like.

Congrats on building your first A2A + MCP agent. I hope you had fun. I’m really excited to see what you all come up with.

Here are some final notes from building this project before I end this article.

Conclusion

The line between MCP and A2A often gets blurry, and the two terms are sometimes used interchangeably, but they are different and complement each other. Always keep in mind:

  • MCP helps agents talk to tools that connect with outside apps,
  • Agent2Agent helps agents collaborate.

Both are steps toward making agent development more standard, easier and automated.

It’ll be interesting to see how they shape agents’ future together!
