How to integrate Browser tool MCP with Hermes


Introduction

Hermes is a 24/7 autonomous agent that lives on your computer or server — it remembers what it learns and evolves as your usage grows.

This guide covers two ways to connect your Browser tool account to Hermes: the Composio Connect CLI and Composio Connect MCP. For personal use we recommend the CLI, but MCP works well too.

What is Composio Connect?

Composio Connect is a consumer offering that lets anyone plug 1,000+ applications directly into their agent harness — including Hermes. It can:

  • Search and load tools from relevant toolkits on-demand, reducing context usage.
  • Chain multiple tools to accomplish complex workflows via a remote workbench, without excessive back-and-forth with the LLM.
  • Manage app authentication end-to-end with zero manual overhead.
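
To make the first point concrete, here is a toy sketch of on-demand tool search: only the best-matching tools for a task are returned, so the rest of the catalog never enters the model's context. This is an illustration under stated assumptions, not Composio's implementation. The catalog entries, the `search_tools` name, and the word-overlap scoring are all hypothetical.

```python
from typing import Dict, List

# Hypothetical catalog: names and descriptions are illustrative only.
CATALOG: Dict[str, str] = {
    "navigate_to_url": "create a browser session and navigate to a url",
    "fetch_webpage_content": "get page content as html or text",
    "mouse_click": "click at screen coordinates",
    "take_screenshot": "capture a screenshot of the current viewport",
    "type_text": "type text with human-like keystrokes",
}


def search_tools(task: str, catalog: Dict[str, str], limit: int = 2) -> List[str]:
    """Return up to `limit` tool names whose descriptions share words with the task."""
    words = set(task.lower().split())
    scored = [
        (len(words & set(description.split())), name)
        for name, description in catalog.items()
    ]
    # Drop tools with no overlap, then sort by score (high first) and name.
    matches = sorted(
        ((score, name) for score, name in scored if score > 0),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [name for _, name in matches[:limit]]


print(search_tools("capture a screenshot of the page", CATALOG, limit=1))
```

A real router would likely use richer matching than word overlap. The point here is only that the agent sees a short, task-specific list instead of every tool.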

Integrating Browser tool with Hermes

Using Composio Connect CLI

1. Install the Composio CLI

Run the install script directly, or paste https://composio.dev/hermes into your Hermes chat box to have it installed for you.

bash
curl -fsSL https://composio.dev/install | bash

2. Authenticate

Once the CLI is installed, ask Hermes to authenticate with Composio.

3. Connect to Browser tool

Ask your agent to connect to Browser tool, or simply request any Browser tool-related task. Hermes will prompt you to authenticate and authorize access.

4. Done. You're all set with a new Browser tool connection.


Using Composio Connect MCP

1. Get your MCP URL and API Key

Go to dashboard.composio.dev and copy your Connect MCP URL and API key.
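
As a rough illustration of what an MCP client does with these two values, the sketch below builds (but does not send) a JSON-RPC 2.0 `tools/list` request. `tools/list` is a standard MCP method. The URL, the key, and the `x-api-key` header name are placeholders and assumptions, so check the dashboard for the actual values and authentication header. A real client would also complete the MCP `initialize` handshake before listing tools.

```python
import json
import urllib.request

# Placeholders: substitute the values copied from the dashboard.
MCP_URL = "https://example.invalid/mcp"  # hypothetical URL
API_KEY = "YOUR_API_KEY"                 # hypothetical key


def build_tools_list_request(
    url: str, api_key: str, request_id: int = 1
) -> urllib.request.Request:
    """Build, without sending, an MCP `tools/list` request over HTTP."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "x-api-key": api_key,  # assumed header name, not confirmed by this guide
        },
        method="POST",
    )


req = build_tools_list_request(MCP_URL, API_KEY)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

In practice Hermes handles this exchange for you. The sketch only shows where the URL and key end up.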

What is the Browser tool MCP server, and what's possible with it?

The Browser tool MCP server is an implementation of the Model Context Protocol that connects AI agents and assistants such as Claude and Cursor directly to browser automation tools. It provides structured, secure access to browser actions, so your agent can fetch web content, click, send keyboard shortcuts, move the mouse, and interact with on-page elements much as a human user would.

  • Fetch and analyze webpage content: Let your agent retrieve the full HTML or clean text of any web page for data extraction, analysis, or decision-making.
  • Automated mouse and keyboard interactions: Instruct your agent to perform precise clicks, double clicks, drags, and keyboard shortcuts to navigate, select, or manipulate content on the page.
  • Clipboard and text extraction: Have the agent copy highlighted text, read clipboard contents, or transfer data between the browser and other tools for seamless workflows.
  • Drag-and-drop automation: Enable your agent to handle complex drag-and-drop actions, such as moving files or rearranging lists, to mimic advanced user interactions.
  • Fine-grained UI element control: Direct your agent to move the mouse, press and hold, or release buttons at exact coordinates to interact with dynamic or custom web interfaces.
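
The last two capabilities combine naturally: a custom drag is a press at the start point, a few intermediate moves, and a release at the end point. Here is a minimal sketch of that decomposition. The action labels (`mouse_down`, `mouse_move`, `mouse_up`) and the `drag_steps` helper are hypothetical names for illustration, not the server's actual tool identifiers.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def drag_steps(start: Point, end: Point, moves: int = 4) -> List[Dict[str, object]]:
    """Return press -> intermediate moves -> release actions for one drag."""
    if moves < 1:
        raise ValueError("moves must be at least 1")
    (x0, y0), (x1, y1) = start, end
    steps: List[Dict[str, object]] = [{"action": "mouse_down", "x": x0, "y": y0}]
    for i in range(1, moves + 1):
        t = i / moves  # fraction of the path covered after this move
        steps.append({
            "action": "mouse_move",
            "x": round(x0 + (x1 - x0) * t),
            "y": round(y0 + (y1 - y0) * t),
        })
    steps.append({"action": "mouse_up", "x": x1, "y": y1})
    return steps


for step in drag_steps((0, 0), (100, 40)):
    print(step)
```

Intermediate moves matter because some interfaces only register a drag when they see pointer movement between the press and the release.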

Supported Tools & Triggers

Tools
  • Copy Selected Text: Copy currently selected text on the page to clipboard - ideal for extracting highlighted content, copying form data, or harvesting visible text selections.
  • Drag and Drop: Execute precise drag and drop operations - essential for file uploads, list reordering, element moving, and complex UI interactions that require drag-based manipulation.
  • Fetch Webpage Content: Your eyes: get page content for decision-making.
  • Get Clipboard Content: Read current content from the system clipboard - essential for data transfer workflows, extracting copied text, and reading user-copied data for processing.
  • Keyboard Shortcut: Execute keyboard shortcuts and key combinations - essential for copy/paste, navigation, and application commands that agents need for efficient browser automation.
  • Mouse Click: Precision clicker: manual clicking with coordinates.
  • Mouse Double Click: Execute a precise double click at specified screen coordinates - ideal for opening files, selecting text, or activating UI elements that require double click gestures.
  • Mouse Down (Press and Hold): Press and hold mouse button at coordinates - use for starting custom drag operations, text selections, or long-press interactions.
  • Mouse Move: Move mouse cursor to precise coordinates without clicking - perfect for triggering hover effects, revealing tooltips, and positioning for subsequent interactions.
  • Mouse Up (Release Button): Release mouse button at coordinates - completes drag operations, text selections, and long-press interactions.
  • Navigate to URL: Always start here: creates browser session and navigates to URL.
  • Paste Text: Paste text content at the current cursor position - perfect for filling forms, inserting data into text fields, or quick content insertion at focused elements.
  • AI Perform Web Task: AI automation: complex workflows only.
  • Screenshot Webpage: Capture high-quality screenshot of any webpage with extensive customization options - perfect for archiving, visual documentation, full-page captures, and cross-device viewport testing.
  • Scroll Page: Page navigation: smooth scrolling.
  • Set Clipboard Content: Store text content in the system clipboard for later paste operations - perfect for preparing data transfers, staging content for forms, or cross-application data sharing.
  • Take Screenshot: Visual verification: capture screenshot of current browser viewport.
  • Type Text: Controlled input: human-like typing.
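
Because a session starts with Navigate to URL, a typical sequence is navigate, then fetch content, then take a screenshot to verify. The sketch below expresses that order as MCP `tools/call` payloads. The `tools/call` method and its `name`/`arguments` shape come from the MCP specification. The tool names (`BROWSER_NAVIGATE` and so on) and argument keys are hypothetical stand-ins, so list the server's tools to get the real identifiers and schemas.

```python
import json
from typing import Any, Dict, List, Tuple


def build_plan(url: str) -> List[Dict[str, Any]]:
    """Build ordered JSON-RPC `tools/call` payloads: navigate, fetch, screenshot."""
    # Hypothetical tool names and argument keys, used for illustration only.
    calls: List[Tuple[str, Dict[str, Any]]] = [
        ("BROWSER_NAVIGATE", {"url": url}),   # creates the browser session
        ("BROWSER_FETCH_CONTENT", {}),        # read the page before acting
        ("BROWSER_TAKE_SCREENSHOT", {}),      # visual verification
    ]
    return [
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
        for request_id, (name, arguments) in enumerate(calls, start=1)
    ]


for payload in build_plan("https://example.com"):
    print(json.dumps(payload))
```

Hermes normally chooses and orders these calls itself. Seeing the payloads can help when debugging why a step ran before a session existed.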

Way Forward

With Browser tool connected, Hermes can now act on your behalf whenever it detects a relevant task or you ask it to.

From here, you can extend Hermes further:

  • Connect more apps: Calendar, Slack, Notion, Linear, and hundreds of others are available through the same Composio Connect setup. Each new integration compounds what Hermes can do for you.
  • Build workflows across tools: Once multiple apps are connected, Hermes can chain actions together — turn an email into a calendar invite, a Slack message into a Linear ticket, or a meeting note into a follow-up draft.
  • Let it learn your patterns: The more you use Hermes, the better it gets at anticipating how you'd handle recurring tasks. Give it feedback on drafts and decisions, and it will adapt.

If you run into trouble or want to share what you've built, join the community or check out the Docs for deeper configuration options.

FAQ

What is the difference between Tool Router MCP and Browser tool MCP?

With a standalone Browser tool MCP server, agents and LLMs can access only the fixed set of Browser tool tools tied to that server. With the Composio Tool Router, agents can instead load tools dynamically from Browser tool and many other apps based on the task at hand, all through a single MCP endpoint.

Can I use Tool Router MCP with Hermes?

Yes, you can. Hermes fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Browser tool tools.

Can I manage the permissions and scopes for Browser tool while using Tool Router?

Yes, absolutely. You can configure which Browser tool scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

How safe is my data with Composio Tool Router?

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Browser tool data and credentials are handled as safely as possible.

Used by agents from

Context
Letta
glean
HubSpot
Agent.ai
Altera
DataStax
Entelligence
Rolai

Never worry about agent reliability

We handle tool reliability, observability, and security so you never have to second-guess an agent action.