# How to integrate Firecrawl MCP with LlamaIndex

```json
{
  "title": "How to integrate Firecrawl MCP with LlamaIndex",
  "toolkit": "Firecrawl",
  "toolkit_slug": "firecrawl",
  "framework": "LlamaIndex",
  "framework_slug": "llama-index",
  "url": "https://composio.dev/toolkits/firecrawl/framework/llama-index",
  "markdown_url": "https://composio.dev/toolkits/firecrawl/framework/llama-index.md",
  "updated_at": "2026-05-12T10:11:40.863Z"
}
```

## Introduction

This guide walks you through connecting Firecrawl to LlamaIndex using the Composio tool router. By the end, you'll have a working Firecrawl agent that can extract all product prices from this e-commerce site, crawl competitor blogs for latest article summaries, map all subpages linked from homepage url through natural language commands.
This guide will help you understand how to give your LlamaIndex agent real control over a Firecrawl account through Composio's Firecrawl MCP server.
Before we dive in, let's take a quick look at the key ideas and tools involved.

## Also integrate Firecrawl with

- [ChatGPT](https://composio.dev/toolkits/firecrawl/framework/chatgpt)
- [OpenAI Agents SDK](https://composio.dev/toolkits/firecrawl/framework/open-ai-agents-sdk)
- [Claude Agent SDK](https://composio.dev/toolkits/firecrawl/framework/claude-agents-sdk)
- [Claude Code](https://composio.dev/toolkits/firecrawl/framework/claude-code)
- [Claude Cowork](https://composio.dev/toolkits/firecrawl/framework/claude-cowork)
- [Codex](https://composio.dev/toolkits/firecrawl/framework/codex)
- [Cursor](https://composio.dev/toolkits/firecrawl/framework/cursor)
- [VS Code](https://composio.dev/toolkits/firecrawl/framework/vscode)
- [OpenCode](https://composio.dev/toolkits/firecrawl/framework/opencode)
- [OpenClaw](https://composio.dev/toolkits/firecrawl/framework/openclaw)
- [Hermes](https://composio.dev/toolkits/firecrawl/framework/hermes-agent)
- [CLI](https://composio.dev/toolkits/firecrawl/framework/cli)
- [Google ADK](https://composio.dev/toolkits/firecrawl/framework/google-adk)
- [LangChain](https://composio.dev/toolkits/firecrawl/framework/langchain)
- [Vercel AI SDK](https://composio.dev/toolkits/firecrawl/framework/ai-sdk)
- [Mastra AI](https://composio.dev/toolkits/firecrawl/framework/mastra-ai)
- [CrewAI](https://composio.dev/toolkits/firecrawl/framework/crew-ai)

## TL;DR

Here's what you'll learn:
- Set your OpenAI and Composio API keys
- Install LlamaIndex and Composio packages
- Create a Composio Tool Router session for Firecrawl
- Connect LlamaIndex to the Firecrawl MCP server
- Build a Firecrawl-powered agent using LlamaIndex
- Interact with Firecrawl through natural language

## What is LlamaIndex?

LlamaIndex is a data framework for building LLM applications. It provides tools for connecting LLMs to external data sources and services through agents and tools.
Key features include:
- ReAct Agent: Reasoning and acting pattern for tool-using agents
- MCP Tools: Native support for Model Context Protocol
- Context Management: Maintain conversation context across interactions
- Async Support: Built for async/await patterns

## What is the Firecrawl MCP server, and what's possible with it?

The Firecrawl MCP server is an implementation of the Model Context Protocol that connects your AI agent and assistants like Claude, Cursor, etc directly to your Firecrawl account. It provides structured and secure access to automated web crawling, scraping, and data extraction, so your agent can perform actions like indexing sites, extracting structured content, mapping URLs, and searching the web on your behalf.
- Automated web crawling and indexing: Let your agent launch and manage web crawl jobs to gather content or index entire websites efficiently.
- Structured data extraction: Instruct your agent to extract targeted data from web pages using custom prompts or schemas, turning unstructured sites into actionable information.
- URL mapping and discovery: Have the agent explore and map all URLs within a website, including options for subdomain inclusion, sitemap processing, or search-based discovery.
- On-demand scraping and content retrieval: Enable your agent to scrape specific URLs, retrieve page content, and even extract structured JSON using LLM-powered methods.
- Integrated web search and data collection: Task your agent with running web searches, scraping top result pages, and returning relevant details—all in one workflow.

## Supported Tools

| Tool slug | Name | Description |
|---|---|---|
| `FIRECRAWL_AGENT_CANCEL` | Cancel an agent job | Tool to cancel an in-progress agent job by its ID. Use when you need to terminate an active agent operation. The API returns a success boolean upon cancellation. |
| `FIRECRAWL_BATCH_SCRAPE` | Batch scrape multiple URLs | Tool to scrape multiple URLs in batch with concurrent processing. Use when you need to scrape multiple web pages efficiently with customizable formats and content filtering. |
| `FIRECRAWL_BATCH_SCRAPE_CANCEL` | Cancel a batch scrape job | Tool to cancel a running batch scrape job using its unique identifier. Use when you need to terminate an in-progress batch scrape operation. |
| `FIRECRAWL_BATCH_SCRAPE_GET` | Get batch scrape status | Retrieves the current status and results of a batch scrape job using the job ID. Use this to check batch scrape progress and retrieve scraped data. |
| `FIRECRAWL_BATCH_SCRAPE_GET_ERRORS` | Get errors from batch scrape job | Tool to retrieve error details from a batch scrape job, including failed URLs and URLs blocked by robots.txt. Use when you need to debug or understand why certain pages failed to scrape in a batch operation. |
| `FIRECRAWL_CRAWL` | Start a web crawl | Initiates a Firecrawl web crawl from a given URL, applying various filtering and content extraction rules, and polls until the job is complete; ensure the URL is accessible and any regex patterns for paths are valid. |
| `FIRECRAWL_CANCEL_A_CRAWL_JOB` | Cancel a crawl job | Cancels an active or queued web crawl job using its ID; attempting to cancel completed, failed, or previously canceled jobs will not change their state. |
| `FIRECRAWL_CANCEL_A_CRAWL_JOB` | Cancel a crawl job | Tool to cancel a running crawl job by its ID. Use when you need to stop an active crawl operation. The API returns a status of 'cancelled' upon successful cancellation. |
| `FIRECRAWL_CRAWL_GET` | Get crawl job status | Tool to retrieve the status and results of a Firecrawl crawl job. Use when you need to check the progress or get data from an ongoing or completed crawl operation. Returns crawl status, progress metrics, credits used, and the crawled page data. |
| `FIRECRAWL_CRAWL_GET_ERRORS` | Get errors from a crawl job | Tool to retrieve errors from a Firecrawl crawl job. Use when you need to understand why certain pages failed to scrape or which URLs were blocked by robots.txt during a crawl operation. |
| `FIRECRAWL_CRAWL_LIST_ACTIVE` | Get all active crawl jobs | Tool to retrieve all active crawl jobs for the authenticated team. Use when you need to see which crawl operations are currently running. |
| `FIRECRAWL_CRAWL_PARAMS_PREVIEW` | Preview crawl parameters | Preview crawl parameters before starting a crawl by generating optimal configuration from natural language instructions. Use this tool to understand what crawl settings will be applied based on your requirements before executing a full crawl operation. The endpoint intelligently interprets natural language prompts to configure crawl parameters like include/exclude paths, depth limits, and domain scope. |
| `FIRECRAWL_CRAWL_V2` | Start a web crawl (v2) [NEW] | [NEW v2 API] Initiates a Firecrawl v2 web crawl with enhanced features over v1: natural language prompts for automatic crawler configuration, crawlEntireDomain for sibling/parent page discovery, better depth control with maxDiscoveryDepth, subdomain support, and full webhook configuration. Polls until crawl is complete. |
| `FIRECRAWL_CREDIT_USAGE_GET` | Get team credit usage | Tool to get current team credit usage information. Use when you need to check remaining credits or billing period details. |
| `FIRECRAWL_CREDIT_USAGE_GET_HISTORICAL` | Get historical team credit usage | Tool to retrieve historical team credit usage on a monthly basis. Use when you need to analyze credit consumption patterns over time, optionally segmented by API key. |
| `FIRECRAWL_EXTRACT` | Extract structured data | Extracts structured data from web pages by initiating an extraction job and polling for completion; requires a natural language `prompt` or a JSON `schema` (one must be provided). |
| `FIRECRAWL_EXTRACT_GET` | Get extract job status | Tool to retrieve the status and results of a previously submitted extract job. Use when you need to check the progress or get the final results of an extraction operation. |
| `FIRECRAWL_GET_AGENT_STATUS` | Get agent job status | Tool to get the status and results of an agent job. Use when you need to check if an agent job has completed and retrieve the collected data. Agent jobs autonomously search, navigate, and extract data from the web. |
| `FIRECRAWL_GET_DEEP_RESEARCH_STATUS` | Get deep research status | Retrieves the status and results of a deep research job by its ID. Use when you need to check the progress or retrieve the final analysis of a deep research operation. |
| `FIRECRAWL_GET_THE_STATUS_OF_A_CRAWL_JOB` | Get the status of a crawl job | Retrieves the current status, progress, and details of a web crawl job, using the job ID obtained when the crawl was initiated. |
| `FIRECRAWL_LLMS_TXT_GENERATE` | Generate LLMs.txt for a website | Initiates an async job to generate an LLMs.txt file for a website, converting web content into LLM-friendly format. Returns a job ID to check status and retrieve results. Use when you need to create a standardized, machine-readable representation of website content for language models. |
| `FIRECRAWL_LLMS_TXT_GET` | Get LLMs.txt generation job status | Tool to get the status and results of an LLMs.txt generation job. Use when you need to check if a job has completed and retrieve the generated content. |
| `FIRECRAWL_MAP_MULTIPLE_URLS_BASED_ON_OPTIONS` | Map multiple URLs | Maps a website by discovering URLs from a starting base URL, with options to customize the crawl via search query, subdomain inclusion, sitemap handling, and result limits; search effectiveness is site-dependent. |
| `FIRECRAWL_QUEUE_GET` | Get team queue status | Tool to retrieve metrics about the team's scrape queue. Use when you need to check queue status, job counts, or concurrency limits. |
| `FIRECRAWL_SCRAPE` | Scrape URL | Scrapes a publicly accessible URL, optionally performing pre-scrape browser actions or extracting structured JSON using an LLM, to retrieve content in specified formats. |
| `FIRECRAWL_SEARCH` | Search | Performs a web search for a query, scrapes content from the top search results using Firecrawl, and returns details in specified formats. |
| `FIRECRAWL_START_AGENT` | Start an agent job | Tool to start an agent job for agentic web extraction with multi-page navigation and interaction capabilities. Use when you need to autonomously gather data from the web with complex navigation requirements. The agent can search, navigate, and extract information across multiple pages based on your natural language prompt. |
| `FIRECRAWL_TOKEN_USAGE_GET` | Get team token usage | Tool to retrieve the current team's token usage and balance information for Firecrawl's Extract feature. Use when you need to check remaining token credits, plan allocation, or billing period details. |
| `FIRECRAWL_TOKEN_USAGE_GET_HISTORICAL` | Get historical team token usage | Tool to retrieve historical team token usage on a monthly basis. Use when you need to analyze token consumption patterns over time, optionally segmented by API key. |

## Supported Triggers

None listed.

## Creating MCP Server - Stand-alone vs Composio SDK

The Firecrawl MCP server is an implementation of the Model Context Protocol that connects your AI agent to Firecrawl. It provides structured and secure access so your agent can perform Firecrawl operations on your behalf through a secure, permission-based interface.
With Composio's managed implementation, you don't have to create your own developer app. For production, if you're building an end product, we recommend using your own credentials. The managed server helps you prototype fast and go from 0-1 faster.

## Step-by-step Guide

### 1. Prerequisites

Before you begin, make sure you have:
- Python 3.8/Node 16 or higher installed
- A Composio account with the API key
- An OpenAI API key
- A Firecrawl account and project
- Basic familiarity with async Python/Typescript

### 1. Getting API Keys for OpenAI, Composio, and Firecrawl

No description provided.

### 2. Installing dependencies

No description provided.
```python
pip install composio-llamaindex llama-index llama-index-llms-openai llama-index-tools-mcp python-dotenv
```

```typescript
npm install @composio/llamaindex @llamaindex/openai @llamaindex/tools @llamaindex/workflow dotenv
```

### 3. Set environment variables

Create a .env file in your project root:
These credentials will be used to:
- Authenticate with OpenAI's GPT-5 model
- Connect to Composio's Tool Router
- Identify your Composio user session for Firecrawl access
```bash
OPENAI_API_KEY=your-openai-api-key
COMPOSIO_API_KEY=your-composio-api-key
COMPOSIO_USER_ID=your-user-id
```

### 4. Import modules

No description provided.
```python
import asyncio
import os
import dotenv

from composio import Composio
from composio_llamaindex import LlamaIndexProvider
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

dotenv.load_dotenv()
```

```typescript
import "dotenv/config";
import readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

import { Composio } from "@composio/core";

import { mcp } from "@llamaindex/tools";
import { agent as createAgent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";

dotenv.config();
```

### 5. Load environment variables and initialize Composio

No description provided.
```python
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set in the environment")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set in the environment")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set in the environment")
```

```typescript
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const COMPOSIO_API_KEY = process.env.COMPOSIO_API_KEY;
const COMPOSIO_USER_ID = process.env.COMPOSIO_USER_ID;

if (!OPENAI_API_KEY) throw new Error("OPENAI_API_KEY is not set");
if (!COMPOSIO_API_KEY) throw new Error("COMPOSIO_API_KEY is not set");
if (!COMPOSIO_USER_ID) throw new Error("COMPOSIO_USER_ID is not set");
```

### 6. Create a Tool Router session and build the agent function

What's happening here:
- We create a Composio client using your API key and configure it with the LlamaIndex provider
- We then create a tool router MCP session for your user, specifying the toolkits we want to use (in this case, firecrawl)
- The session returns an MCP HTTP endpoint URL that acts as a gateway to all your configured tools
- LlamaIndex will connect to this endpoint to dynamically discover and use the available Firecrawl tools.
- The MCP tools are mapped to LlamaIndex-compatible tools and plug them into the Agent.
```python
async def build_agent() -> ReActAgent:
    composio_client = Composio(
        api_key=COMPOSIO_API_KEY,
        provider=LlamaIndexProvider(),
    )

    session = composio_client.create(
        user_id=COMPOSIO_USER_ID,
        toolkits=["firecrawl"],
    )

    mcp_url = session.mcp.url
    print(f"Composio MCP URL: {mcp_url}")

    mcp_client = BasicMCPClient(mcp_url, headers={"x-api-key": COMPOSIO_API_KEY})
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-5")

    description = "An agent that uses Composio Tool Router MCP tools to perform Firecrawl actions."
    system_prompt = """
    You are a helpful assistant connected to Composio Tool Router.
    Use the available tools to answer user queries and perform Firecrawl actions.
    """
    return ReActAgent(tools=tools, llm=llm, description=description, system_prompt=system_prompt, verbose=True)
```

```typescript
async function buildAgent() {

  console.log(`Initializing Composio client...${COMPOSIO_USER_ID!}...`);
  console.log(`COMPOSIO_USER_ID: ${COMPOSIO_USER_ID!}...`);

  const composio = new Composio({
    apiKey: COMPOSIO_API_KEY,
    provider: new LlamaindexProvider(),
  });

  const session = await composio.create(
    COMPOSIO_USER_ID!,
    {
      toolkits: ["firecrawl"],
    },
  );

  const mcpUrl = session.mcp.url;
  console.log(`Composio Tool Router MCP URL: ${mcpUrl}`);

  const server = mcp({
    url: mcpUrl,
    clientName: "composio_tool_router_with_llamaindex",
    requestInit: {
      headers: {
        "x-api-key": COMPOSIO_API_KEY!,
      },
    },
    // verbose: true,
  });

  const tools = await server.tools();

  const llm = openai({ apiKey: OPENAI_API_KEY, model: "gpt-5" });

  const agent = createAgent({
    name: "composio_tool_router_with_llamaindex",
        description : "An agent that uses Composio Tool Router MCP tools to perform actions.",
    systemPrompt:
      "You are a helpful assistant connected to Composio Tool Router."+
"Use the available tools to answer user queries and perform Firecrawl actions." ,
    llm,
    tools,
  });

  return agent;
}
```

### 7. Create an interactive chat loop

No description provided.
```python
async def chat_loop(agent: ReActAgent) -> None:
    ctx = Context(agent)
    print("Type 'quit', 'exit', or Ctrl+C to stop.")

    while True:
        try:
            user_input = input("\nYou: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nBye!")
            break

        if not user_input or user_input.lower() in {"quit", "exit"}:
            print("Bye!")
            break

        try:
            print("Agent: ", end="", flush=True)
            handler = agent.run(user_input, ctx=ctx)

            async for event in handler.stream_events():
                # Stream token-by-token from LLM responses
                if hasattr(event, "delta") and event.delta:
                    print(event.delta, end="", flush=True)
                # Show tool calls as they happen
                elif hasattr(event, "tool_name"):
                    print(f"\n[Using tool: {event.tool_name}]", flush=True)

            # Get final response
            response = await handler
            print()  # Newline after streaming
        except KeyboardInterrupt:
            print("\n[Interrupted]")
            continue
        except Exception as e:
            print(f"\nError: {e}")
```

```typescript
async function chatLoop(agent: ReturnType<typeof createAgent>) {
  const rl = readline.createInterface({ input, output });

  console.log("Type 'quit' or 'exit' to stop.");

  while (true) {
    let userInput: string;

    try {
      userInput = (await rl.question("\nYou: ")).trim();
    } catch {
      console.log("\nAgent: Bye!");
      break;
    }

    if (!userInput) {
      continue;
    }

    const lower = userInput.toLowerCase();
    if (lower === "quit" || lower === "exit") {
      console.log("Agent: Bye!");
      break;
    }

    try {
      process.stdout.write("Agent: ");

      const stream = agent.runStream(userInput);
      let finalResult: any = null;

      for await (const event of stream) {
        // The event.data contains the streamed content
        const data: any = event.data;

        // Check for streaming delta content
        if (data?.delta) {
          process.stdout.write(data.delta);
        }

        // Store final result for fallback
        if (data?.result || data?.message) {
          finalResult = data;
        }
      }

      // If no streaming happened, show the final result
      if (finalResult) {
        const answer =
          finalResult.result ??
          finalResult.message?.content ??
          finalResult.message ??
          "";
        if (answer && typeof answer === "string" && !answer.includes("[object")) {
          process.stdout.write(answer);
        }
      }

      console.log(); // New line after streaming completes
    } catch (err: any) {
      console.error("\nAgent error:", err?.message ?? err);
    }
  }

  rl.close();
}
```

### 8. Define the main entry point

What's happening here:
- We're orchestrating the entire application flow
- The agent gets built with proper error handling
- Then we kick off the interactive chat loop so you can start talking to Firecrawl
```python
async def main() -> None:
    agent = await build_agent()
    await chat_loop(agent)

if __name__ == "__main__":
    # Handle Ctrl+C gracefully
    signal.signal(signal.SIGINT, lambda s, f: (print("\nBye!"), exit(0)))
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nBye!")
```

```typescript
async function main() {
  try {
    const agent = await buildAgent();
    await chatLoop(agent);
  } catch (err) {
    console.error("Failed to start agent:", err);
    process.exit(1);
  }
}

main();
```

### 9. Run the agent

When prompted, authenticate and authorise your agent with Firecrawl, then start asking questions.
```bash
python llamaindex_agent.py
```

```typescript
npx ts-node llamaindex-agent.ts
```

## Complete Code

```python
import asyncio
import os
import signal
import dotenv

from composio import Composio
from composio_llamaindex import LlamaIndexProvider
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

dotenv.load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set")

async def build_agent() -> ReActAgent:
    composio_client = Composio(
        api_key=COMPOSIO_API_KEY,
        provider=LlamaIndexProvider(),
    )

    session = composio_client.create(
        user_id=COMPOSIO_USER_ID,
        toolkits=["firecrawl"],
    )

    mcp_url = session.mcp.url
    print(f"Composio MCP URL: {mcp_url}")

    mcp_client = BasicMCPClient(mcp_url, headers={"x-api-key": COMPOSIO_API_KEY})
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-5")
    description = "An agent that uses Composio Tool Router MCP tools to perform Firecrawl actions."
    system_prompt = """
    You are a helpful assistant connected to Composio Tool Router.
    Use the available tools to answer user queries and perform Firecrawl actions.
    """
    return ReActAgent(
        tools=tools,
        llm=llm,
        description=description,
        system_prompt=system_prompt,
        verbose=True,
    );

async def chat_loop(agent: ReActAgent) -> None:
    ctx = Context(agent)
    print("Type 'quit', 'exit', or Ctrl+C to stop.")

    while True:
        try:
            user_input = input("\nYou: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nBye!")
            break

        if not user_input or user_input.lower() in {"quit", "exit"}:
            print("Bye!")
            break

        try:
            print("Agent: ", end="", flush=True)
            handler = agent.run(user_input, ctx=ctx)

            async for event in handler.stream_events():
                # Stream token-by-token from LLM responses
                if hasattr(event, "delta") and event.delta:
                    print(event.delta, end="", flush=True)
                # Show tool calls as they happen
                elif hasattr(event, "tool_name"):
                    print(f"\n[Using tool: {event.tool_name}]", flush=True)

            # Get final response
            response = await handler
            print()  # Newline after streaming
        except KeyboardInterrupt:
            print("\n[Interrupted]")
            continue
        except Exception as e:
            print(f"\nError: {e}")

async def main() -> None:
    agent = await build_agent()
    await chat_loop(agent)

if __name__ == "__main__":
    # Handle Ctrl+C gracefully
    signal.signal(signal.SIGINT, lambda s, f: (print("\nBye!"), exit(0)))
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nBye!")
```

```typescript
import "dotenv/config";
import readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

import { Composio } from "@composio/core";
import { LlamaindexProvider } from "@composio/llamaindex";

import { mcp } from "@llamaindex/tools";
import { agent as createAgent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";

dotenv.config();

const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const COMPOSIO_API_KEY = process.env.COMPOSIO_API_KEY;
const COMPOSIO_USER_ID = process.env.COMPOSIO_USER_ID;

if (!OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is not set in the environment");
  }
if (!COMPOSIO_API_KEY) {
    throw new Error("COMPOSIO_API_KEY is not set in the environment");
  }
if (!COMPOSIO_USER_ID) {
    throw new Error("COMPOSIO_USER_ID is not set in the environment");
  }

async function buildAgent() {

  console.log(`Initializing Composio client...${COMPOSIO_USER_ID!}...`);
  console.log(`COMPOSIO_USER_ID: ${COMPOSIO_USER_ID!}...`);

  const composio = new Composio({
    apiKey: COMPOSIO_API_KEY,
    provider: new LlamaindexProvider(),
  });

  const session = await composio.create(
    COMPOSIO_USER_ID!,
    {
      toolkits: ["firecrawl"],
    },
  );

  const mcpUrl = session.mcp.url;
  console.log(`Composio Tool Router MCP URL: ${mcpUrl}`);

  const server = mcp({
    url: mcpUrl,
    clientName: "composio_tool_router_with_llamaindex",
    requestInit: {
      headers: {
        "x-api-key": COMPOSIO_API_KEY!,
      },
    },
    // verbose: true,
  });

  const tools = await server.tools();

  const llm = openai({ apiKey: OPENAI_API_KEY, model: "gpt-5" });

  const agent = createAgent({
    name: "composio_tool_router_with_llamaindex",
    description:
      "An agent that uses Composio Tool Router MCP tools to perform actions.",
    systemPrompt:
      "You are a helpful assistant connected to Composio Tool Router."+
"Use the available tools to answer user queries and perform Firecrawl actions." ,
    llm,
    tools,
  });

  return agent;
}

async function chatLoop(agent: ReturnType<typeof createAgent>) {
  const rl = readline.createInterface({ input, output });

  console.log("Type 'quit' or 'exit' to stop.");

  while (true) {
    let userInput: string;

    try {
      userInput = (await rl.question("\nYou: ")).trim();
    } catch {
      console.log("\nAgent: Bye!");
      break;
    }

    if (!userInput) {
      continue;
    }

    const lower = userInput.toLowerCase();
    if (lower === "quit" || lower === "exit") {
      console.log("Agent: Bye!");
      break;
    }

    try {
      process.stdout.write("Agent: ");

      const stream = agent.runStream(userInput);
      let finalResult: any = null;

      for await (const event of stream) {
        // The event.data contains the streamed content
        const data: any = event.data;

        // Check for streaming delta content
        if (data?.delta) {
          process.stdout.write(data.delta);
        }

        // Store final result for fallback
        if (data?.result || data?.message) {
          finalResult = data;
        }
      }

      // If no streaming happened, show the final result
      if (finalResult) {
        const answer =
          finalResult.result ??
          finalResult.message?.content ??
          finalResult.message ??
          "";
        if (answer && typeof answer === "string" && !answer.includes("[object")) {
          process.stdout.write(answer);
        }
      }

      console.log(); // New line after streaming completes
    } catch (err: any) {
      console.error("\nAgent error:", err?.message ?? err);
    }
  }

  rl.close();
}

async function main() {
  try {
    const agent = await buildAgent();
    await chatLoop(agent);
  } catch (err: any) {
    console.error("Failed to start agent:", err?.message ?? err);
    process.exit(1);
  }
}

main();
```

## Conclusion

You've successfully connected Firecrawl to LlamaIndex through Composio's Tool Router MCP layer.
Key takeaways:
- Tool Router dynamically exposes Firecrawl tools through an MCP endpoint
- LlamaIndex's ReActAgent handles reasoning and orchestration; Composio handles integrations
- The agent becomes more capable without increasing prompt size
- Async Python provides clean, efficient execution of agent workflows
You can easily extend this to other toolkits like Gmail, Notion, Stripe, GitHub, and more by adding them to the toolkits parameter.

## How to build Firecrawl MCP Agent with another framework

- [ChatGPT](https://composio.dev/toolkits/firecrawl/framework/chatgpt)
- [OpenAI Agents SDK](https://composio.dev/toolkits/firecrawl/framework/open-ai-agents-sdk)
- [Claude Agent SDK](https://composio.dev/toolkits/firecrawl/framework/claude-agents-sdk)
- [Claude Code](https://composio.dev/toolkits/firecrawl/framework/claude-code)
- [Claude Cowork](https://composio.dev/toolkits/firecrawl/framework/claude-cowork)
- [Codex](https://composio.dev/toolkits/firecrawl/framework/codex)
- [Cursor](https://composio.dev/toolkits/firecrawl/framework/cursor)
- [VS Code](https://composio.dev/toolkits/firecrawl/framework/vscode)
- [OpenCode](https://composio.dev/toolkits/firecrawl/framework/opencode)
- [OpenClaw](https://composio.dev/toolkits/firecrawl/framework/openclaw)
- [Hermes](https://composio.dev/toolkits/firecrawl/framework/hermes-agent)
- [CLI](https://composio.dev/toolkits/firecrawl/framework/cli)
- [Google ADK](https://composio.dev/toolkits/firecrawl/framework/google-adk)
- [LangChain](https://composio.dev/toolkits/firecrawl/framework/langchain)
- [Vercel AI SDK](https://composio.dev/toolkits/firecrawl/framework/ai-sdk)
- [Mastra AI](https://composio.dev/toolkits/firecrawl/framework/mastra-ai)
- [CrewAI](https://composio.dev/toolkits/firecrawl/framework/crew-ai)

## Related Toolkits

- [Tavily](https://composio.dev/toolkits/tavily) - Tavily offers powerful search and data retrieval from documents, databases, and the web. It helps teams locate and filter information instantly, saving hours on research.
- [Exa](https://composio.dev/toolkits/exa) - Exa is a data extraction and search platform for gathering and analyzing information from websites, APIs, or databases. It helps teams quickly surface insights and automate data-driven workflows.
- [Serpapi](https://composio.dev/toolkits/serpapi) - SerpApi is a real-time API for structured search engine results. It lets you automate SERP data collection, parsing, and analysis for SEO and research.
- [Peopledatalabs](https://composio.dev/toolkits/peopledatalabs) - Peopledatalabs delivers B2B data enrichment and identity resolution APIs. Supercharge your apps with accurate, up-to-date business and contact data.
- [Snowflake](https://composio.dev/toolkits/snowflake) - Snowflake is a cloud data warehouse built for elastic scaling, secure data sharing, and fast SQL analytics across major clouds.
- [Posthog](https://composio.dev/toolkits/posthog) - PostHog is an open-source analytics platform for tracking user interactions and product metrics. It helps teams refine features, analyze funnels, and reduce churn with actionable insights.
- [Amplitude](https://composio.dev/toolkits/amplitude) - Amplitude is a digital analytics platform for product and behavioral data insights. It helps teams analyze user journeys and make data-driven decisions quickly.
- [Bright Data MCP](https://composio.dev/toolkits/brightdata_mcp) - Bright Data MCP is an AI-powered web scraping and data collection platform. Instantly access public web data in real time with advanced scraping tools.
- [Browseai](https://composio.dev/toolkits/browseai) - Browseai is a web automation and data extraction platform that turns any website into an API. It's perfect for monitoring websites and retrieving structured data without manual scraping.
- [ClickHouse](https://composio.dev/toolkits/clickhouse) - ClickHouse is an open-source, column-oriented database for real-time analytics and big data processing using SQL. Its lightning-fast query performance makes it ideal for handling large datasets and delivering instant insights.
- [Coinmarketcal](https://composio.dev/toolkits/coinmarketcal) - CoinMarketCal is a community-powered crypto calendar for upcoming events, announcements, and releases. It helps traders track market-moving developments and stay ahead in the crypto space.
- [Control d](https://composio.dev/toolkits/control_d) - Control d is a customizable DNS filtering and traffic redirection platform. It helps you manage internet access, enforce policies, and monitor usage across devices and networks.
- [Databox](https://composio.dev/toolkits/databox) - Databox is a business analytics platform that connects your data from any tool and device. It helps you track KPIs, build dashboards, and discover actionable insights.
- [Databricks](https://composio.dev/toolkits/databricks) - Databricks is a unified analytics platform for big data and AI on the lakehouse architecture. It empowers data teams to collaborate, analyze, and build scalable solutions efficiently.
- [Datagma](https://composio.dev/toolkits/datagma) - Datagma delivers data intelligence and analytics for business growth and market discovery. Get actionable market insights and track competitors to inform your strategy.
- [Delighted](https://composio.dev/toolkits/delighted) - Delighted is a customer feedback platform based on the Net Promoter System®. It helps you quickly gather, track, and act on customer sentiment.
- [Dovetail](https://composio.dev/toolkits/dovetail) - Dovetail is a research analysis platform for transcript review and insight generation. It helps teams code interviews, analyze feedback, and create actionable research summaries.
- [Dub](https://composio.dev/toolkits/dub) - Dub is a short link management platform with analytics and API access. Use it to easily create, manage, and track branded short links for your business.
- [Elasticsearch](https://composio.dev/toolkits/elasticsearch) - Elasticsearch is a distributed, RESTful search and analytics engine for all types of data. It delivers fast, scalable search and powerful analytics across massive datasets.
- [Fireflies](https://composio.dev/toolkits/fireflies) - Fireflies.ai is an AI-powered meeting assistant that records, transcribes, and analyzes voice conversations. It helps teams capture call notes automatically and search or summarize meetings effortlessly.

## Frequently Asked Questions

### What are the differences in Tool Router MCP and Firecrawl MCP?

With a standalone Firecrawl MCP server, the agents and LLMs can only access a fixed set of Firecrawl tools tied to that server. However, with the Composio Tool Router, agents can dynamically load tools from Firecrawl and many other apps based on the task at hand, all through a single MCP endpoint.

### Can I use Tool Router MCP with LlamaIndex?

Yes, you can. LlamaIndex fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Firecrawl tools.

### Can I manage the permissions and scopes for Firecrawl while using Tool Router?

Yes, absolutely. You can configure which Firecrawl scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

### How safe is my data with Composio Tool Router?

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Firecrawl data and credentials are handled as safely as possible.

---
[See all toolkits](https://composio.dev/toolkits) · [Composio docs](https://docs.composio.dev/llms.txt)