# How to integrate Scrapingbee MCP with LangChain

```json
{
  "title": "How to integrate Scrapingbee MCP with LangChain",
  "toolkit": "Scrapingbee",
  "toolkit_slug": "scrapingbee",
  "framework": "LangChain",
  "framework_slug": "langchain",
  "url": "https://composio.dev/toolkits/scrapingbee/framework/langchain",
  "markdown_url": "https://composio.dev/toolkits/scrapingbee/framework/langchain.md",
  "updated_at": "2026-05-12T10:24:52.875Z"
}
```

## Introduction

This guide walks you through connecting Scrapingbee to LangChain using the Composio tool router. By the end, you'll have a working Scrapingbee agent that can extract product prices from Amazon search results, fetch the latest news headlines from the BBC homepage, and bypass anti-bot protections to scrape booking data, all through natural language commands.
You'll learn how to give your LangChain agent real control over a Scrapingbee account through Composio's Scrapingbee MCP server.
Before we dive in, let's take a quick look at the key ideas and tools involved.

## Also integrate Scrapingbee with

- [OpenAI Agents SDK](https://composio.dev/toolkits/scrapingbee/framework/open-ai-agents-sdk)
- [Claude Agent SDK](https://composio.dev/toolkits/scrapingbee/framework/claude-agents-sdk)
- [Claude Code](https://composio.dev/toolkits/scrapingbee/framework/claude-code)
- [Claude Cowork](https://composio.dev/toolkits/scrapingbee/framework/claude-cowork)
- [Codex](https://composio.dev/toolkits/scrapingbee/framework/codex)
- [OpenClaw](https://composio.dev/toolkits/scrapingbee/framework/openclaw)
- [Hermes](https://composio.dev/toolkits/scrapingbee/framework/hermes-agent)
- [CLI](https://composio.dev/toolkits/scrapingbee/framework/cli)
- [Google ADK](https://composio.dev/toolkits/scrapingbee/framework/google-adk)
- [Vercel AI SDK](https://composio.dev/toolkits/scrapingbee/framework/ai-sdk)
- [Mastra AI](https://composio.dev/toolkits/scrapingbee/framework/mastra-ai)
- [LlamaIndex](https://composio.dev/toolkits/scrapingbee/framework/llama-index)
- [CrewAI](https://composio.dev/toolkits/scrapingbee/framework/crew-ai)

## TL;DR

Here's what you'll learn:
- Get and set up your OpenAI and Composio API keys
- Connect your Scrapingbee project to Composio
- Create a Tool Router MCP session for Scrapingbee
- Initialize an MCP client and retrieve Scrapingbee tools
- Build a LangChain agent that can interact with Scrapingbee
- Set up an interactive chat interface for testing

## What is LangChain?

LangChain is a framework for developing applications powered by language models. It provides tools and abstractions for building agents that can reason, use tools, and maintain conversation context.
Key features include:
- Agent Framework: Build agents that can use tools and make decisions
- MCP Integration: Connect to external services through Model Context Protocol adapters
- Memory Management: Maintain conversation history across interactions
- Multi-Provider Support: Works with OpenAI, Anthropic, and other LLM providers
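The agent loop that these features automate can be sketched in a few lines of plain Python. This is a toy illustration only: `fake_llm` and `get_length` are invented stand-ins, not LangChain APIs. The core idea is that the model picks a tool by name, and the framework executes it and feeds the result back.

```python
# Toy sketch of the agent loop LangChain automates: a model decides which
# tool to call, the framework executes it and returns the result.
# `fake_llm` and `get_length` are invented for illustration only.

def fake_llm(prompt: str) -> dict:
    # A real model would choose a tool based on the prompt; we hardcode it.
    return {"tool": "get_length", "args": {"text": "hello"}}

tools = {"get_length": lambda text: len(text)}

decision = fake_llm("How long is the word 'hello'?")
result = tools[decision["tool"]](**decision["args"])
print(result)  # 5
```

In the real framework, this dispatch happens inside the agent executor, with the tool schemas and results serialized as messages.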

## What is the Scrapingbee MCP server, and what's possible with it?

The Scrapingbee MCP server is an implementation of the Model Context Protocol that connects your AI agents and assistants (such as Claude or Cursor) directly to your Scrapingbee account. It provides structured and secure access to powerful web scraping tools, so your agent can extract data, fetch HTML, bypass anti-bot measures, and monitor your usage, all without manual code.
- Structured data extraction from any webpage: Let your agent pull tables, lists, or custom data using CSS or XPath selectors with ScrapingBee's extraction rules.
- Fetch full HTML or page screenshots: Ask your agent to retrieve raw page markup or rendered screenshots, including support for JavaScript-heavy sites.
- Proxy and stealth scraping: Enable your agent to scrape sites that block bots by routing requests through ScrapingBee proxies or stealth modes to bypass anti-bot defenses.
- Resource control and custom rendering: Have your agent fine-tune scraping with options to block resources, control JS rendering, and speed up extraction for tricky websites.
- Monitor account usage and limits: Keep track of your remaining ScrapingBee credits and usage statistics directly through your agent for seamless quota management.
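For the structured-extraction case, ScrapingBee's extraction rules are plain JSON that maps output field names to CSS selectors (with nested rules for lists). The sketch below is illustrative: the selectors and field names are made up, not taken from a real page.

```python
import json

# Illustrative extract_rules payload for ScrapingBee's data extraction:
# keys are output field names, values are CSS selectors, or nested rules
# with "selector"/"type" for lists. These selectors are made up.
extract_rules = {
    "title": "h1",
    "prices": {
        "selector": ".price",
        "type": "list",
    },
}

# When the agent calls SCRAPINGBEE_DATA_EXTRACTION, a payload like this
# is what travels over the wire; serialized, it is just JSON.
print(json.dumps(extract_rules))
```

Your agent assembles rules like these from your natural language request, so you rarely need to write them by hand.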

## Supported Tools

| Tool slug | Name | Description |
|---|---|---|
| `SCRAPINGBEE_DATA_EXTRACTION` | ScrapingBee Data Extraction | Tool to extract structured data from a webpage using CSS or XPath selectors. Use ScrapingBee's extract_rules feature. |
| `SCRAPINGBEE_HTML_FETCH` | ScrapingBee HTML Fetch | Tool to fetch HTML or screenshot via ScrapingBee HTML API. Use when you need page markup or image after optional JS rendering and resource controls. For anti-bot or CAPTCHA-protected sites (e.g., Cloudflare), combine render_js=true with premium_proxy=true or stealth_proxy=true to avoid blocks. |
| `SCRAPINGBEE_SCRAPING_BEE_PROXY_MODE` | ScrapingBee Proxy Mode | Tool to fetch web content via ScrapingBee's Proxy Mode. Use when you need to route requests through ScrapingBee proxies with optional JS rendering and resource blocking. |
| `SCRAPINGBEE_STEALTH_PROXY` | ScrapingBee Stealth Proxy | Tool to perform stealth scraping via ScrapingBee's Stealth Proxy mode. Use when you encounter anti-bot measures requiring undetectable requests. |
| `SCRAPINGBEE_USAGE_STATS` | ScrapingBee Usage Stats | Tool to retrieve usage statistics for your ScrapingBee account. Use when you need to monitor remaining credits and request count. |
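To make the table concrete, here is one illustrative natural-language prompt per documented tool. The agent routes free-form requests to tool calls on its own; these pairings are examples, not a fixed mapping.

```python
# Illustrative prompts you might give the agent for each documented tool.
# The pairings are examples only; the agent decides routing at runtime.
EXAMPLE_PROMPTS = {
    "SCRAPINGBEE_DATA_EXTRACTION": "Extract all product names and prices from this page",
    "SCRAPINGBEE_HTML_FETCH": "Fetch the rendered HTML of https://example.com with JS enabled",
    "SCRAPINGBEE_SCRAPING_BEE_PROXY_MODE": "Fetch this page through a ScrapingBee proxy",
    "SCRAPINGBEE_STEALTH_PROXY": "Scrape this page that keeps blocking normal requests",
    "SCRAPINGBEE_USAGE_STATS": "How many ScrapingBee credits do I have left?",
}

for slug, prompt in EXAMPLE_PROMPTS.items():
    print(f"{slug}: {prompt}")
```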

## Supported Triggers

None listed.

## Creating MCP Server - Stand-alone vs Composio SDK

The Scrapingbee MCP server is an implementation of the Model Context Protocol that connects your AI agent to Scrapingbee. It provides structured and secure access so your agent can perform Scrapingbee operations on your behalf through a secure, permission-based interface.
With Composio's managed implementation, you don't have to create your own developer app. The managed server helps you prototype fast and go from 0 to 1 quickly; for production, if you're building an end product, we recommend using your own credentials.

## Step-by-step Guide

### Prerequisites

Before starting, make sure you have:
- Python 3.10+ or Node.js 18+ installed
- An OpenAI account (or another supported model provider)
- A Composio account
- A Scrapingbee account

### 1. Getting API Keys for OpenAI and Composio

**OpenAI API Key**
- Go to the [OpenAI dashboard](https://platform.openai.com/settings/organization/api-keys) and create an API key. You'll need credits to use the models, or you can connect to another model provider.
- Keep the API key safe.

**Composio API Key**
- Log in to the [Composio dashboard](https://dashboard.composio.dev?utm_source=toolkits&utm_medium=framework_docs).
- Navigate to your API settings and generate a new API key.
- Store this key securely as you'll need it for authentication.

### 2. Install dependencies

Install the Composio and LangChain packages for Python or TypeScript, depending on your stack.
```bash
pip install composio-langchain langchain-mcp-adapters langchain python-dotenv
```

```bash
npm install @composio/langchain @langchain/core @langchain/openai @langchain/mcp-adapters dotenv
```

### 3. Set up environment variables

Create a `.env` file in your project root.
What's happening:
- COMPOSIO_API_KEY authenticates your requests to Composio's API
- COMPOSIO_USER_ID identifies the user for session management
- OPENAI_API_KEY enables access to OpenAI's language models
```bash
COMPOSIO_API_KEY=your_composio_api_key_here
COMPOSIO_USER_ID=your_composio_user_id_here
OPENAI_API_KEY=your_openai_api_key_here
```
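If you prefer to fail fast, a small helper can report every missing variable at once instead of stopping at the first. This helper is illustrative, not part of any SDK.

```python
import os

# Illustrative helper (not part of any SDK): collect every missing
# required environment variable so you can report them all at once.
def missing_env(names):
    return [n for n in names if not os.getenv(n)]

required = ["COMPOSIO_API_KEY", "COMPOSIO_USER_ID", "OPENAI_API_KEY"]
missing = missing_env(required)
if missing:
    # In your script you would raise here instead, e.g.:
    # raise SystemExit(f"Missing env vars: {', '.join(missing)}")
    print(f"Missing env vars: {', '.join(missing)}")
```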

### 4. Import dependencies

Import the required modules and load your environment variables.
```python
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.agents import create_agent
from dotenv import load_dotenv
from composio import Composio
import asyncio
import os

load_dotenv()
```

```typescript
import { Composio } from '@composio/core';
import { LangchainProvider } from '@composio/langchain';
import { MultiServerMCPClient } from "@langchain/mcp-adapters";
import { createAgent } from "langchain";
import * as readline from 'readline';
import 'dotenv/config';
```

### 5. Initialize Composio client

What's happening:
- We're loading the COMPOSIO_API_KEY from environment variables and validating it exists
- Creating a Composio instance that will manage our connection to Scrapingbee tools
- Validating that COMPOSIO_USER_ID is also set before proceeding
```python
async def main():
    if not os.getenv("COMPOSIO_API_KEY"):
        raise ValueError("COMPOSIO_API_KEY is not set")
    if not os.getenv("COMPOSIO_USER_ID"):
        raise ValueError("COMPOSIO_USER_ID is not set")

    composio = Composio(api_key=os.getenv("COMPOSIO_API_KEY"))
```

```typescript
const composioApiKey = process.env.COMPOSIO_API_KEY;
const userId = process.env.COMPOSIO_USER_ID;

if (!composioApiKey) throw new Error('COMPOSIO_API_KEY is not set');
if (!userId) throw new Error('COMPOSIO_USER_ID is not set');

async function main() {
    const composio = new Composio({
        apiKey: composioApiKey as string,
        provider: new LangchainProvider()
    });
```

### 6. Create a Tool Router session

What's happening:
- We're creating a Tool Router session that gives your agent access to Scrapingbee tools
- The create method takes the user ID and specifies which toolkits should be available
- The returned session.mcp.url is the MCP server URL that your agent will use
- This approach allows the agent to dynamically load and use Scrapingbee tools as needed
```python
# Create Tool Router session for Scrapingbee
session = composio.create(
    user_id=os.getenv("COMPOSIO_USER_ID"),
    toolkits=['scrapingbee']
)

url = session.mcp.url
```

```typescript
const session = await composio.create(
    userId as string,
    {
        toolkits: ['scrapingbee']
    }
);

const url = session.mcp.url;
```

### 7. Configure the agent with the MCP URL

What's happening:
- The MultiServerMCPClient connects to your Tool Router session URL, authenticating with your Composio API key
- get_tools() fetches the Scrapingbee tool definitions exposed by the MCP server
- create_agent wires those tools into a LangChain agent backed by your chosen model
```python
client = MultiServerMCPClient({
    "scrapingbee-agent": {
        "transport": "streamable_http",
        "url": session.mcp.url,
        "headers": {
            "x-api-key": os.getenv("COMPOSIO_API_KEY")
        }
    }
})

tools = await client.get_tools()

agent = create_agent("gpt-5", tools)
```

```typescript
const client = new MultiServerMCPClient({
    "scrapingbee-agent": {
        transport: "http",
        url: url,
        headers: {
            "x-api-key": process.env.COMPOSIO_API_KEY
        }
    }
});

const tools = await client.getTools();

const agent = createAgent({ model: "gpt-5", tools });
```
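After the tools are loaded, it's worth a quick sanity check that the Scrapingbee tools you expect (the five slugs from the table above) actually came back. This sketch assumes each tool object exposes a `name` attribute matching its slug; adjust if your adapter exposes the slug differently.

```python
# Sanity-check loaded tool names against the slugs documented above.
# Assumes each tool object has a `name` attribute; adjust as needed.
EXPECTED = {
    "SCRAPINGBEE_DATA_EXTRACTION",
    "SCRAPINGBEE_HTML_FETCH",
    "SCRAPINGBEE_SCRAPING_BEE_PROXY_MODE",
    "SCRAPINGBEE_STEALTH_PROXY",
    "SCRAPINGBEE_USAGE_STATS",
}

def check_tools(tool_names):
    """Return the expected slugs missing from tool_names, sorted."""
    return sorted(EXPECTED - set(tool_names))

# Example with a hypothetical partial load: prints the slugs not found.
print(check_tools(["SCRAPINGBEE_HTML_FETCH", "SCRAPINGBEE_USAGE_STATS"]))
```

In your script you would call `check_tools([t.name for t in tools])` right after retrieving the tools and warn if anything is missing.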

### 8. Set up interactive chat interface

What's happening:
- A simple REPL loop reads user input and appends it to the conversation history
- Each turn, the full history is sent to the agent so it keeps context across messages
- The agent's final message is printed back to the user
```python
conversation_history = []

print("Chat started! Type 'exit' or 'quit' to end the conversation.\n")
print("Ask any Scrapingbee related question or task to the agent.\n")

while True:
    user_input = input("You: ").strip()

    if user_input.lower() in ['exit', 'quit', 'bye']:
        print("\nGoodbye!")
        break

    if not user_input:
        continue

    conversation_history.append({"role": "user", "content": user_input})
    print("\nAgent is thinking...\n")

    response = await agent.ainvoke({"messages": conversation_history})
    conversation_history = response['messages']
    final_response = response['messages'][-1].content
    print(f"Agent: {final_response}\n")
```

```typescript
let conversationHistory: any[] = [];

console.log("Chat started! Type 'exit' or 'quit' to end the conversation.\n");
console.log("Ask any Scrapingbee related question or task to the agent.\n");

const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
    prompt: 'You: '
});

rl.prompt();

rl.on('line', async (userInput: string) => {
    const trimmedInput = userInput.trim();

    if (['exit', 'quit', 'bye'].includes(trimmedInput.toLowerCase())) {
        console.log("\nGoodbye!");
        rl.close();
        process.exit(0);
    }

    if (!trimmedInput) {
        rl.prompt();
        return;
    }

    conversationHistory.push({ role: "user", content: trimmedInput });
    console.log("\nAgent is thinking...\n");

    const response = await agent.invoke({ messages: conversationHistory });
    conversationHistory = response.messages;

    const finalResponse = response.messages[response.messages.length - 1]?.content;
    console.log(`Agent: ${finalResponse}\n`);

    rl.prompt();
});

rl.on('close', () => {
    console.log('\nSession ended.');
    process.exit(0);
});
```
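Because the full history is re-sent every turn, long chats grow the prompt without bound. A small trimming helper (illustrative; simply keeps the most recent messages) can cap it before each agent call:

```python
# Illustrative helper: cap conversation history to the most recent N
# messages before each agent call, so the prompt doesn't grow unbounded.
def trim_history(messages, max_messages=20):
    if len(messages) <= max_messages:
        return messages
    return messages[-max_messages:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(30)]
trimmed = trim_history(history, max_messages=20)
print(len(trimmed))  # 20
```

Note that naive trimming can separate a tool call from its result; in production you'd trim on turn boundaries or summarize older context instead.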

### 9. Run the application

Start the script from your terminal. The Python version guards the async entry point with `__main__`; the TypeScript version catches fatal errors at the top level.
```python
if __name__ == "__main__":
    asyncio.run(main())
```

```typescript
main().catch((err) => {
    console.error('Fatal error:', err);
    process.exit(1);
});
```

## Complete Code

```python
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.agents import create_agent
from dotenv import load_dotenv
from composio import Composio
import asyncio
import os

load_dotenv()

async def main():
    if not os.getenv("COMPOSIO_API_KEY"):
        raise ValueError("COMPOSIO_API_KEY is not set")
    if not os.getenv("COMPOSIO_USER_ID"):
        raise ValueError("COMPOSIO_USER_ID is not set")

    composio = Composio(api_key=os.getenv("COMPOSIO_API_KEY"))
    
    session = composio.create(
        user_id=os.getenv("COMPOSIO_USER_ID"),
        toolkits=['scrapingbee']
    )

    url = session.mcp.url
    
    client = MultiServerMCPClient({
        "scrapingbee-agent": {
            "transport": "streamable_http",
            "url": url,
            "headers": {
                "x-api-key": os.getenv("COMPOSIO_API_KEY")
            }
        }
    })
    
    tools = await client.get_tools()
  
    agent = create_agent("gpt-5", tools)
    
    conversation_history = []
    
    print("Chat started! Type 'exit' or 'quit' to end the conversation.\n")
    print("Ask any Scrapingbee related question or task to the agent.\n")
    
    while True:
        user_input = input("You: ").strip()
        
        if user_input.lower() in ['exit', 'quit', 'bye']:
            print("\nGoodbye!")
            break
        
        if not user_input:
            continue
        
        conversation_history.append({"role": "user", "content": user_input})
        print("\nAgent is thinking...\n")
        
        response = await agent.ainvoke({"messages": conversation_history})
        conversation_history = response['messages']
        final_response = response['messages'][-1].content
        print(f"Agent: {final_response}\n")

if __name__ == "__main__":
    asyncio.run(main())
```

```typescript
import { Composio } from '@composio/core';
import { LangchainProvider } from '@composio/langchain';
import { MultiServerMCPClient } from "@langchain/mcp-adapters";  
import { createAgent } from "langchain";
import * as readline from 'readline';
import 'dotenv/config';

const composioApiKey = process.env.COMPOSIO_API_KEY;
const userId = process.env.COMPOSIO_USER_ID;

if (!composioApiKey) throw new Error('COMPOSIO_API_KEY is not set');
if (!userId) throw new Error('COMPOSIO_USER_ID is not set');

async function main() {
    const composio = new Composio({
        apiKey: composioApiKey as string,
        provider: new LangchainProvider()
    });

    const session = await composio.create(
        userId as string,
        {
            toolkits: ['scrapingbee']
        }
    );

    const url = session.mcp.url;
    
    const client = new MultiServerMCPClient({
        "scrapingbee-agent": {
            transport: "http",
            url: url,
            headers: {
                "x-api-key": process.env.COMPOSIO_API_KEY
            }
        }
    });
    
    const tools = await client.getTools();
  
    const agent = createAgent({ model: "gpt-5", tools });
    
    let conversationHistory: any[] = [];
    
    console.log("Chat started! Type 'exit' or 'quit' to end the conversation.\n");
    console.log("Ask any Scrapingbee related question or task to the agent.\n");
    
    const rl = readline.createInterface({
        input: process.stdin,
        output: process.stdout,
        prompt: 'You: '
    });

    rl.prompt();

    rl.on('line', async (userInput: string) => {
        const trimmedInput = userInput.trim();
        
        if (['exit', 'quit', 'bye'].includes(trimmedInput.toLowerCase())) {
            console.log("\nGoodbye!");
            rl.close();
            process.exit(0);
        }
        
        if (!trimmedInput) {
            rl.prompt();
            return;
        }
        
        conversationHistory.push({ role: "user", content: trimmedInput });
        console.log("\nAgent is thinking...\n");
        
        const response = await agent.invoke({ messages: conversationHistory });
        conversationHistory = response.messages;
        
        const finalResponse = response.messages[response.messages.length - 1]?.content;
        console.log(`Agent: ${finalResponse}\n`);
        
        rl.prompt();
    });

    rl.on('close', () => {
        console.log('\nSession ended.');
        process.exit(0);
    });
}

main().catch((err) => {
    console.error('Fatal error:', err);
    process.exit(1);
});
```

## Conclusion

You've successfully built a LangChain agent that can interact with Scrapingbee through Composio's Tool Router.
Key features of this implementation:
- Dynamic tool loading through Composio's Tool Router
- Conversation history maintenance for context-aware responses
- Async execution for clean, efficient agent workflows in both Python and TypeScript
You can extend this further by adding error handling, implementing specific business logic, or integrating additional Composio toolkits to create multi-app workflows.

## How to build Scrapingbee MCP Agent with another framework

- [OpenAI Agents SDK](https://composio.dev/toolkits/scrapingbee/framework/open-ai-agents-sdk)
- [Claude Agent SDK](https://composio.dev/toolkits/scrapingbee/framework/claude-agents-sdk)
- [Claude Code](https://composio.dev/toolkits/scrapingbee/framework/claude-code)
- [Claude Cowork](https://composio.dev/toolkits/scrapingbee/framework/claude-cowork)
- [Codex](https://composio.dev/toolkits/scrapingbee/framework/codex)
- [OpenClaw](https://composio.dev/toolkits/scrapingbee/framework/openclaw)
- [Hermes](https://composio.dev/toolkits/scrapingbee/framework/hermes-agent)
- [CLI](https://composio.dev/toolkits/scrapingbee/framework/cli)
- [Google ADK](https://composio.dev/toolkits/scrapingbee/framework/google-adk)
- [Vercel AI SDK](https://composio.dev/toolkits/scrapingbee/framework/ai-sdk)
- [Mastra AI](https://composio.dev/toolkits/scrapingbee/framework/mastra-ai)
- [LlamaIndex](https://composio.dev/toolkits/scrapingbee/framework/llama-index)
- [CrewAI](https://composio.dev/toolkits/scrapingbee/framework/crew-ai)

## Related Toolkits

- [Supabase](https://composio.dev/toolkits/supabase) - Supabase is an open-source backend platform offering scalable Postgres databases, authentication, storage, and real-time APIs. It lets developers build modern apps without managing infrastructure.
- [Codeinterpreter](https://composio.dev/toolkits/codeinterpreter) - Codeinterpreter is a Python-based coding environment with built-in data analysis and visualization. It lets you instantly run scripts, plot results, and prototype solutions inside supported platforms.
- [GitHub](https://composio.dev/toolkits/github) - GitHub is a code hosting platform for version control and collaborative software development. It streamlines project management, code review, and team workflows in one place.
- [Ably](https://composio.dev/toolkits/ably) - Ably is a real-time messaging platform for live chat and data sync in modern apps. It offers global scale and rock-solid reliability for seamless, instant experiences.
- [AbuseIPDB](https://composio.dev/toolkits/abuselpdb) - AbuseIPDB is a central database for reporting and checking IPs linked to malicious online activity. Use it to quickly identify and report suspicious or abusive IP addresses.
- [Alchemy](https://composio.dev/toolkits/alchemy) - Alchemy is a blockchain development platform offering APIs and tools for Ethereum apps. It simplifies building and scaling Web3 projects with robust infrastructure.
- [Algolia](https://composio.dev/toolkits/algolia) - Algolia is a hosted search API that powers lightning-fast, relevant search experiences for web and mobile apps. It helps developers deliver instant, typo-tolerant, and scalable search without complex infrastructure.
- [Anchor browser](https://composio.dev/toolkits/anchor_browser) - Anchor browser is a developer platform for AI-powered web automation. It transforms complex browser actions into easy API endpoints for streamlined web interaction.
- [Apiflash](https://composio.dev/toolkits/apiflash) - Apiflash is a website screenshot API for programmatically capturing web pages. It delivers high-quality screenshots on demand for automation, monitoring, or reporting.
- [Apiverve](https://composio.dev/toolkits/apiverve) - Apiverve delivers a suite of powerful APIs that simplify integration for developers. It's designed for reliability and scalability so you can build faster, smarter applications without the integration headache.
- [Appcircle](https://composio.dev/toolkits/appcircle) - Appcircle is an enterprise-grade mobile CI/CD platform for building, testing, and publishing mobile apps. It streamlines mobile DevOps so teams ship faster and with more confidence.
- [Appdrag](https://composio.dev/toolkits/appdrag) - Appdrag is a cloud platform for building websites, APIs, and databases with drag-and-drop tools and code editing. It accelerates development and iteration by combining hosting, database management, and low-code features in one place.
- [Appveyor](https://composio.dev/toolkits/appveyor) - AppVeyor is a cloud-based continuous integration service for building, testing, and deploying applications. It helps developers automate and streamline their software delivery pipelines.
- [Backendless](https://composio.dev/toolkits/backendless) - Backendless is a backend-as-a-service platform for mobile and web apps, offering database, file storage, user authentication, and APIs. It helps developers ship scalable applications faster without managing server infrastructure.
- [Baserow](https://composio.dev/toolkits/baserow) - Baserow is an open-source no-code database platform for building collaborative data apps. It makes it easy for teams to organize data and automate workflows without writing code.
- [Bench](https://composio.dev/toolkits/bench) - Bench is a benchmarking tool for automated performance measurement and analysis. It helps you quickly evaluate, compare, and track your systems or workflows.
- [Better stack](https://composio.dev/toolkits/better_stack) - Better Stack is a monitoring, logging, and incident management solution for apps and services. It helps teams ensure application reliability and performance with real-time insights.
- [Bitbucket](https://composio.dev/toolkits/bitbucket) - Bitbucket is a Git-based code hosting and collaboration platform for teams. It enables secure repository management and streamlined code reviews.
- [Blazemeter](https://composio.dev/toolkits/blazemeter) - Blazemeter is a continuous testing platform for web and mobile app performance. It empowers teams to automate and analyze large-scale tests with ease.
- [Blocknative](https://composio.dev/toolkits/blocknative) - Blocknative delivers real-time mempool monitoring and transaction management for public blockchains. Instantly track pending transactions and optimize blockchain interactions with live data.

## Frequently Asked Questions

### What are the differences in Tool Router MCP and Scrapingbee MCP?

With a standalone Scrapingbee MCP server, the agents and LLMs can only access a fixed set of Scrapingbee tools tied to that server. However, with the Composio Tool Router, agents can dynamically load tools from Scrapingbee and many other apps based on the task at hand, all through a single MCP endpoint.

### Can I use Tool Router MCP with LangChain?

Yes, you can. LangChain fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Scrapingbee tools.

### Can I manage the permissions and scopes for Scrapingbee while using Tool Router?

Yes, absolutely. You can configure which Scrapingbee scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

### How safe is my data with Composio Tool Router?

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Scrapingbee data and credentials are handled as safely as possible.

---
[See all toolkits](https://composio.dev/toolkits) · [Composio docs](https://docs.composio.dev/llms.txt)
