Diffbot MCP for AI Agents

Securely connect your AI agents and chatbots (Claude, ChatGPT, Cursor, etc.) to Diffbot through MCP or the direct API to extract article content, analyze product listings, enrich structured web data, and automate web research through natural language.
Trusted by AWS, Glean, Zoom, and Airtable.

Supported Tools

  • Combine Entity Profiles: Combine multiple entity profiles into a unified view using the Diffbot Knowledge Graph.
  • Create Bulk Extract Job: Submit a bulk extract job to process multiple URLs with Extract APIs.
  • Create or Update Custom API: Create or update the parameters and ruleset of a Custom API.
  • Create Bulk Enhance Job: Submit a bulk enhance job to enrich multiple entities asynchronously.
  • Delete Custom API: Delete custom API definitions for a given URL pattern.
  • Delete KG Enhance Bulkjob: Delete an Enhance Bulkjob.
  • Download Bulk Job Results: Download results of a bulk enhance job with filtering options via POST request.
  • Enhance Entity with Knowledge Graph: Enrich a person or organization with comprehensive data from the Diffbot Knowledge Graph.
  • Diffbot Extract Job: Extract structured job posting data from job listing pages.
  • Diffbot Extract List: Extract structured data from list-style pages like news indexes, product listings, and directory pages.
  • Get Diffbot Account Details: Retrieve comprehensive Diffbot account information, including subscription plan details, credit balance, usage history, and account status.
  • Diffbot Analyze: Automatically analyze a web page to determine its type and extract structured data.
  • Get Article Data: Extract information from articles, including authors, publication dates, and images.
  • Get Bulk Job Data: Download extracted results from a completed bulk job.
  • Get Bulk Job Status: Poll the status of a specific Diffbot Knowledge Graph Enhance bulk job.
  • Get Bulk Job Results: Download the results of a completed Enhance Bulkjob.
  • Get Bulk Single Result: Download the result of a single job within a Diffbot bulk enhance job.
  • Get Crawl Data: Download extracted results from a completed crawl job.
  • Get Discussion Thread: Extract structured discussion threads from web pages, including forums, comment sections, product reviews, Reddit discussions, and blog comments.
  • Diffbot Get Event: Extract event details from web pages.
  • Diffbot Get Image: Extract detailed information about images, including dimensions and recognition data.
  • Get KG Coverage Report by ID: Download a Knowledge Graph coverage report by report ID.
  • Diffbot Get Product: Extract product information such as specifications, prices, availability, and reviews.
  • Get Video Data: Extract information from videos, including titles, descriptions, and embedded HTML.
  • List Bulk Jobs: List all Bulk jobs associated with a specific token.
  • List Bulk Jobs Status For Token: Get the status of all bulk enhance jobs for a token.
  • List Custom APIs: Retrieve all Custom APIs and their extraction rules currently defined on your Diffbot token.
  • Manage Crawl Job: Pause, restart, delete, or view the status of Diffbot crawl jobs.
  • Resolve Lost ID: Resolve lost IDs in the Knowledge Graph.
  • Diffbot Knowledge Graph Search: Search the Diffbot Knowledge Graph using DQL (Diffbot Query Language).
  • Search Crawl Job Data: Query crawl job collections using DQL (Diffbot Query Language).
  • Start Bulk Job: Start a Bulk Extract job.
  • Start Crawl Job: Initiate a Diffbot crawl job that spiders a website starting from seed URLs and processes discovered pages with a specified Extract API.
  • Stop Bulk Job: Pause (stop) a running Bulk job.
  • Stop KG Bulk Job By ID: Stop an active Knowledge Graph Enhance bulk job by its ID.
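
Several of the tools above (Create Bulk Enhance Job, Get Bulk Job Status, Get Bulk Job Results) imply a submit, poll, download workflow. The following is a minimal sketch of the polling step only, under stated assumptions: the `get_status` callable and the status strings in `TERMINAL_STATES` are hypothetical stand-ins, not Diffbot's actual response fields, and the backoff values are arbitrary.

```python
import time
from typing import Callable

# Hypothetical status values; the real Diffbot status fields may differ.
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}


def wait_for_bulk_job(
    get_status: Callable[[], str],
    max_attempts: int = 10,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it reports a terminal state or attempts run out."""
    delay = base_delay
    for attempt in range(max_attempts):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        # Skip the wait after the final attempt; nothing follows it.
        if attempt < max_attempts - 1:
            sleep(delay)
            # Double the wait each round, capped at max_delay.
            delay = min(delay * 2, max_delay)
    raise TimeoutError(f"job not finished after {max_attempts} polls")
```

In an agent setup, `get_status` would wrap whatever call reaches the Get Bulk Job Status tool. The `sleep` parameter is injectable so the loop can be exercised without real waiting.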

Install Composio

python
pip install composio claude-agent-sdk
Install the Composio SDK and Claude Agent SDK

Create Tool Router Session

python
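from composio import Composio
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions

# Authenticate with your Composio API key.
composio = Composio(api_key='your-composio-api-key')
# Create a Tool Router session for one end user.
session = composio.create(user_id='your-user-id')
# MCP endpoint URL that the agent connects to in the next step.
url = session.mcp.url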
Initialize the Composio client and create a Tool Router session

Connect to AI Agent

python
import asyncio

# `url` is the MCP endpoint from the Tool Router session created earlier.
options = ClaudeAgentOptions(
    # Skips permission prompts; use only in environments you trust.
    permission_mode='bypassPermissions',
    mcp_servers={
        'tool_router': {
            'type': 'http',
            'url': url,
            'headers': {
                'x-api-key': 'your-composio-api-key'
            }
        }
    },
    system_prompt='You are a helpful assistant with access to Diffbot tools.',
    max_turns=10
)

async def main():
    async with ClaudeSDKClient(options=options) as client:
        await client.query('Extract product details from https://www.example.com/product/12345')
        # Stream the response and print only the text blocks.
        async for message in client.receive_response():
            if hasattr(message, 'content'):
                for block in message.content:
                    if hasattr(block, 'text'):
                        print(block.text)

asyncio.run(main())
Use the MCP server with your AI agent
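
The response loop above prints text blocks as they arrive. If you would rather collect them (for logging or tests), the same filtering can be factored into a small helper. This is a sketch, not part of either SDK: `collect_text` is a hypothetical name, and the `SimpleNamespace` objects below are stand-ins that only mimic the `content` and `text` attributes the original loop checks for.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_text(messages: Iterable[object]) -> List[str]:
    """Return the text of every block that has a string `text` attribute."""
    texts: List[str] = []
    for message in messages:
        # Messages without `content` (or with empty content) are skipped.
        for block in getattr(message, "content", None) or []:
            text = getattr(block, "text", None)
            if isinstance(text, str):
                texts.append(text)
    return texts


# Stand-in messages for illustration; real ones come from the agent client.
demo_messages = [
    SimpleNamespace(content=[SimpleNamespace(text="Title: Example product")]),
    SimpleNamespace(content=[SimpleNamespace(tool="no text here")]),
    SimpleNamespace(),  # no content attribute at all
]
print(collect_text(demo_messages))
```

Because the helper relies only on attribute checks, it can be tested without a live session or network access.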

Why Use Composio?

AI-Native Diffbot Integration

  • Supports both Diffbot MCP and direct API-based integrations
  • Structured, LLM-friendly schemas for reliable tool execution
  • Rich coverage for extracting, analyzing, and enriching web data

Managed Auth

  • Built-in API key management with secure storage and rotation
  • Central place to manage, scope, and revoke Diffbot API keys
  • Per-user and per-environment credentials instead of hard-coded keys
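
The snippets earlier on this page use placeholder strings such as 'your-composio-api-key'. One way to avoid hard-coding real values is to read them from the environment, sketched below. The variable names `COMPOSIO_API_KEY` and `COMPOSIO_USER_ID` and the `load_settings` helper are assumptions made for illustration; use whatever names your deployment defines.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed variable names for this sketch; adjust to your deployment.
API_KEY_VAR = "COMPOSIO_API_KEY"
USER_ID_VAR = "COMPOSIO_USER_ID"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read credentials from the environment and fail fast if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in (API_KEY_VAR, USER_ID_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"api_key": env[API_KEY_VAR], "user_id": env[USER_ID_VAR]}
```

The returned values can then replace the placeholders, for example `Composio(api_key=settings['api_key'])`. Setting different values per environment (development, staging, production) keeps each one's credentials separate.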

Agent-Optimized Design

  • Tools are tuned using real error and success rates to improve reliability over time
  • Comprehensive execution logs so you always know what ran, when, and on whose behalf

Enterprise-Grade Security

  • Fine-grained RBAC so you control which agents and users can access Diffbot
  • Scoped, least-privilege access to Diffbot resources
  • Full audit trail of agent actions to support review and compliance

Frequently Asked Questions

Do I need my own developer credentials to use Diffbot with Composio?

Yes. Diffbot requires you to configure your own API key. Once it is set up, Composio stores the credential securely and handles API requests for you.

Can I use multiple toolkits together?

Yes. Composio's Tool Router lets agents use multiple toolkits together.

Is Composio secure?

Composio is SOC 2 and ISO 27001 compliant, with all data encrypted in transit and at rest.

What if the API changes?

Composio maintains and updates all toolkit integrations automatically, so your agents always work with the latest API versions.

Used by agents from

Context
Letta
Glean
HubSpot
Agent.ai
Altera
DataStax
Entelligence
Rolai

Never worry about agent reliability

We handle tool reliability, observability, and security so you never have to second-guess an agent action.