# How to integrate Google BigQuery MCP with LlamaIndex

```json
{
  "title": "How to integrate Google BigQuery MCP with LlamaIndex",
  "toolkit": "Google BigQuery",
  "toolkit_slug": "googlebigquery",
  "framework": "LlamaIndex",
  "framework_slug": "llama-index",
  "url": "https://composio.dev/toolkits/googlebigquery/framework/llama-index",
  "markdown_url": "https://composio.dev/toolkits/googlebigquery/framework/llama-index.md",
  "updated_at": "2026-05-12T10:13:49.095Z"
}
```

## Introduction

This guide walks you through connecting Google BigQuery to LlamaIndex using the Composio Tool Router. By the end, you'll have a working Google BigQuery agent that can run yesterday's sales summary query, find the top 10 customers by revenue, and analyze last quarter's traffic data, all through natural language commands.
Along the way, you'll see how Composio's Google BigQuery MCP server gives your LlamaIndex agent real control over a Google BigQuery account.
Before we dive in, let's take a quick look at the key ideas and tools involved.

## Also integrate Google BigQuery with

- [ChatGPT](https://composio.dev/toolkits/googlebigquery/framework/chatgpt)
- [OpenAI Agents SDK](https://composio.dev/toolkits/googlebigquery/framework/open-ai-agents-sdk)
- [Claude Agent SDK](https://composio.dev/toolkits/googlebigquery/framework/claude-agents-sdk)
- [Claude Code](https://composio.dev/toolkits/googlebigquery/framework/claude-code)
- [Claude Cowork](https://composio.dev/toolkits/googlebigquery/framework/claude-cowork)
- [Codex](https://composio.dev/toolkits/googlebigquery/framework/codex)
- [Cursor](https://composio.dev/toolkits/googlebigquery/framework/cursor)
- [VS Code](https://composio.dev/toolkits/googlebigquery/framework/vscode)
- [OpenCode](https://composio.dev/toolkits/googlebigquery/framework/opencode)
- [OpenClaw](https://composio.dev/toolkits/googlebigquery/framework/openclaw)
- [Hermes](https://composio.dev/toolkits/googlebigquery/framework/hermes-agent)
- [CLI](https://composio.dev/toolkits/googlebigquery/framework/cli)
- [Google ADK](https://composio.dev/toolkits/googlebigquery/framework/google-adk)
- [LangChain](https://composio.dev/toolkits/googlebigquery/framework/langchain)
- [Vercel AI SDK](https://composio.dev/toolkits/googlebigquery/framework/ai-sdk)
- [Mastra AI](https://composio.dev/toolkits/googlebigquery/framework/mastra-ai)
- [CrewAI](https://composio.dev/toolkits/googlebigquery/framework/crew-ai)

## TL;DR

Here's what you'll do:
- Set your OpenAI and Composio API keys
- Install LlamaIndex and Composio packages
- Create a Composio Tool Router session for Google BigQuery
- Connect LlamaIndex to the Google BigQuery MCP server
- Build a Google BigQuery-powered agent using LlamaIndex
- Interact with Google BigQuery through natural language

## What is LlamaIndex?

LlamaIndex is a data framework for building LLM applications. It provides tools for connecting LLMs to external data sources and services through agents and tools.
Key features include:
- ReAct Agent: Reasoning and acting pattern for tool-using agents
- MCP Tools: Native support for Model Context Protocol
- Context Management: Maintain conversation context across interactions
- Async Support: Built for async/await patterns

## What is the Google BigQuery MCP server, and what's possible with it?

The Google BigQuery MCP server is an implementation of the Model Context Protocol that connects AI agents and assistants such as Claude and Cursor directly to your Google BigQuery account. It provides structured, secure access to your data warehouse, so your agent can run SQL queries, analyze datasets, extract insights, and automate reporting on your behalf.
- Instant SQL query execution: Have your agent run complex analytical queries on any of your BigQuery datasets and get results in real time.
- Custom data analysis and reporting: Instruct your agent to generate summaries, trends, or statistics by querying specific tables or views.
- Automated data extraction: Let your agent fetch and transform data for integration with other tools or for further analysis.
- Interactive business intelligence: Enable your agent to answer ad hoc data questions, visualize aggregated data, or pull specific metrics from massive datasets instantly.
- Streamlined workflow automation: Use your agent to automate recurring BigQuery tasks, such as daily audits or data slice generation, without manual effort.
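
To make "natural language in, SQL out" a little more concrete, here is a minimal sketch of the kind of query an agent might generate for a request such as "find the top 10 customers by revenue". The table name `my-project.sales.orders` and the columns `customer_id` and `amount` are hypothetical placeholders, not part of any real schema; the agent would normally discover real identifiers with tools such as `GOOGLEBIGQUERY_LIST_TABLES` and `GOOGLEBIGQUERY_GET_TABLE_SCHEMA` first.

```python
# Hypothetical table and column names; replace them with identifiers from your own project.
TABLE = "my-project.sales.orders"


def top_customers_sql(table: str, limit: int = 10) -> str:
    """Build a BigQuery SQL query that ranks customers by total revenue."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Backticks quote the fully qualified project.dataset.table name.
    return (
        "SELECT customer_id, SUM(amount) AS revenue\n"
        f"FROM `{table}`\n"
        "GROUP BY customer_id\n"
        "ORDER BY revenue DESC\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(top_customers_sql(TABLE))
```

In practice the agent writes this SQL itself; the sketch only illustrates the shape of the query that ends up being sent to BigQuery.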

## Supported Tools

| Tool slug | Name | Description |
|---|---|---|
| `GOOGLEBIGQUERY_CANCEL_JOB` | Cancel BigQuery Job | Tool to cancel a running BigQuery job. This call returns immediately, and you need to poll for the job status to see if the cancel completed successfully. Note that cancelled jobs may still incur costs. |
| `GOOGLEBIGQUERY_CREATE_CAPACITY_COMMITMENT` | Create Capacity Commitment | Tool to create a new capacity commitment resource in BigQuery Reservation. Use when you need to purchase compute capacity (slots) with a committed period of usage for BigQuery jobs. Supports various commitment plans (FLEX, MONTHLY, ANNUAL, THREE_YEAR) and editions (STANDARD, ENTERPRISE, ENTERPRISE_PLUS). |
| `GOOGLEBIGQUERY_CREATE_CONNECTION` | Create BigQuery Connection | Tool to create a new BigQuery connection to external data sources using the BigQuery Connection API. Use when setting up connections to AWS, Azure, Cloud Spanner, Cloud SQL, Salesforce DataCloud, or Apache Spark. |
| `GOOGLEBIGQUERY_CREATE_DATA_EXCHANGE` | Create Analytics Hub Data Exchange | Tool to create a new Analytics Hub data exchange for sharing BigQuery datasets. Use when you need to set up a container for data sharing with descriptive information and listings. |
| `GOOGLEBIGQUERY_CREATE_DATAEXCHANGES_LISTINGS` | Create Analytics Hub Listing | Tool to create a new listing in a BigQuery Analytics Hub data exchange. Use when you need to share a BigQuery dataset with specific subscribers or make it available for discovery. The dataset must exist and be in the same region as the data exchange. |
| `GOOGLEBIGQUERY_CREATE_DATASET` | Create BigQuery Dataset | Tool to create a new BigQuery dataset with explicit location, labels, and description using the BigQuery Datasets API. Use when the workflow needs to set up a staging/warehouse dataset and correctness of region is critical to avoid downstream job location mismatches. Surfaces 409 Already Exists errors cleanly without retrying. |
| `GOOGLEBIGQUERY_CREATE_LISTING` | Create Analytics Hub Listing | Tool to create a new listing in a data exchange using Analytics Hub API. Use when publishing a BigQuery dataset to make it available for subscription by other users or organizations. |
| `GOOGLEBIGQUERY_CREATE_LOCATIONS_DATAPOLICIES` | Create BigQuery Data Policy (v2beta1) | Tool to create a new data policy under a project with specified location using the v2beta1 BigQuery Data Policy API. Use when you need to set up data masking rules or column-level security for sensitive data. The v2beta1 endpoint uses a nested request structure. |
| `GOOGLEBIGQUERY_CREATE_QUERY_TEMPLATE` | Create Analytics Hub Query Template | Tool to create a new query template in a BigQuery Analytics Hub Data Clean Room (DCR) data exchange. Use when you need to define predefined and approved queries for data clean room use cases. Query templates must be created in DCR data exchanges only. |
| `GOOGLEBIGQUERY_CREATE_RESERVATION` | Create BigQuery Reservation | Tool to create a new BigQuery reservation resource to guarantee compute capacity (slots) for query and pipeline jobs. Use when you need to reserve dedicated compute resources for predictable performance and cost management. Reservations can be configured with autoscaling, concurrency limits, and edition-based features. |
| `GOOGLEBIGQUERY_CREATE_RESERVATION_ASSIGNMENT` | Create BigQuery Reservation Assignment | Tool to create a BigQuery reservation assignment that allows a project, folder, or organization to submit jobs using slots from a specified reservation. Use when setting up resource allocation for BigQuery workloads. Note: A resource can only have one assignment per (job_type, location) combination. |
| `GOOGLEBIGQUERY_CREATE_ROUTINE` | Create BigQuery Routine | Tool to create a new user-defined routine (function or procedure) in a BigQuery dataset. Use when you need to define SQL, JavaScript, Python, Java, or Scala functions/procedures for reusable logic, data transformations, or custom masking. Supports scalar functions, table-valued functions, procedures, and aggregate functions with comprehensive type definitions. |
| `GOOGLEBIGQUERY_CREATE_TABLE` | Create BigQuery Table | Tool to create a new, empty table in a BigQuery dataset. Use when setting up data infrastructure for standard tables, external tables, views, or materialized views. Supports partitioning, clustering, and encryption configuration. |
| `GOOGLEBIGQUERY_DELETE_DATASET` | Delete BigQuery Dataset | Tool to delete a BigQuery dataset specified by datasetId via the datasets.delete API. Before deletion, you must delete all tables unless deleteContents=True is specified. Use when cleaning up test datasets or removing unused data warehouses. Immediately after deletion, you can create another dataset with the same name. |
| `GOOGLEBIGQUERY_DELETE_JOB_METADATA` | Delete BigQuery Job Metadata | Tool to delete the metadata of a BigQuery job. Use when you need to remove job metadata from the system. If this is a parent job with child jobs, metadata from all child jobs will be deleted as well. |
| `GOOGLEBIGQUERY_DELETE_MODEL` | Delete BigQuery ML Model | Tool to delete a BigQuery ML model from a dataset. Use when you need to remove a trained machine learning model permanently. The operation deletes the model and cannot be undone. |
| `GOOGLEBIGQUERY_DELETE_ROUTINE` | Delete BigQuery Routine | Tool to delete a BigQuery routine by its ID. Use when you need to remove a stored procedure, user-defined function, or table function from a dataset. This operation is irreversible. |
| `GOOGLEBIGQUERY_DELETE_TABLE` | Delete BigQuery Table | Tool to delete a BigQuery table from a dataset. Use when you need to remove a table and all its data permanently. The operation deletes all data in the table and cannot be undone. |
| `GOOGLEBIGQUERY_GET_BIGQUERY_MODEL` | Get BigQuery ML Model | Tool to retrieve a specific BigQuery ML model resource by model ID. Use when you need detailed information about a trained machine learning model including its configuration, training runs, hyperparameters, and evaluation metrics. |
| `GOOGLEBIGQUERY_GET_CONNECTION_IAM_POLICY` | Get BigQuery Connection IAM Policy | Tool to get the IAM access control policy for a BigQuery connection resource. Returns an empty policy if the resource exists but has no policy set. Use this to check who has access to a specific connection before modifying permissions. |
| `GOOGLEBIGQUERY_GET_DATASET` | Get BigQuery Dataset Metadata | Tool to retrieve BigQuery dataset metadata including location via the datasets.get API. Use this before creating jobs/queries if the workflow has been failing with location mismatch to confirm the dataset's region and correct the job location accordingly. |
| `GOOGLEBIGQUERY_GET_JOB` | Get BigQuery Job | Tool to retrieve information about a specific BigQuery job. Returns job configuration, status, and statistics. Use this to check job status after running queries or to get details about job execution. |
| `GOOGLEBIGQUERY_GET_QUERY_RESULTS` | Get BigQuery Query Results | Tool to get the results of a BigQuery query job via RPC. Use this to retrieve results after running a query, or to check job completion status and fetch paginated results. |
| `GOOGLEBIGQUERY_GET_ROUTINE` | Get BigQuery Routine | Tool to retrieve a BigQuery routine (user-defined function or stored procedure) by its ID. Use to inspect routine definitions, arguments, return types, and metadata. |
| `GOOGLEBIGQUERY_GET_ROUTINE_IAM_POLICY` | Get BigQuery Routine IAM Policy | Tool to retrieve the IAM access control policy for a BigQuery routine resource. Returns an empty policy if the routine exists but has no policy set. Use this to check current access permissions before modifying them. |
| `GOOGLEBIGQUERY_GET_SERVICE_ACCOUNT` | Get BigQuery Service Account | Tool to get the service account for a project used for interactions with Google Cloud KMS. Use when you need to retrieve the BigQuery service account email for KMS encryption configuration or key access permissions. |
| `GOOGLEBIGQUERY_GET_TABLE_IAM_POLICY` | Get BigQuery Table IAM Policy | Tool to retrieve the IAM access control policy for a BigQuery table resource. Returns an empty policy if the resource exists but has no policy set. Use this to check current access permissions before modifying them. |
| `GOOGLEBIGQUERY_GET_TABLE_SCHEMA` | Get BigQuery Table Schema | Tool to fetch a BigQuery table's schema and metadata without querying row data. Use before generating SQL queries to avoid column name typos and confirm field types and nullable modes. This is especially useful when INFORMATION_SCHEMA access is restricted. |
| `GOOGLEBIGQUERY_INSERT_ALL` | Insert Data into BigQuery Table | Tool to stream data into BigQuery one record at a time without running a load job. Use when you need immediate data availability or inserting small batches. Supports row-level deduplication via insertId and error handling via skipInvalidRows. |
| `GOOGLEBIGQUERY_INSERT_JOB` | Insert BigQuery Job | Tool to start a new asynchronous BigQuery job (query, load, extract, or copy). Use when you need to run a query as a job, load data from Cloud Storage, extract table data to GCS, or copy tables. For dry-run validation without execution, set dryRun to true in configuration. |
| `GOOGLEBIGQUERY_INSERT_JOB_WITH_UPLOAD` | Insert BigQuery Job with Upload | Tool to start a new BigQuery load job with file upload. Uploads a file (CSV, JSON, etc.) and loads it into a BigQuery table in a single operation. Use when you need to upload data from a local file directly to BigQuery rather than loading from Cloud Storage. |
| `GOOGLEBIGQUERY_LIST_ANALYTICS_HUB_LISTINGS` | List Analytics Hub Listings | Tool to list all listings in a given Analytics Hub data exchange. Use when you need to discover available data listings within a specific data exchange that can be subscribed to for data sharing. |
| `GOOGLEBIGQUERY_LIST_BIG_QUERY_CONNECTIONS` | List BigQuery Connections | Tool to list BigQuery connections in a given project and location. Use when you need to discover available external data source connections (Cloud SQL, AWS, Azure, Spark, etc.) configured for BigQuery. |
| `GOOGLEBIGQUERY_LIST_CAPACITY_COMMITMENTS` | List BigQuery Capacity Commitments | Tool to list all capacity commitments for the admin project. Use when you need to view purchased compute capacity slots and their commitment details (plan, state, duration). |
| `GOOGLEBIGQUERY_LIST_DATAEXCHANGES_LISTINGS` | List Data Exchange Listings | Tool to list all listings in a given Analytics Hub data exchange using the v1beta1 API. Use when you need to discover available data listings within a specific data exchange that can be subscribed to for data sharing. |
| `GOOGLEBIGQUERY_LIST_DATASETS` | List BigQuery Datasets | Tool to list datasets in a specific BigQuery project, including dataset locations. Use after identifying an accessible project to discover available datasets and their locations before querying. The dataset location is critical for avoiding location-related query/job errors. |
| `GOOGLEBIGQUERY_LIST_JOBS` | List BigQuery Jobs | Tool to list all jobs that you started in a BigQuery project. Job information is available for a six month period after creation. Jobs are sorted in reverse chronological order by creation time. Use to monitor query execution, track job statuses, and retrieve job history. |
| `GOOGLEBIGQUERY_LIST_LOCATIONS` | List BigQuery Data Transfer Locations | Tool to list information about supported locations for BigQuery Data Transfer Service. Use when you need to discover available regions/locations where BigQuery Data Transfer operations can be performed. |
| `GOOGLEBIGQUERY_LIST_LOCATIONS_CONNECTIONS` | List Connections in Location | Tool to list BigQuery connections in a given project and location using the v1beta1 API. Use when you need to discover available external data source connections (Cloud SQL, AWS, Azure, Spark, etc.) configured for BigQuery in a specific location. |
| `GOOGLEBIGQUERY_LIST_LOCATIONS_DATAPOLICIES` | List BigQuery Location Data Policies | Tool to list all data policies in a specified parent project and location using the v2beta1 API. Use when you need to discover data masking policies and column-level security policies configured for BigQuery datasets. |
| `GOOGLEBIGQUERY_LIST_MODELS` | List BigQuery Models | Tool to list all BigQuery ML models in a specified dataset. Requires READER dataset role. Use this to discover available models before getting detailed information via models.get method. |
| `GOOGLEBIGQUERY_LIST_ORGANIZATION_DATA_EXCHANGES` | List Organization Data Exchanges | Tool to list all data exchanges from projects in a given organization and location using Analytics Hub API. Use when you need to discover available data exchanges within an organization that can be used for data sharing. |
| `GOOGLEBIGQUERY_LIST_PROJECTS` | List BigQuery Projects | Tool to list BigQuery projects to which the user has been granted any project role. Returns projects with at least READ access. For enhanced capabilities, consider using the Resource Manager API. |
| `GOOGLEBIGQUERY_LIST_QUERY_TEMPLATES` | List Analytics Hub Query Templates | Tool to list all query templates in a given Analytics Hub data exchange. Use when you need to discover available query templates that define predefined and approved queries for data clean room use cases. |
| `GOOGLEBIGQUERY_LIST_RESERVATION_ASSIGNMENTS` | List BigQuery Reservation Assignments | Tool to list BigQuery reservation assignments. Only explicitly created assignments will be returned (no expansion or merge happens). Use wildcard "-" in parent path to list assignments across all reservations in a location. |
| `GOOGLEBIGQUERY_LIST_RESERVATION_GROUPS` | List BigQuery Reservation Groups | Tool to list all BigQuery reservation groups for a project in a specified location. Use when you need to discover available reservation groups which serve as containers for reservations. |
| `GOOGLEBIGQUERY_LIST_RESERVATIONS` | List BigQuery Reservations | Tool to list all BigQuery reservations for a project in a specified location. Use when you need to discover available reservations or view reservation details including slot capacity and autoscale configuration. |
| `GOOGLEBIGQUERY_LIST_ROUTINES` | List BigQuery Routines | Tool to list all routines (user-defined functions and stored procedures) in a BigQuery dataset. Requires the READER dataset role. Use this to discover available routines before executing or inspecting them. |
| `GOOGLEBIGQUERY_LIST_ROW_ACCESS_POLICIES` | List BigQuery Row Access Policies | Tool to list all row access policies on a specified BigQuery table. Use when you need to discover which row-level security policies are applied to a table and their filter predicates. |
| `GOOGLEBIGQUERY_LIST_TABLE_DATA` | List BigQuery Table Data | Tool to list the content of a BigQuery table in rows via the REST API. Use this to retrieve actual data from a table without writing SQL queries. Returns paginated results with row data in the native BigQuery format. |
| `GOOGLEBIGQUERY_LIST_TABLES` | List BigQuery Tables | Tool to list tables in a BigQuery dataset via the REST API. Use this early in exploration to discover accessible tables without relying on INFORMATION_SCHEMA, especially when SQL-based metadata queries are blocked or restricted. This provides a deterministic inventory of tables even when dataset-level permissions prevent INFORMATION_SCHEMA access. |
| `GOOGLEBIGQUERY_PATCH_DATASET` | Patch BigQuery Dataset | Tool to update an existing BigQuery dataset using RFC5789 PATCH semantics. Only replaces fields provided in the request, leaving other fields unchanged. Use when you need to modify dataset properties like description, labels, expiration settings, or access controls without affecting other configuration. |
| `GOOGLEBIGQUERY_PATCH_MODEL` | Patch BigQuery ML Model | Tool to update specific fields in an existing BigQuery ML model using PATCH semantics. Use when you need to modify model metadata like description, friendly name, labels, or expiration time without replacing the entire model resource. |
| `GOOGLEBIGQUERY_PATCH_TABLE` | Patch BigQuery Table | Tool to update specific fields in an existing BigQuery table using RFC5789 PATCH semantics. Only the fields provided in the request are updated; unspecified fields remain unchanged. Use when you need to modify table metadata like description, friendly name, labels, or expiration time without replacing the entire table resource. |
| `GOOGLEBIGQUERY_QUERY` | Query | Query Tool runs a SQL query in BigQuery using the REST API. Use proper BigQuery SQL syntax, e.g., SELECT * FROM `project.dataset.table` WHERE column_name = 'value'. Results are returned under data.rows; an empty rows array means no matching data. Large result sets may be returned via remote_file_info instead of inline rows. Verify exact project_id, dataset, table, and column names before running; wrong identifiers trigger invalidQuery or notFound errors. |
| `GOOGLEBIGQUERY_SEARCH_ALL_ASSIGNMENTS` | Search All BigQuery Reservation Assignments | Tool to search all BigQuery reservation assignments for a specified resource in a particular region. Use when you need to find assignments for a project, folder, or organization. Returns assignments created on the resource or its closest ancestor, covering all JobTypes. |
| `GOOGLEBIGQUERY_SET_ROUTINE_IAM_POLICY` | Set BigQuery Routine IAM Policy | Tool to set the IAM access control policy for a BigQuery routine resource. Use this to grant or modify access permissions for users, service accounts, or groups. Include the etag from getIamPolicy to prevent concurrent modifications. |
| `GOOGLEBIGQUERY_TEST_ROUTINE_IAM_PERMISSIONS` | Test BigQuery Routine IAM Permissions | Tool to test which IAM permissions the caller has on a BigQuery routine. Returns the subset of requested permissions that the caller actually has. Use to verify access before performing operations. |
| `GOOGLEBIGQUERY_UNDELETE_DATASET` | Undelete BigQuery Dataset | Tool to undelete a BigQuery dataset within the time travel window. If a deletion time is specified, the dataset version deleted at that time is undeleted; otherwise, the most recently deleted version is restored. |
| `GOOGLEBIGQUERY_UPDATE_CONNECTION` | Update BigQuery Connection | Tool to update a specified BigQuery connection using the BigQuery Connection API. Use when you need to modify connection properties such as friendly name, description, or connection-specific settings. For security reasons, credentials are automatically reset if connection properties are included in the update mask. |
| `GOOGLEBIGQUERY_UPDATE_DATASET` | Update BigQuery Dataset | Tool to update information in an existing BigQuery dataset using the PUT method. The update method replaces the entire dataset resource, whereas the patch method only replaces fields that are provided in the submitted dataset resource. Use when you need to modify dataset properties like description, access controls, or default settings. |
| `GOOGLEBIGQUERY_UPDATE_ROUTINE` | Update BigQuery Routine | Tool to update an existing BigQuery routine (function or stored procedure). This replaces the entire routine resource with the provided definition. Use when modifying routine logic, arguments, return types, or other configuration. Ensure all required fields are provided as this is a full replacement operation. |
| `GOOGLEBIGQUERY_UPDATE_TABLE` | Update BigQuery Table | Tool to update an existing BigQuery table. The update method replaces the entire Table resource, whereas the patch method only replaces fields that are provided. Use when you need to modify table properties like schema, description, labels, partitioning, or clustering configuration. |
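
Several of the tools above are irreversible (for example `GOOGLEBIGQUERY_DELETE_TABLE`), so you may want to review what the agent is allowed to call. The sketch below groups tool slugs by the verb that follows the `GOOGLEBIGQUERY_` prefix. The read/write/destructive grouping is an illustrative assumption based only on the slug names in the table, not a Composio feature, and `classify_tool` is a hypothetical helper.

```python
# Verbs taken from the slugs in the table above. The grouping itself is an
# assumption made for illustration; it is not defined by Composio or BigQuery.
TOOLKIT_PREFIX = "GOOGLEBIGQUERY_"
READ_ONLY_VERBS = {"GET", "LIST", "SEARCH", "TEST"}
DESTRUCTIVE_VERBS = {"DELETE", "CANCEL"}


def classify_tool(slug: str) -> str:
    """Return 'read', 'destructive', or 'write' based on the verb in a tool slug."""
    if not slug.startswith(TOOLKIT_PREFIX):
        raise ValueError(f"not a Google BigQuery tool slug: {slug}")
    # The verb is the first underscore-separated word after the toolkit prefix.
    verb = slug[len(TOOLKIT_PREFIX):].split("_", 1)[0]
    if verb in READ_ONLY_VERBS:
        return "read"
    if verb in DESTRUCTIVE_VERBS:
        return "destructive"
    # Everything else (CREATE, INSERT, PATCH, UPDATE, QUERY, ...) is treated as a write.
    return "write"


if __name__ == "__main__":
    for slug in ("GOOGLEBIGQUERY_LIST_TABLES", "GOOGLEBIGQUERY_DELETE_TABLE", "GOOGLEBIGQUERY_QUERY"):
        print(slug, "->", classify_tool(slug))
```

Note that `GOOGLEBIGQUERY_QUERY` falls into the conservative "write" group here, since a SQL statement can modify data as well as read it.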

## Supported Triggers

None listed.

## Creating MCP Server - Stand-alone vs Composio SDK

The Google BigQuery MCP server is an implementation of the Model Context Protocol that connects your AI agent to Google BigQuery. It provides structured and secure access so your agent can perform Google BigQuery operations on your behalf through a secure, permission-based interface.
With Composio's managed implementation, you don't have to create your own developer app, which makes it quick to prototype and get a first version working. For production, if you're building an end product, we recommend using your own credentials.

## Step-by-step Guide

### Prerequisites

Before you begin, make sure you have:
- Python 3.8 or higher, or Node.js 16 or higher
- A Composio account and API key
- An OpenAI API key
- A Google BigQuery account and project
- Basic familiarity with async Python or TypeScript

### 1. Getting API Keys for OpenAI, Composio, and Google BigQuery

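You need an OpenAI API key and a Composio API key, plus a Composio user ID that identifies the session used for Google BigQuery access. This guide does not set a separate Google BigQuery key in the environment; access is tied to your Composio user ID.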

### 2. Installing dependencies

```bash
pip install composio-llamaindex llama-index llama-index-llms-openai llama-index-tools-mcp python-dotenv
```

```bash
npm install @composio/core @composio/llamaindex @llamaindex/openai @llamaindex/tools @llamaindex/workflow dotenv
```
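
If you want to confirm that the Python install worked before going further, a quick check like the following can help. It is a minimal sketch that uses only the standard library; the module names are the top-level import names used later in this guide, and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec
from typing import List, Sequence

# Top-level import names used by the Python code in this guide.
REQUIRED_MODULES = ["composio", "composio_llamaindex", "llama_index", "dotenv"]


def missing_modules(names: Sequence[str]) -> List[str]:
    """Return the module names that the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules were found.")
```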

### 3. Set environment variables

Create a `.env` file in your project root with the three variables shown below.
These credentials are used to:
- Authenticate with OpenAI's GPT-5 model
- Connect to Composio's Tool Router
- Identify your Composio user session for Google BigQuery access
```bash
OPENAI_API_KEY=your-openai-api-key
COMPOSIO_API_KEY=your-composio-api-key
COMPOSIO_USER_ID=your-user-id
```

### 4. Import modules

```python
import asyncio
import os
import signal  # used by the Ctrl+C handler in the main entry point
import dotenv

from composio import Composio
from composio_llamaindex import LlamaIndexProvider
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

dotenv.load_dotenv()
```

```typescript
import "dotenv/config";
import readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

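import { Composio } from "@composio/core";
import { LlamaindexProvider } from "@composio/llamaindex";

import { mcp } from "@llamaindex/tools";
import { agent as createAgent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";

// The `import "dotenv/config"` line above already loads the .env file.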
```

### 5. Load environment variables and initialize Composio

```python
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set in the environment")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set in the environment")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set in the environment")
```

```typescript
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const COMPOSIO_API_KEY = process.env.COMPOSIO_API_KEY;
const COMPOSIO_USER_ID = process.env.COMPOSIO_USER_ID;

if (!OPENAI_API_KEY) throw new Error("OPENAI_API_KEY is not set");
if (!COMPOSIO_API_KEY) throw new Error("COMPOSIO_API_KEY is not set");
if (!COMPOSIO_USER_ID) throw new Error("COMPOSIO_USER_ID is not set");
```
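
The checks above stop at the first missing variable. As a variation, the sketch below reports every missing variable in one error, which can save a round trip when setting up a new machine. `require_env` is a hypothetical helper, not part of Composio or LlamaIndex.

```python
import os
from typing import Dict, Iterable, Mapping

REQUIRED_VARS = ("OPENAI_API_KEY", "COMPOSIO_API_KEY", "COMPOSIO_USER_ID")


def require_env(names: Iterable[str], env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the requested variables, or raise one error listing every missing name."""
    names = list(names)
    # Treat unset and empty values the same way, as the checks above do.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


if __name__ == "__main__":
    try:
        settings = require_env(REQUIRED_VARS)
        print("Loaded:", ", ".join(settings))
    except ValueError as exc:
        print(exc)
```

Passing the environment as a parameter keeps the helper easy to test with a plain dictionary.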

### 6. Create a Tool Router session and build the agent function

What's happening here:
- We create a Composio client using your API key and configure it with the LlamaIndex provider
- We then create a Tool Router MCP session for your user, specifying the toolkits to use (here, `googlebigquery`)
- The session returns an MCP HTTP endpoint URL that acts as a gateway to all your configured tools
- LlamaIndex connects to this endpoint to dynamically discover the available Google BigQuery tools
- The MCP tools are converted into LlamaIndex-compatible tools and passed to the agent
```python
async def build_agent() -> ReActAgent:
    composio_client = Composio(
        api_key=COMPOSIO_API_KEY,
        provider=LlamaIndexProvider(),
    )

    session = composio_client.create(
        user_id=COMPOSIO_USER_ID,
        toolkits=["googlebigquery"],
    )

    mcp_url = session.mcp.url
    print(f"Composio MCP URL: {mcp_url}")

    mcp_client = BasicMCPClient(mcp_url, headers={"x-api-key": COMPOSIO_API_KEY})
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-5")

    description = "An agent that uses Composio Tool Router MCP tools to perform Google BigQuery actions."
    system_prompt = """
    You are a helpful assistant connected to Composio Tool Router.
    Use the available tools to answer user queries and perform Google BigQuery actions.
    """
    return ReActAgent(tools=tools, llm=llm, description=description, system_prompt=system_prompt, verbose=True)
```

```typescript
async function buildAgent() {

  console.log(`Initializing Composio client for user ${COMPOSIO_USER_ID!}...`);

  const composio = new Composio({
    apiKey: COMPOSIO_API_KEY,
    provider: new LlamaindexProvider(),
  });

  const session = await composio.create(
    COMPOSIO_USER_ID!,
    {
      toolkits: ["googlebigquery"],
    },
  );

  const mcpUrl = session.mcp.url;
  console.log(`Composio Tool Router MCP URL: ${mcpUrl}`);

  const server = mcp({
    url: mcpUrl,
    clientName: "composio_tool_router_with_llamaindex",
    requestInit: {
      headers: {
        "x-api-key": COMPOSIO_API_KEY!,
      },
    },
    // verbose: true,
  });

  const tools = await server.tools();

  const llm = openai({ apiKey: OPENAI_API_KEY, model: "gpt-5" });

  const agent = createAgent({
    name: "composio_tool_router_with_llamaindex",
    description: "An agent that uses Composio Tool Router MCP tools to perform actions.",
    systemPrompt:
      "You are a helpful assistant connected to Composio Tool Router. " +
      "Use the available tools to answer user queries and perform Google BigQuery actions.",
    llm,
    tools,
  });

  return agent;
}
```
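
For context on what "discovering tools" means at the protocol level: MCP messages are JSON-RPC 2.0 requests, with `tools/list` used to enumerate tools and `tools/call` used to invoke one. The MCP client used above handles this exchange for you, so the sketch below is only an illustration of the message shapes. It omits the initialization handshake and the HTTP transport, and the tool name `EXAMPLE_TOOL` and its `query` argument are placeholders; the real names and schemas are whatever the endpoint advertises.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

# JSON-RPC request ids only need to be unique within a session.
_ids = count(1)


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope of the kind an MCP client sends."""
    request: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool. The tool name and argument key are placeholders.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "EXAMPLE_TOOL", "arguments": {"query": "SELECT 1"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```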

### 7. Create an interactive chat loop

```python
async def chat_loop(agent: ReActAgent) -> None:
    ctx = Context(agent)
    print("Type 'quit', 'exit', or Ctrl+C to stop.")

    while True:
        try:
            user_input = input("\nYou: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nBye!")
            break

        if not user_input or user_input.lower() in {"quit", "exit"}:
            print("Bye!")
            break

        try:
            print("Agent: ", end="", flush=True)
            handler = agent.run(user_input, ctx=ctx)

            async for event in handler.stream_events():
                # Stream token-by-token from LLM responses
                if hasattr(event, "delta") and event.delta:
                    print(event.delta, end="", flush=True)
                # Show tool calls as they happen
                elif hasattr(event, "tool_name"):
                    print(f"\n[Using tool: {event.tool_name}]", flush=True)

            # Get final response
            response = await handler
            print()  # Newline after streaming
        except KeyboardInterrupt:
            print("\n[Interrupted]")
            continue
        except Exception as e:
            print(f"\nError: {e}")
```

```typescript
async function chatLoop(agent: ReturnType<typeof createAgent>) {
  const rl = readline.createInterface({ input, output });

  console.log("Type 'quit' or 'exit' to stop.");

  while (true) {
    let userInput: string;

    try {
      userInput = (await rl.question("\nYou: ")).trim();
    } catch {
      console.log("\nAgent: Bye!");
      break;
    }

    if (!userInput) {
      continue;
    }

    const lower = userInput.toLowerCase();
    if (lower === "quit" || lower === "exit") {
      console.log("Agent: Bye!");
      break;
    }

    try {
      process.stdout.write("Agent: ");

      const stream = agent.runStream(userInput);
      let finalResult: any = null;

      for await (const event of stream) {
        // The event.data contains the streamed content
        const data: any = event.data;

        // Check for streaming delta content
        if (data?.delta) {
          process.stdout.write(data.delta);
        }

        // Store final result for fallback
        if (data?.result || data?.message) {
          finalResult = data;
        }
      }

      // If no streaming happened, show the final result
      if (finalResult) {
        const answer =
          finalResult.result ??
          finalResult.message?.content ??
          finalResult.message ??
          "";
        if (answer && typeof answer === "string" && !answer.includes("[object")) {
          process.stdout.write(answer);
        }
      }

      console.log(); // New line after streaming completes
    } catch (err: any) {
      console.error("\nAgent error:", err?.message ?? err);
    }
  }

  rl.close();
}
```
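
The event handling inside the Python loop can be pulled out into a small pure function, which makes the streaming logic easy to check without calling an LLM. The sketch below is one way to do that, not part of the original script: `render_event` is a hypothetical helper name, the fake events are plain `SimpleNamespace` objects standing in for real LlamaIndex events, and `run_query` is a made-up tool name used only for illustration.

```python
from types import SimpleNamespace
from typing import Any, Optional


def render_event(event: Any) -> Optional[str]:
    """Return the text the chat loop would print for one streamed event."""
    # Same checks as the loop above: token deltas first, then tool calls.
    if hasattr(event, "delta") and event.delta:
        return event.delta
    if hasattr(event, "tool_name"):
        return f"\n[Using tool: {event.tool_name}]"
    # Any other event type is ignored by the loop.
    return None


# Fake events that only mimic the attributes the loop looks at.
fake_events = [
    SimpleNamespace(tool_name="run_query"),  # hypothetical tool name
    SimpleNamespace(delta="Total "),
    SimpleNamespace(delta="rows: 42"),
    SimpleNamespace(),  # an event the loop would skip
]

rendered = [text for text in map(render_event, fake_events) if text is not None]
print("".join(rendered))
```

Inside `chat_loop`, the body of the `async for` could then shrink to a single call to this helper followed by a `print(..., end="", flush=True)` when the result is not `None`.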

### 8. Define the main entry point

What's happening here:
- We're orchestrating the entire application flow
- The agent gets built with proper error handling
- Then we kick off the interactive chat loop so you can start talking to Google BigQuery
```python
async def main() -> None:
    agent = await build_agent()
    await chat_loop(agent)

if __name__ == "__main__":
    # Handle Ctrl+C gracefully
    signal.signal(signal.SIGINT, lambda s, f: (print("\nBye!"), exit(0)))
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nBye!")
```

```typescript
async function main() {
  try {
    const agent = await buildAgent();
    await chatLoop(agent);
  } catch (err) {
    console.error("Failed to start agent:", err);
    process.exit(1);
  }
}

main();
```

### 9. Run the agent

When prompted, authenticate and authorize your agent with Google BigQuery, then start asking questions.
```bash
python llamaindex_agent.py
```

```bash
npx ts-node llamaindex-agent.ts
```
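
The script raises on the first missing environment variable, so a fresh setup can take several runs to get right. As a rough sketch, a preflight check like the one below reports every missing variable in one pass. The variable names come from the script itself; `missing_env_vars` and `preflight_message` are hypothetical helper names introduced here for illustration.

```python
import os
from typing import List, Mapping

# The three variables the complete script checks at startup.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "COMPOSIO_API_KEY", "COMPOSIO_USER_ID")


def missing_env_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def preflight_message(env: Mapping[str, str]) -> str:
    """Summarize the environment check as a single line of text."""
    missing = missing_env_vars(env)
    if missing:
        return "Missing environment variables: " + ", ".join(missing)
    return "All required environment variables are set."


# Check the real environment before launching the agent.
print(preflight_message(os.environ))
```

Running this before `python llamaindex_agent.py` shows at a glance which entries still need to be added to your `.env` file.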

## Complete Code

```python
import asyncio
import os
import signal
import dotenv

from composio import Composio
from composio_llamaindex import LlamaIndexProvider
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

dotenv.load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set")

async def build_agent() -> ReActAgent:
    composio_client = Composio(
        api_key=COMPOSIO_API_KEY,
        provider=LlamaIndexProvider(),
    )

    session = composio_client.create(
        user_id=COMPOSIO_USER_ID,
        toolkits=["googlebigquery"],
    )

    mcp_url = session.mcp.url
    print(f"Composio MCP URL: {mcp_url}")

    mcp_client = BasicMCPClient(mcp_url, headers={"x-api-key": COMPOSIO_API_KEY})
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-5")
    description = "An agent that uses Composio Tool Router MCP tools to perform Google BigQuery actions."
    system_prompt = """
    You are a helpful assistant connected to Composio Tool Router.
    Use the available tools to answer user queries and perform Google BigQuery actions.
    """
    return ReActAgent(
        tools=tools,
        llm=llm,
        description=description,
        system_prompt=system_prompt,
        verbose=True,
    )

async def chat_loop(agent: ReActAgent) -> None:
    ctx = Context(agent)
    print("Type 'quit', 'exit', or Ctrl+C to stop.")

    while True:
        try:
            user_input = input("\nYou: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nBye!")
            break

        if not user_input or user_input.lower() in {"quit", "exit"}:
            print("Bye!")
            break

        try:
            print("Agent: ", end="", flush=True)
            handler = agent.run(user_input, ctx=ctx)

            async for event in handler.stream_events():
                # Stream token-by-token from LLM responses
                if hasattr(event, "delta") and event.delta:
                    print(event.delta, end="", flush=True)
                # Show tool calls as they happen
                elif hasattr(event, "tool_name"):
                    print(f"\n[Using tool: {event.tool_name}]", flush=True)

            # Get final response
            response = await handler
            print()  # Newline after streaming
        except KeyboardInterrupt:
            print("\n[Interrupted]")
            continue
        except Exception as e:
            print(f"\nError: {e}")

async def main() -> None:
    agent = await build_agent()
    await chat_loop(agent)

if __name__ == "__main__":
    # Handle Ctrl+C gracefully
    signal.signal(signal.SIGINT, lambda s, f: (print("\nBye!"), exit(0)))
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nBye!")
```

```typescript
import "dotenv/config";
import readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

import { Composio } from "@composio/core";
import { LlamaindexProvider } from "@composio/llamaindex";

import { mcp } from "@llamaindex/tools";
import { agent as createAgent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";

// The "dotenv/config" import above already loads .env, so no dotenv.config() call is needed.

const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const COMPOSIO_API_KEY = process.env.COMPOSIO_API_KEY;
const COMPOSIO_USER_ID = process.env.COMPOSIO_USER_ID;

if (!OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set in the environment");
}
if (!COMPOSIO_API_KEY) {
  throw new Error("COMPOSIO_API_KEY is not set in the environment");
}
if (!COMPOSIO_USER_ID) {
  throw new Error("COMPOSIO_USER_ID is not set in the environment");
}

async function buildAgent() {

  console.log("Initializing Composio client...");
  console.log(`COMPOSIO_USER_ID: ${COMPOSIO_USER_ID}`);

  const composio = new Composio({
    apiKey: COMPOSIO_API_KEY,
    provider: new LlamaindexProvider(),
  });

  const session = await composio.create(
    COMPOSIO_USER_ID!,
    {
      toolkits: ["googlebigquery"],
    },
  );

  const mcpUrl = session.mcp.url;
  console.log(`Composio Tool Router MCP URL: ${mcpUrl}`);

  const server = mcp({
    url: mcpUrl,
    clientName: "composio_tool_router_with_llamaindex",
    requestInit: {
      headers: {
        "x-api-key": COMPOSIO_API_KEY!,
      },
    },
    // verbose: true,
  });

  const tools = await server.tools();

  const llm = openai({ apiKey: OPENAI_API_KEY, model: "gpt-5" });

  const agent = createAgent({
    name: "composio_tool_router_with_llamaindex",
    description:
      "An agent that uses Composio Tool Router MCP tools to perform actions.",
    systemPrompt:
      "You are a helpful assistant connected to Composio Tool Router. " +
      "Use the available tools to answer user queries and perform Google BigQuery actions.",
    llm,
    tools,
  });

  return agent;
}

async function chatLoop(agent: ReturnType<typeof createAgent>) {
  const rl = readline.createInterface({ input, output });

  console.log("Type 'quit' or 'exit' to stop.");

  while (true) {
    let userInput: string;

    try {
      userInput = (await rl.question("\nYou: ")).trim();
    } catch {
      console.log("\nAgent: Bye!");
      break;
    }

    if (!userInput) {
      continue;
    }

    const lower = userInput.toLowerCase();
    if (lower === "quit" || lower === "exit") {
      console.log("Agent: Bye!");
      break;
    }

    try {
      process.stdout.write("Agent: ");

      const stream = agent.runStream(userInput);
      let finalResult: any = null;

      for await (const event of stream) {
        // The event.data contains the streamed content
        const data: any = event.data;

        // Check for streaming delta content
        if (data?.delta) {
          process.stdout.write(data.delta);
        }

        // Store final result for fallback
        if (data?.result || data?.message) {
          finalResult = data;
        }
      }

      // If no streaming happened, show the final result
      if (finalResult) {
        const answer =
          finalResult.result ??
          finalResult.message?.content ??
          finalResult.message ??
          "";
        if (answer && typeof answer === "string" && !answer.includes("[object")) {
          process.stdout.write(answer);
        }
      }

      console.log(); // New line after streaming completes
    } catch (err: any) {
      console.error("\nAgent error:", err?.message ?? err);
    }
  }

  rl.close();
}

async function main() {
  try {
    const agent = await buildAgent();
    await chatLoop(agent);
  } catch (err: any) {
    console.error("Failed to start agent:", err?.message ?? err);
    process.exit(1);
  }
}

main();
```

## Conclusion

You've successfully connected Google BigQuery to LlamaIndex through Composio's Tool Router MCP layer.

Key takeaways:
- Tool Router dynamically exposes Google BigQuery tools through an MCP endpoint
- LlamaIndex's ReActAgent handles reasoning and orchestration; Composio handles integrations
- The agent becomes more capable without increasing prompt size
- Async Python provides clean, efficient execution of agent workflows

You can extend this to other toolkits such as Gmail, Notion, Stripe, and GitHub by adding them to the toolkits parameter.
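
To make the toolkits extension concrete, here is a minimal sketch of assembling the list passed to the `toolkits` parameter. Only the `googlebigquery` slug appears in this guide; `gmail` and `github` are assumed slugs, so check each toolkit's page for the exact value. The `build_toolkits` helper and its lowercase normalization are illustrative choices, not part of Composio's API.

```python
from typing import Iterable, List

BASE_TOOLKIT = "googlebigquery"  # slug used throughout this guide


def build_toolkits(extra: Iterable[str] = ()) -> List[str]:
    """Return toolkit slugs with BigQuery first and duplicates removed."""
    seen = set()
    toolkits = []
    for slug in (BASE_TOOLKIT, *extra):
        # Assumption: slugs are lowercase with no surrounding whitespace.
        normalized = slug.strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            toolkits.append(normalized)
    return toolkits


# "gmail" and "github" are assumed slugs, used here only as examples.
toolkits = build_toolkits(["gmail", "github", "gmail"])
print(toolkits)

# The list would then replace the literal in build_agent():
#   composio_client.create(user_id=COMPOSIO_USER_ID, toolkits=toolkits)
```

Keeping BigQuery first and removing duplicates means the session request stays stable even when the extra slugs come from configuration or user input.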

## How to build Google BigQuery MCP Agent with another framework

- [ChatGPT](https://composio.dev/toolkits/googlebigquery/framework/chatgpt)
- [OpenAI Agents SDK](https://composio.dev/toolkits/googlebigquery/framework/open-ai-agents-sdk)
- [Claude Agent SDK](https://composio.dev/toolkits/googlebigquery/framework/claude-agents-sdk)
- [Claude Code](https://composio.dev/toolkits/googlebigquery/framework/claude-code)
- [Claude Cowork](https://composio.dev/toolkits/googlebigquery/framework/claude-cowork)
- [Codex](https://composio.dev/toolkits/googlebigquery/framework/codex)
- [Cursor](https://composio.dev/toolkits/googlebigquery/framework/cursor)
- [VS Code](https://composio.dev/toolkits/googlebigquery/framework/vscode)
- [OpenCode](https://composio.dev/toolkits/googlebigquery/framework/opencode)
- [OpenClaw](https://composio.dev/toolkits/googlebigquery/framework/openclaw)
- [Hermes](https://composio.dev/toolkits/googlebigquery/framework/hermes-agent)
- [CLI](https://composio.dev/toolkits/googlebigquery/framework/cli)
- [Google ADK](https://composio.dev/toolkits/googlebigquery/framework/google-adk)
- [LangChain](https://composio.dev/toolkits/googlebigquery/framework/langchain)
- [Vercel AI SDK](https://composio.dev/toolkits/googlebigquery/framework/ai-sdk)
- [Mastra AI](https://composio.dev/toolkits/googlebigquery/framework/mastra-ai)
- [CrewAI](https://composio.dev/toolkits/googlebigquery/framework/crew-ai)

## Related Toolkits

- [Firecrawl](https://composio.dev/toolkits/firecrawl) - Firecrawl automates large-scale web crawling and data extraction. It helps organizations efficiently gather, index, and analyze content from online sources.
- [Tavily](https://composio.dev/toolkits/tavily) - Tavily offers powerful search and data retrieval from documents, databases, and the web. It helps teams locate and filter information instantly, saving hours on research.
- [Exa](https://composio.dev/toolkits/exa) - Exa is a data extraction and search platform for gathering and analyzing information from websites, APIs, or databases. It helps teams quickly surface insights and automate data-driven workflows.
- [Serpapi](https://composio.dev/toolkits/serpapi) - SerpApi is a real-time API for structured search engine results. It lets you automate SERP data collection, parsing, and analysis for SEO and research.
- [Peopledatalabs](https://composio.dev/toolkits/peopledatalabs) - Peopledatalabs delivers B2B data enrichment and identity resolution APIs. Supercharge your apps with accurate, up-to-date business and contact data.
- [Snowflake](https://composio.dev/toolkits/snowflake) - Snowflake is a cloud data warehouse built for elastic scaling, secure data sharing, and fast SQL analytics across major clouds.
- [Posthog](https://composio.dev/toolkits/posthog) - PostHog is an open-source analytics platform for tracking user interactions and product metrics. It helps teams refine features, analyze funnels, and reduce churn with actionable insights.
- [Amplitude](https://composio.dev/toolkits/amplitude) - Amplitude is a digital analytics platform for product and behavioral data insights. It helps teams analyze user journeys and make data-driven decisions quickly.
- [Bright Data MCP](https://composio.dev/toolkits/brightdata_mcp) - Bright Data MCP is an AI-powered web scraping and data collection platform. Instantly access public web data in real time with advanced scraping tools.
- [Browseai](https://composio.dev/toolkits/browseai) - Browseai is a web automation and data extraction platform that turns any website into an API. It's perfect for monitoring websites and retrieving structured data without manual scraping.
- [ClickHouse](https://composio.dev/toolkits/clickhouse) - ClickHouse is an open-source, column-oriented database for real-time analytics and big data processing using SQL. Its lightning-fast query performance makes it ideal for handling large datasets and delivering instant insights.
- [Coinmarketcal](https://composio.dev/toolkits/coinmarketcal) - CoinMarketCal is a community-powered crypto calendar for upcoming events, announcements, and releases. It helps traders track market-moving developments and stay ahead in the crypto space.
- [Control d](https://composio.dev/toolkits/control_d) - Control d is a customizable DNS filtering and traffic redirection platform. It helps you manage internet access, enforce policies, and monitor usage across devices and networks.
- [Databox](https://composio.dev/toolkits/databox) - Databox is a business analytics platform that connects your data from any tool and device. It helps you track KPIs, build dashboards, and discover actionable insights.
- [Databricks](https://composio.dev/toolkits/databricks) - Databricks is a unified analytics platform for big data and AI on the lakehouse architecture. It empowers data teams to collaborate, analyze, and build scalable solutions efficiently.
- [Datagma](https://composio.dev/toolkits/datagma) - Datagma delivers data intelligence and analytics for business growth and market discovery. Get actionable market insights and track competitors to inform your strategy.
- [Delighted](https://composio.dev/toolkits/delighted) - Delighted is a customer feedback platform based on the Net Promoter System®. It helps you quickly gather, track, and act on customer sentiment.
- [Dovetail](https://composio.dev/toolkits/dovetail) - Dovetail is a research analysis platform for transcript review and insight generation. It helps teams code interviews, analyze feedback, and create actionable research summaries.
- [Dub](https://composio.dev/toolkits/dub) - Dub is a short link management platform with analytics and API access. Use it to easily create, manage, and track branded short links for your business.
- [Elasticsearch](https://composio.dev/toolkits/elasticsearch) - Elasticsearch is a distributed, RESTful search and analytics engine for all types of data. It delivers fast, scalable search and powerful analytics across massive datasets.

## Frequently Asked Questions

### What are the differences between Tool Router MCP and Google BigQuery MCP?

With a standalone Google BigQuery MCP server, agents and LLMs can access only the fixed set of Google BigQuery tools tied to that server. With the Composio Tool Router, agents can dynamically load tools from Google BigQuery and many other apps based on the task at hand, all through a single MCP endpoint.

### Can I use Tool Router MCP with LlamaIndex?

Yes, you can. LlamaIndex fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Google BigQuery tools.

### Can I manage the permissions and scopes for Google BigQuery while using Tool Router?

Yes, absolutely. You can configure which Google BigQuery scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

### How safe is my data with Composio Tool Router?

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Google BigQuery data and credentials are handled as safely as possible.

---
[See all toolkits](https://composio.dev/toolkits) · [Composio docs](https://docs.composio.dev/llms.txt)