How to integrate Census bureau MCP with LlamaIndex


Introduction

This guide walks you through connecting Census bureau to LlamaIndex using the Composio tool router. By the end, you'll have a working Census bureau agent that can get the latest population estimate for Los Angeles County, list the top industries in Texas by employment, and fetch the 5-year ACS median income for Chicago, all through natural-language commands.

Along the way, you'll see how Composio's Census bureau MCP server gives your LlamaIndex agent direct access to Census bureau tools.

Before we dive in, let's take a quick look at the key ideas and tools involved.

TL;DR

Here's what you'll learn:
  • Set your OpenAI and Composio API keys
  • Install LlamaIndex and Composio packages
  • Create a Composio Tool Router session for Census bureau
  • Connect LlamaIndex to the Census bureau MCP server
  • Build a Census bureau-powered agent using LlamaIndex
  • Interact with Census bureau through natural language

What is LlamaIndex?

LlamaIndex is a data framework for building LLM applications. It provides tools for connecting LLMs to external data sources and services through agents and tools.

Key features include:

  • ReAct Agent: Reasoning and acting pattern for tool-using agents
  • MCP Tools: Native support for Model Context Protocol
  • Context Management: Maintain conversation context across interactions
  • Async Support: Built for async/await patterns

What is the Census bureau MCP server, and what's possible with it?

The Census bureau MCP server is an implementation of the Model Context Protocol that connects AI agents and assistants such as Claude and Cursor directly to Census Bureau data resources. It provides structured and secure access to a wide range of U.S. demographic, business, and community datasets, so your agent can retrieve population statistics, analyze survey results, fetch business patterns, and explore census variables on your behalf.

  • Retrieve up-to-date population estimates: Have your agent pull the latest demographic and population data for specific states, counties, or cities using the Population Estimates Program (PEP).
  • Analyze American Community Survey results: Access detailed ACS 1-year and 5-year estimates for any geography, helping you understand community trends, housing, and economic data.
  • Explore business statistics by region: Automatically fetch County Business Patterns (CBP) and Annual Business Survey (ABS) data to examine local industry and employment trends.
  • Access decennial census data: Let your agent retrieve variables and statistics from the decennial census by vintage and dataset for deep historical and demographic analysis.
  • Investigate variable metadata and definitions: Effortlessly obtain detailed information about any census variable, including descriptions, data types, and valid values for more informed analysis.
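
To make the ACS bullet more concrete, here is a minimal sketch of the kind of request the public Census Data API accepts for a 5-year ACS query. It illustrates the underlying data source, not how the MCP server is implemented. The helper name `build_acs5_url` is hypothetical, and the variable `B19013_001E` (median household income) and the Chicago place code (`14000` in state `17`) are assumptions you should confirm against the dataset's variable and geography listings.

```python
from urllib.parse import urlencode

CENSUS_API_BASE = "https://api.census.gov/data"


def build_acs5_url(year, variables, for_clause, in_clause=None):
    """Build a query URL for the ACS 5-year dataset (hypothetical helper)."""
    params = [("get", ",".join(variables)), ("for", for_clause)]
    if in_clause:
        params.append(("in", in_clause))
    # safe=":,*" leaves geography predicates and variable lists unescaped
    query = urlencode(params, safe=":,*")
    return f"{CENSUS_API_BASE}/{year}/acs/acs5?{query}"


# Assumed example: median household income for the city of Chicago
url = build_acs5_url(2022, ["NAME", "B19013_001E"], "place:14000", "state:17")
print(url)
```

With the agent in place you would not build this URL yourself; the sketch only shows what a tool such as Get ACS 5-Year Estimates needs to know: a vintage, a list of variables, and a geography.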

Supported Tools & Triggers

Tools
  • Geocode Address: Tool to geocode a single address to get latitude/longitude coordinates.
  • Geocode Address for Census Geographies: Geocode an address and return Census geography identifiers including state, county, tract, block group, and block FIPS codes.
  • Geocode Address Parts: Tool to geocode an address using separate components (street, city, state, ZIP) to get latitude/longitude coordinates.
  • Geocode Address with Geography: Tool to geocode an address and return both coordinates and Census geography information.
  • Geocode Coordinates: Reverse geocode latitude/longitude coordinates to Census geographic areas.
  • Geocode Puerto Rico Address with Geography: Tool to geocode a Puerto Rico address and return coordinates plus Census geography data.
  • Batch Geocode Addresses with Geographies: Batch geocode multiple addresses from a CSV file and return Census geography codes.
  • Geocode Puerto Rico Address: Tool to geocode a Puerto Rico address with urbanization to latitude/longitude coordinates.
  • Get ACS 1-Year Estimates: Tool to retrieve 1-year American Community Survey (ACS) estimates for a specified geography.
  • Get ACS 5-Year Estimates: Retrieve 5-year American Community Survey (ACS) estimates from the U.
  • Get Community Resilience Estimates: Retrieve U.
  • Get County Business Patterns: Tool to retrieve County Business Patterns (CBP) data for a specified year.
  • Get Dataset Examples HTML: Tool to retrieve example queries for a Census dataset in HTML format.
  • Get Dataset Examples JSON: Tool to retrieve example API query patterns for a specific Census dataset and vintage.
  • Get Dataset Examples (XML): Tool to retrieve example queries for a Census Bureau dataset in XML format.
  • Get Dataset Geography HTML: Tool to retrieve available geographies for a Census dataset in HTML format.
  • Get Dataset Geography JSON: Tool to get the list of supported geography levels for a specific Census dataset with their hierarchy and required predicates.
  • Get Dataset Geography XML: Tool to retrieve available geographies for a Census Bureau dataset in XML format.
  • Get Dataset Groups: Tool to retrieve the list of table groups for a Census dataset and vintage.
  • Get Dataset Sorts: Tool to list available sort options for a specific Census dataset and vintage.
  • Get Dataset Tags: Tool to list available tags/keywords for a specific Census dataset and vintage.
  • Get Dataset Variables JSON: Tool to retrieve the complete list of available variables for a specific Census dataset.
  • Get Decennial Census Data: Retrieve Decennial Census data (population, demographics, housing) from the U.
  • Get Planning Database Data: Get Planning Database (PDB) data containing Census tract and block group level data useful for planning.
  • Get Population Estimates: Retrieves Population Estimates Program (PEP) data from the US Census Bureau API.
  • Get TIGERweb ACS Generalized Boundaries: Tool to access generalized ACS (American Community Survey) boundary services from TIGERweb for specific survey years (2012-2024).
  • Get TIGERweb Map Service Metadata: Tool to retrieve TIGERweb MapServer service metadata including available layers, capabilities, and spatial reference information.
  • Get Timeseries Examples HTML: Tool to retrieve HTML-formatted example queries for a Census Bureau timeseries dataset.
  • Get Timeseries Examples JSON: Tool to get example queries for a timeseries dataset in JSON format.
  • Get Timeseries Examples XML: Tool to retrieve example queries for a Census Bureau timeseries dataset in XML format.
  • Get Timeseries Geography HTML: Tool to retrieve available FIPS geographies for a timeseries dataset in HTML format.
  • Get Timeseries Geography JSON: Tool to get available geographies for a timeseries dataset in JSON format.
  • Get Timeseries Geography XML: Tool to retrieve available geographies for a Census Bureau timeseries dataset in XML format.
  • Get Timeseries Variables HTML: Tool to retrieve a list of available variables for a Census timeseries dataset in HTML format.
  • Get Timeseries Variables JSON: Tool to get a list of variables available for a timeseries dataset in JSON format.
  • Get Timeseries Variables XML: Tool to get a list of variables available for a timeseries dataset in XML format.
  • Get Variable Details: Tool to retrieve metadata for a specific variable in a Census dataset for a given year.
  • List Available Datasets: Lists all available Census Bureau datasets with their metadata, vintages, and API endpoints.
  • List Datasets HTML: Tool to retrieve a complete HTML listing of all available (non-timeseries) Census Bureau datasets.
  • List Datasets XML: Tool to retrieve a list of all available Census Bureau datasets in XML format.
  • List Geocoder Benchmarks: List all available benchmark versions for the Census Bureau geocoding service.
  • List Geocoder Vintages: Tool to list available geography vintages for a given Census geocoder benchmark.
  • List TIGERweb Services: Tool to discover all available TIGERweb map services for Census geographic boundaries.
  • List Timeseries Datasets (HTML): Tool to retrieve a list of all available timeseries datasets from the US Census Bureau API in HTML format.
  • List Timeseries Datasets (JSON): Tool to list all available timeseries datasets from the US Census Bureau API.
  • List Timeseries Datasets (XML): Tool to retrieve a list of all available Census Bureau timeseries datasets in XML format.
  • Query ACS Supplemental Estimates: Query ACS Supplemental Estimates data by variables and geography.
  • Query ACS Comparison Profiles: Query ACS Comparison Profiles data by variables and geography.
  • Query ACS Migration Flows: Tool to query American Community Survey (ACS) Migration Flows data by variables and geography.
  • Query ACS Data Profile: Tool to query ACS Data Profiles by variables and geography.
  • Query ACS Selected Population Profiles: Tool to query ACS Selected Population Profiles (SPP) data by variables and geography for specific population groups.
  • Query ACS Subject Tables: Tool to query ACS Subject Tables data by variables and geography.
  • Query Annual Business Survey: Tool to query Annual Business Survey Company Summary (abscs) data with demographic filters.
  • Query Commodity Flow Survey: Query Commodity Flow Survey data on freight shipments by origin, destination, mode, and commodity.
  • Query CPS Survey Data: Tool to query Current Population Survey (CPS) microdata including basic monthly employment data and supplemental surveys.
  • Query Decennial DHC: Tool to query Decennial Census Demographic and Housing Characteristics (DHC) data by variables and geography.
  • Query Decennial Census Demographic Profile: Tool to query Decennial Census Demographic Profile data by variables and geography.
  • Query Decennial Census P.L. Redistricting Data: Tool to query Decennial Census P.
  • Query Economic Census Data: Tool to query Economic Census data including establishments, employment, payroll, and receipts by geography and industry (NAICS).
  • Query International Trade Timeseries: Tool to query International Trade timeseries data from Census Bureau API.
  • Query Nonemployer Statistics: Tool to query Nonemployer Statistics data covering businesses with no paid employees.
  • Query PEP CharAgeGroups: Query population estimates by age groups, sex, race, and Hispanic origin from the Census Bureau PEP CharAgeGroups dataset.
  • Query PEP Components: Query components of population change from the Census Bureau Population Estimates Program (PEP).
  • Query PEP Housing Estimates: Query housing unit estimates from the US Census Bureau Population Estimates Program (PEP).
  • Query Population Projections: Query population projections from the Census Bureau API.
  • Query Surname Data: Query surname frequency data from the U.
  • Query TIGERweb Layer: Tool to query TIGERweb GeoServices for Census geographic boundaries and features.
  • Query Business Dynamics Statistics: Query Business Dynamics Statistics (BDS) time series data from the Census Bureau.
  • Query Timeseries Data: Query Census timeseries datasets containing longitudinal data for multiple time periods.
  • Query Economic Indicators Time Series: Tool to query Economic Indicators Time Series (EITS) data from the US Census Bureau.
  • Query Residential Construction Stats: Tool to query Residential Construction statistics from Census Bureau Economic Indicators Time Series (EITS).
  • Query Residential Sales Data: Query Residential Sales statistics from Census Bureau's Economic Indicator Time Series (EITS).
  • Query Health Insurance Estimates: Query Small Area Health Insurance Estimates (SAHIE) from the Census Bureau timeseries API.
  • Query Household Pulse Survey Timeseries: Tool to query Household Pulse Survey (HPS) timeseries data measuring household experiences during the COVID-19 pandemic.
  • Query International Database: Query International Database (IDB) demographic data for 227 countries and areas worldwide.
  • Query Timeseries International Trade Exports by HS: Tool to query international trade exports by Harmonized System code from Census Bureau time series API.
  • Query Timeseries International Trade Imports by End Use: Query international trade imports by end-use category from Census Bureau timeseries data.
  • Query Timeseries Poverty: Query poverty statistics from the Census Bureau's timeseries poverty datasets.
  • Query QWI Timeseries Data: Query Quarterly Workforce Indicators (QWI) timeseries data on employment, earnings, and job flows.
  • Query Timeseries QWI State/Area: Query Quarterly Workforce Indicators (QWI) State/Area characteristics from the Census Bureau's time series API.
  • Query ZIP Business Patterns: Tool to query ZIP Code Business Patterns (ZBP) data including establishments and employment by ZIP code and industry.
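
Many of the query tools above wrap the Census Data API, which returns JSON as an array of arrays with the column names in the first row. If a tool passes that raw shape through (an assumption; the MCP server may reshape results), a small helper like the hypothetical `rows_to_records` below can turn it into dictionaries. The payload uses dummy placeholder values purely to show the shape.

```python
import json


def rows_to_records(payload):
    """Convert a header-first array-of-arrays payload into a list of dicts."""
    rows = json.loads(payload) if isinstance(payload, str) else payload
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


# Placeholder payload with dummy values, not real Census figures.
sample = '[["NAME", "B01003_001E", "state"], ["Example State", "123", "99"]]'
records = rows_to_records(sample)
print(records)
```

Note that values arrive as strings, so numeric columns need an explicit conversion before any arithmetic.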

What is the Composio tool router, and how does it fit here?

What is Composio SDK?

The Composio SDK helps agents find the right tools for a task at runtime. You can plug in multiple toolkits (like Gmail, HubSpot, and GitHub), and the agent will identify the relevant app and action to complete multi-step workflows. This can reduce token usage and improve the reliability of tool calls. Read more here: Getting started with Composio SDK

The tool router generates a secure MCP URL that your agents can access to perform actions.

How the Composio SDK works

The Composio SDK follows a three-phase workflow:

  1. Discovery: Searches for tools matching your task and returns relevant toolkits with their details.
  2. Authentication: Checks for active connections. If missing, creates an auth config and returns a connection URL via Auth Link.
  3. Execution: Executes the action using the authenticated connection.
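
The three phases can be pictured with a toy, in-memory model. This is only a sketch of the control flow described above: `ToyRouter` and its methods are hypothetical names and are not part of the Composio SDK, and the tool is a stand-in lambda.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRouter:
    """In-memory stand-in for the three phases; not the Composio API."""
    tools: dict  # tool name -> (toolkit, callable)
    connected: set = field(default_factory=set)  # (user_id, toolkit) pairs

    def discover(self, keyword):
        # Phase 1: find tools whose name mentions the keyword.
        return [name for name in self.tools if keyword.lower() in name.lower()]

    def authenticate(self, user_id, toolkit):
        # Phase 2: reuse an active connection, or create one.
        # A real flow would return a link for the user to complete;
        # the toy marks the connection as established immediately.
        if (user_id, toolkit) in self.connected:
            return "active"
        self.connected.add((user_id, toolkit))
        return "connection_created"

    def execute(self, user_id, tool_name, **kwargs):
        # Phase 3: run the tool only over an authenticated connection.
        toolkit, fn = self.tools[tool_name]
        if (user_id, toolkit) not in self.connected:
            raise PermissionError(f"{user_id} is not connected to {toolkit}")
        return fn(**kwargs)


router = ToyRouter(
    tools={"GEOCODE_ADDRESS": ("census_bureau", lambda address: f"geocoded:{address}")}
)
matches = router.discover("geocode")
status = router.authenticate("user-1", "census_bureau")
result = router.execute("user-1", matches[0], address="1 Main St")
print(matches, status, result)
```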

Step-by-step Guide

Prerequisites

Before you begin, make sure you have:
  • Python 3.8 or higher installed
  • A Composio account with the API key
  • An OpenAI API key
  • A Census bureau account and project
  • Basic familiarity with async Python

Getting API Keys for OpenAI, Composio, and Census bureau

OpenAI API key (OPENAI_API_KEY)
  • Go to the OpenAI dashboard
  • Create an API key if you don't have one
  • Assign it to OPENAI_API_KEY in .env
Composio API key and user ID
  • Log into the Composio dashboard
  • Copy your API key from Settings
    • Use this as COMPOSIO_API_KEY
  • Pick a stable user identifier (email or ID)
    • Use this as COMPOSIO_USER_ID

Installing dependencies

Create a new Python project and install the necessary dependencies:

pip install composio-llamaindex llama-index llama-index-llms-openai llama-index-tools-mcp python-dotenv

What each package provides:

  • composio-llamaindex: Composio's LlamaIndex integration
  • llama-index: Core LlamaIndex framework
  • llama-index-llms-openai: OpenAI LLM integration
  • llama-index-tools-mcp: MCP client for LlamaIndex
  • python-dotenv: Environment variable management

Set environment variables

Create a .env file in your project root:

OPENAI_API_KEY=your-openai-api-key
COMPOSIO_API_KEY=your-composio-api-key
COMPOSIO_USER_ID=your-user-id

These credentials will be used to:

  • Authenticate with OpenAI's GPT-5 model
  • Connect to Composio's Tool Router
  • Identify your Composio user session for Census bureau access

Import modules

Create a new file called census_bureau_llamaindex_agent.py and import the required modules:

import asyncio
import os
import signal  # used by the Ctrl+C handler in the entry point
import dotenv

from composio import Composio
from composio_llamaindex import LlamaIndexProvider
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

# Load variables from the .env file into the process environment
dotenv.load_dotenv()

Key imports:

  • asyncio: For async/await support
  • Composio: Main client for Composio services
  • LlamaIndexProvider: Adapts Composio tools for LlamaIndex
  • ReActAgent: LlamaIndex's reasoning and action agent
  • BasicMCPClient: Connects to MCP endpoints
  • McpToolSpec: Converts MCP tools to LlamaIndex format

Load environment variables and initialize Composio

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set in the environment")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set in the environment")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set in the environment")

What's happening:

This ensures missing credentials cause early, clear errors before the agent attempts to initialise.
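
If you prefer to see every missing variable at once rather than one error per run, the same check can be folded into a helper. This is a minimal sketch; `require_env` is a hypothetical name, and the `environ` parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def require_env(names, environ=None):
    """Return the requested variables, or raise one error listing all that are missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Placeholder values, passed explicitly so the example does not depend on your shell.
settings = require_env(
    ["OPENAI_API_KEY", "COMPOSIO_API_KEY"],
    environ={"OPENAI_API_KEY": "sk-placeholder", "COMPOSIO_API_KEY": "placeholder"},
)
print(sorted(settings))
```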

Create a Tool Router session and build the agent function

async def build_agent() -> ReActAgent:
    composio_client = Composio(
        api_key=COMPOSIO_API_KEY,
        provider=LlamaIndexProvider(),
    )

    session = composio_client.create(
        user_id=COMPOSIO_USER_ID,
        toolkits=["census_bureau"],
    )

    mcp_url = session.mcp.url
    print(f"Composio MCP URL: {mcp_url}")

    mcp_client = BasicMCPClient(mcp_url, headers={"x-api-key": COMPOSIO_API_KEY})
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-5")

    description = "An agent that uses Composio Tool Router MCP tools to perform Census bureau actions."
    system_prompt = """
    You are a helpful assistant connected to Composio Tool Router.
    Use the available tools to answer user queries and perform Census bureau actions.
    """
    return ReActAgent(tools=tools, llm=llm, description=description, system_prompt=system_prompt, verbose=True)

What's happening here:

  • We create a Composio client using your API key and configure it with the LlamaIndex provider
  • We then create a tool router MCP session for your user, specifying the toolkits we want to use (in this case, census bureau)
  • The session returns an MCP HTTP endpoint URL that acts as a gateway to all your configured tools
  • LlamaIndex will connect to this endpoint to dynamically discover and use the available Census bureau tools.
  • The MCP tools are converted to LlamaIndex-compatible tools and plugged into the agent.
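
Before handing the tools to the agent, it can help to print what was actually discovered. The sketch below assumes each tool exposes `metadata.name` and `metadata.description`, as LlamaIndex tool objects generally do; `summarize_tools` is a hypothetical helper, and the stand-in objects let it run without a live MCP session.

```python
from types import SimpleNamespace


def summarize_tools(tools, max_len=60):
    """Return sorted 'name: description' lines for tools exposing metadata."""
    lines = []
    for tool in tools:
        name = tool.metadata.name
        desc = (tool.metadata.description or "").strip().replace("\n", " ")
        if len(desc) > max_len:
            desc = desc[: max_len - 3] + "..."
        lines.append(f"{name}: {desc}")
    return sorted(lines)


# Stand-in objects so the sketch runs without connecting to Composio.
fake_tools = [
    SimpleNamespace(metadata=SimpleNamespace(name="B_TOOL", description="Second tool")),
    SimpleNamespace(metadata=SimpleNamespace(name="A_TOOL", description="First tool")),
]
summary = summarize_tools(fake_tools)
print("\n".join(summary))
```

Inside build_agent you could call it on the list returned by to_tool_list_async() to confirm the session exposes the tools you expect.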

Create an interactive chat loop

async def chat_loop(agent: ReActAgent) -> None:
    ctx = Context(agent)
    print("Type 'quit', 'exit', or Ctrl+C to stop.")

    while True:
        try:
            user_input = input("\nYou: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nBye!")
            break

        if not user_input or user_input.lower() in {"quit", "exit"}:
            print("Bye!")
            break

        try:
            print("Agent: ", end="", flush=True)
            handler = agent.run(user_input, ctx=ctx)

            async for event in handler.stream_events():
                # Stream token-by-token from LLM responses
                if hasattr(event, "delta") and event.delta:
                    print(event.delta, end="", flush=True)
                # Show tool calls as they happen
                elif hasattr(event, "tool_name"):
                    print(f"\n[Using tool: {event.tool_name}]", flush=True)

            # Get final response
            response = await handler
            print()  # Newline after streaming
        except KeyboardInterrupt:
            print("\n[Interrupted]")
            continue
        except Exception as e:
            print(f"\nError: {e}")

What's happening here:

  • We're creating a terminal interface for chatting with the Census bureau agent
  • The LLM's responses are streamed to the CLI for faster interaction.
  • The agent uses context to maintain conversation history
  • You can type 'quit' or 'exit' to stop the chat loop gracefully
  • Agent responses and any errors are displayed in a clear, readable format
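
The event handling inside the loop can be pulled into a small function, which makes it easier to test without running an agent. This sketch mirrors the attribute checks used in the loop (`delta` and `tool_name`); `render_event` is a hypothetical name, and the events are stand-in objects rather than real LlamaIndex event classes.

```python
from types import SimpleNamespace


def render_event(event):
    """Map a streamed event to the text the chat loop would print (empty if nothing)."""
    delta = getattr(event, "delta", None)
    if delta:
        # Token chunk from the LLM response
        return delta
    tool_name = getattr(event, "tool_name", None)
    if tool_name:
        # Tool call notification
        return f"\n[Using tool: {tool_name}]\n"
    return ""


# Stand-in events: a text chunk, a tool call, and an event with neither attribute.
events = [
    SimpleNamespace(delta="Hello"),
    SimpleNamespace(tool_name="EXAMPLE_TOOL"),
    SimpleNamespace(),
]
rendered = [render_event(e) for e in events]
print("".join(rendered))
```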

Define the main entry point

async def main() -> None:
    agent = await build_agent()
    await chat_loop(agent)

if __name__ == "__main__":
    # Handle Ctrl+C gracefully
    signal.signal(signal.SIGINT, lambda s, f: (print("\nBye!"), exit(0)))
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nBye!")

What's happening here:

  • We're orchestrating the entire application flow
  • The agent is built once at startup, and Ctrl+C is handled so the program exits cleanly
  • Then we kick off the interactive chat loop so you can start talking to Census bureau
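
If you also want a non-interactive mode, one option is to treat command-line arguments as a single prompt and fall back to the chat loop when none are given. The helper below is a sketch of that decision only; `prompt_from_argv` is a hypothetical name and is not part of the original script.

```python
def prompt_from_argv(argv):
    """Return a one-shot prompt from command-line arguments, or None for interactive mode."""
    text = " ".join(argv[1:]).strip()
    return text or None


one_shot = prompt_from_argv(["agent.py", "latest", "population", "estimate"])
interactive = prompt_from_argv(["agent.py"])
print(one_shot, interactive)
```

In main(), you could call it with sys.argv and, when it returns a prompt, await agent.run(prompt) and print the response instead of starting chat_loop.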

Run the agent

python census_bureau_llamaindex_agent.py

When prompted, authenticate and authorise your agent with Census bureau, then start asking questions.

Complete Code

Here's the complete code to get you started with Census bureau and LlamaIndex:

import asyncio
import os
import signal
import dotenv

from composio import Composio
from composio_llamaindex import LlamaIndexProvider
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

dotenv.load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMPOSIO_API_KEY = os.getenv("COMPOSIO_API_KEY")
COMPOSIO_USER_ID = os.getenv("COMPOSIO_USER_ID")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set")
if not COMPOSIO_API_KEY:
    raise ValueError("COMPOSIO_API_KEY is not set")
if not COMPOSIO_USER_ID:
    raise ValueError("COMPOSIO_USER_ID is not set")

async def build_agent() -> ReActAgent:
    composio_client = Composio(
        api_key=COMPOSIO_API_KEY,
        provider=LlamaIndexProvider(),
    )

    session = composio_client.create(
        user_id=COMPOSIO_USER_ID,
        toolkits=["census_bureau"],
    )

    mcp_url = session.mcp.url
    print(f"Composio MCP URL: {mcp_url}")

    mcp_client = BasicMCPClient(mcp_url, headers={"x-api-key": COMPOSIO_API_KEY})
    mcp_tool_spec = McpToolSpec(client=mcp_client)
    tools = await mcp_tool_spec.to_tool_list_async()

    llm = OpenAI(model="gpt-5")
    description = "An agent that uses Composio Tool Router MCP tools to perform Census bureau actions."
    system_prompt = """
    You are a helpful assistant connected to Composio Tool Router.
    Use the available tools to answer user queries and perform Census bureau actions.
    """
    return ReActAgent(
        tools=tools,
        llm=llm,
        description=description,
        system_prompt=system_prompt,
        verbose=True,
    )

async def chat_loop(agent: ReActAgent) -> None:
    ctx = Context(agent)
    print("Type 'quit', 'exit', or Ctrl+C to stop.")

    while True:
        try:
            user_input = input("\nYou: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nBye!")
            break

        if not user_input or user_input.lower() in {"quit", "exit"}:
            print("Bye!")
            break

        try:
            print("Agent: ", end="", flush=True)
            handler = agent.run(user_input, ctx=ctx)

            async for event in handler.stream_events():
                # Stream token-by-token from LLM responses
                if hasattr(event, "delta") and event.delta:
                    print(event.delta, end="", flush=True)
                # Show tool calls as they happen
                elif hasattr(event, "tool_name"):
                    print(f"\n[Using tool: {event.tool_name}]", flush=True)

            # Get final response
            response = await handler
            print()  # Newline after streaming
        except KeyboardInterrupt:
            print("\n[Interrupted]")
            continue
        except Exception as e:
            print(f"\nError: {e}")

async def main() -> None:
    agent = await build_agent()
    await chat_loop(agent)

if __name__ == "__main__":
    # Handle Ctrl+C gracefully
    signal.signal(signal.SIGINT, lambda s, f: (print("\nBye!"), exit(0)))
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nBye!")

Conclusion

You've successfully connected Census bureau to LlamaIndex through Composio's Tool Router MCP layer. Key takeaways:
  • Tool Router dynamically exposes Census bureau tools through an MCP endpoint
  • LlamaIndex's ReActAgent handles reasoning and orchestration; Composio handles integrations
  • The agent becomes more capable without increasing prompt size
  • Async Python provides clean, efficient execution of agent workflows
You can easily extend this to other toolkits like Gmail, Notion, Stripe, GitHub, and more by adding them to the toolkits parameter.
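
One way to make the toolkit list configurable is to read it from a comma-separated string, for example an environment variable. This is a sketch under assumptions: `parse_toolkits` is a hypothetical helper, and any slug other than `census_bureau` (such as `gmail` here) should be checked against the names Composio actually uses.

```python
def parse_toolkits(raw, default=("census_bureau",)):
    """Split a comma-separated string into a de-duplicated, ordered list of toolkit slugs."""
    seen = []
    for item in (raw or "").split(","):
        slug = item.strip().lower()
        if slug and slug not in seen:
            seen.append(slug)
    return seen or list(default)


toolkits = parse_toolkits("census_bureau, gmail,census_bureau")
print(toolkits)
```

The resulting list can be passed as the toolkits argument when creating the session.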

FAQ

What are the differences between Tool Router MCP and Census bureau MCP?

With a standalone Census bureau MCP server, the agents and LLMs can only access a fixed set of Census bureau tools tied to that server. However, with the Composio Tool Router, agents can dynamically load tools from Census bureau and many other apps based on the task at hand, all through a single MCP endpoint.

Can I use Tool Router MCP with LlamaIndex?

Yes, you can. LlamaIndex fully supports MCP integration. You get structured tool calling, message history handling, and model orchestration while Tool Router takes care of discovering and serving the right Census bureau tools.

Can I manage the permissions and scopes for Census bureau while using Tool Router?

Yes, absolutely. You can configure which Census bureau scopes and actions are allowed when connecting your account to Composio. You can also bring your own OAuth credentials or API configuration so you keep full control over what the agent can do.

How safe is my data with Composio Tool Router?

All sensitive data such as tokens, keys, and configuration is fully encrypted at rest and in transit. Composio is SOC 2 Type 2 compliant and follows strict security practices so your Census bureau data and credentials are handled as safely as possible.
