Building a deep research agent using Composio and LangGraph

by Harsh · May 19, 2025 · 8 min read
AI Agents

In the last blog post, we learned how to build a local deep research assistant using open-source tools. It was a great start, but it had some limitations. This agent improves upon it:

  • It can focus on specific topics

  • It has broader toolkit support

  • You can save research summaries to Google Docs or Notion.

The agent can now do domain-specific research, connect easily to external apps, and even save your findings to Google Docs via Composio.

In this blog, I’ll show you how to build it yourself, step by step, with no advanced experience needed. Additionally, you can read about how LangGraph compares to other agent-based frameworks.

Let’s get started!

How It Works

Our local deep researcher agent will follow a 4-step process:

Topic Generation

  • The user enters the research topic and domain.

  • The LLM generates three yes/no research questions based on the input.

Topic Research

  • The agent node (LLM) analyzes the question and decides if a tool is needed.

  • If tool use is required, control passes to the tool node (e.g., Composio search).

  • The tool executes, returns results, and the agent refines the response.

  • The final answer for each question is stored.

Report Generation

  • After all questions are answered, the LLM compiles the answers into a structured HTML report.

  • The Google Docs tool is invoked to create a professional document.

Explicit Tool Call

  • The user can optionally ask follow-up questions or explicitly trigger report generation (a sanity check).

  • Whether tools were called can be verified from the logs generated during agent runs.

Here is a workflow image for better understanding:

This approach makes the process modular and easy to debug using the generated logs.

However, a couple of caveats here:

  • The architecture has been changed to support more fine-grained research on a topic.

  • The agent also includes a human in the loop for explicit tool calls.

  • For demo purposes, the blog only includes the backend / terminal version of the program.

  • Familiarity with Python and data-type validation is preferred.

Having understood the entire workflow, it’s time to build the tool.

Let’s Get Building!

Workspace

First things first: to keep things isolated, let’s create a separate workspace, create a virtual environment, and install all dependencies.

Create Virtual Environment

Head to the terminal and run the following commands one by one (lines beginning with # are comments):
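The setup commands presumably look like this (a sketch; adjust the activation line to your OS):

```shell
# create the project folder and enter it
mkdir deep-research-agent
cd deep-research-agent

# create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
```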

This will create a new folder, deep-research-agent. Navigate into it, create a new environment (.venv), and activate it.

Install Dependencies

Inside the editor, create a new file, requirements.txt, and paste in the following dependencies. Be careful not to remove the version numbers.
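A sketch of the file (package names follow from this post; composio_langchain and python-dotenv are my assumptions for the tool integration and .env loading; replace each x.y.z placeholder with the pinned versions from the gist):

```text
composio_core==x.y.z
composio_langchain==x.y.z
langgraph==x.y.z
langchain_ollama==x.y.z
python-dotenv==x.y.z
```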

We will use a handful of dependencies here, composio_core, langgraph, and langchain_ollama, which allow us to implement tool calling and run the LLM locally.

Setup Tools

Once the libraries are installed, head to the terminal and run:
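The commands are presumably the standard Composio CLI flow:

```shell
# log in to your Composio account (opens a browser for OAuth)
composio login

# connect the Google Docs tool
composio add googledocs
```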

Authenticate using OAuth when prompted.

This adds the Google Docs tool.

NOTE: If you are not logged in, you must log in to Composio and authenticate. Your API key must be present in the .env file.

Define Secrets

Next, in the root directory, create a new file called .env to store API keys.
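The file needs just one entry (the placeholder value is illustrative):

```text
COMPOSIO_API_KEY=your_composio_api_key_here
```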

You can get one for free by going to Composio and creating an API key. You may need to log in or sign up.

Set up Ollama

Now, set up Ollama by downloading it from the official site and installing it with the default configuration (important). Then, in the command line, run:
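Assuming the Qwen 3 (8B) model used later in this post, the command is likely:

```shell
# pulls the model on first run, loads it on subsequent runs
ollama run qwen3:8b
```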

This pulls the model if it is not already present; otherwise, it loads it. You can choose any model except the largest variants, which have massive compute requirements. The model used here is a 4-bit quantised 8B model.

We’re not done yet!

Finally, head to the environment variables and create a new environment variable:

  • OLLAMA_HOST with the value 127.0.0.1:3030, then save.
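If you prefer setting it from the shell (macOS/Linux), the equivalent is:

```shell
# add this to your shell profile to make it persistent
export OLLAMA_HOST=127.0.0.1:3030
```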

If you run into issues here, follow this guide.

Once done, you are now ready to write the main agent code.

Creating Research Agent

Create a new file, deep_research_agent.py. Follow along, or copy and paste the code from the deep_research_agent.py gist.

Anyway, if you are up for learning, here we go 🚀

We begin by importing all necessary libraries to build the research agent.

These include:

  • Standard libraries like os and typing,

  • Environment loading with dotenv,

  • LangGraph components for defining workflow logic,

  • LLM integration via langchain_ollama,

  • Tooling support with Composio.
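Based on the list above, the import block plausibly looks like this (the composio_langchain names are my assumption; adjust to the integration package you installed):

```python
import os
from typing import Annotated, TypedDict

from dotenv import load_dotenv

# LangGraph components for defining workflow logic
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode, tools_condition
from langgraph.checkpoint.memory import MemorySaver

# local LLM integration
from langchain_ollama import ChatOllama

# tooling support via Composio (assumed integration package)
from composio_langchain import ComposioToolSet, Action

load_dotenv()  # loads COMPOSIO_API_KEY (and friends) from .env
```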

After sorting dependencies, the next step is to define the state—a container for the data that flows between nodes.

In LangGraph, a State defines an agent’s system memory. Think of it as storage that maintains and tracks information as AI agents process data.

In this case, the code defines:

  • State, a TypedDict, holds the ongoing conversation (messages).

  • graph_builder then initialises the graph from the defined State schema.
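As a minimal sketch (the real code would use langgraph’s add_messages reducer; operator.add is a stdlib stand-in with the same list-append behaviour):

```python
from operator import add
from typing import Annotated, TypedDict


class State(TypedDict):
    # Running list of chat messages. The Annotated reducer tells
    # LangGraph to append new messages rather than overwrite the list.
    messages: Annotated[list, add]
```

You would then typically initialise the builder with `graph_builder = StateGraph(State)`.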

After preparing the graph structure, configure the language models and external tools.

Here’s what we’re doing:

  • Initializing two instances of the Qwen 3 (8B) model locally via Ollama

  • Registering tools for search and document generation using Composio

  • Binding tools to one model instance for tool-augmented responses

Here, you can choose from over 250 tools directly via Composio, allowing you to build far better agents.
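As a sketch of this setup (the search Action slug is my assumption; check the Composio dashboard for the exact names available to you):

```python
from composio_langchain import ComposioToolSet, Action
from langchain_ollama import ChatOllama

# two local Qwen 3 (8B) instances served by Ollama
llm = ChatOllama(model="qwen3:8b")
report_llm = ChatOllama(model="qwen3:8b")

# Composio reads COMPOSIO_API_KEY from the environment
toolset = ComposioToolSet()
tools = toolset.get_tools(actions=[
    Action.COMPOSIO_SEARCH_SEARCH,               # web search (assumed slug)
    Action.GOOGLEDOCS_CREATE_DOCUMENT_MARKDOWN,  # document creation
])

# bind the tools to one instance for tool-augmented responses
llm_with_tools = llm.bind_tools(tools)
```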

Once tools and models are in place, add the agent nodes.

Nodes define the processing unit in LangGraph.

They can be functional, class-based, tool-based, conditional, or runnable based on their specific purpose.

In this case:

  • The agent node interacts with the LLM and decides whether a tool call is needed (a functional node).

  • The tools node executes the selected tool action and returns the results (a tool node).
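In plain Python, the agent node follows this pattern (make_agent_node is a hypothetical helper shown for illustration; the tools node would be langgraph’s prebuilt ToolNode):

```python
def make_agent_node(llm_with_tools):
    """Build the agent node: it calls the tool-bound model on the running
    conversation and appends the model's reply to the state."""
    def agent(state):
        reply = llm_with_tools.invoke(state["messages"])
        return {"messages": [reply]}
    return agent
```

LangGraph merges the returned dict into the State using the reducer declared on `messages`, which is why the node returns a one-element list rather than mutating the state directly.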

Next, join the nodes to define how control flows across the graph and memory setup.

The edge in LangGraph determines the flow of the graph.

So, a conditional edge checks if a tool needs to be invoked. If yes, the agent passes the control to the tool node you defined earlier.

In the process:

  • Memory Saver saves the interaction in memory,

  • Multiple memory checkpoints are saved across runs,

  • and the Config object is created with the required key for the check-pointer, cast to RunnableConfig
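Continuing from the earlier snippets (this sketch assumes graph_builder, tools, and the agent function already exist), the wiring and memory setup might look like this:

```python
from typing import cast

from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START
from langgraph.prebuilt import ToolNode, tools_condition

graph_builder.add_node("agent", agent)
graph_builder.add_node("tools", ToolNode(tools))

graph_builder.add_edge(START, "agent")
# tools_condition routes to "tools" when the LLM requested a tool,
# otherwise it ends the run
graph_builder.add_conditional_edges("agent", tools_condition)
graph_builder.add_edge("tools", "agent")

# MemorySaver checkpoints the state after every step
graph = graph_builder.compile(checkpointer=MemorySaver())

# the checkpointer requires a thread_id key in the config
config = cast(RunnableConfig, {"configurable": {"thread_id": "research-1"}})
```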

With the architecture ready, you must define a clear, instructive prompt for the assistant.

This prompt sets tone, format, depth, and tool usage expectations.
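Something along these lines (the exact wording is illustrative, not the original prompt):

```python
# system prompt setting tone, answer format, and tool expectations
SYSTEM_PROMPT = (
    "You are a meticulous research assistant. Answer each research "
    "question with a clear Yes or No, followed by a short, well-sourced "
    "justification. Use the search tool whenever you need up-to-date "
    "information, and keep answers concise and factual."
)
```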

Let’s add support to prompt the user to input the research topic and domain.

These inputs guide the assistant in crafting tailored and domain-relevant research.

Next, we define the question generator component, which generates 3 specific yes/no research questions based on the user’s topic.

This step transforms an open-ended topic into actionable, researchable questions. It is a simpler alternative to the IterDag approach from earlier.

Here is what’s happening:

  • We prompt the LLM to generate 3 research questions using the defined prompt.

  • Then we extract the questions from the response (handling potential “thinking” output).

  • Finally, we display the processed questions to the user (for clarity).

  • To debug the agent execution, we also add a print statement.
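The extraction step can be sketched as follows (extract_questions is a hypothetical helper; the regex assumes the model wraps its reasoning in <think> tags, as Qwen 3 does):

```python
import re


def extract_questions(raw: str, n: int = 3) -> list[str]:
    """Pull numbered questions out of raw LLM output."""
    # drop any <think>...</think> reasoning block the model may emit
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # keep lines that look like numbered items ("1. ..." or "1) ...")
    questions = [
        re.sub(r"^\s*\d+[.)]\s*", "", line).strip()
        for line in cleaned.splitlines()
        if re.match(r"^\s*\d+[.)]", line)
    ]
    return questions[:n]
```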

We now loop through each question, run the graph, and collect answers using the tools.

A lot of code, but here is what it does in a nutshell:

  • Creates a new graph instance with its own memory.

  • Sets up a configuration with a unique thread ID.

  • Crafts a prompt instructing the agent to answer the question using search tools.

  • Streams the response and collects the answer.

  • Stores question-answer pairs for later use.

Again, to debug tool calls, we add logs to the console/terminal. This will help us later determine whether the tool was called.

Note: Each question is researched in isolation, using a new graph with its own memory context.
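Put together, the loop can be sketched like this (research_questions and build_graph are hypothetical names, and graph.invoke stands in for the streaming call used in the real code):

```python
def research_questions(build_graph, questions):
    """Run each question through a fresh graph so its memory is isolated,
    and collect the final answer for each one."""
    answers = {}
    for i, question in enumerate(questions, start=1):
        graph = build_graph()  # new graph instance => new memory context
        config = {"configurable": {"thread_id": f"question-{i}"}}
        prompt = ("Answer this yes/no research question, using the "
                  f"search tool if needed: {question}")
        final_state = graph.invoke({"messages": [("user", prompt)]}, config)
        # the last message in the state is the agent's final answer
        answers[question] = final_state["messages"][-1]
    return answers
```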

With all answers ready, we instruct the assistant to compile a professional HTML report and create a Google Doc.

The assistant:

  • Formats all answers as structured HTML, which converts well to Markdown.

  • Uses the GOOGLEDOCS_CREATE_DOCUMENT_MARKDOWN tool to turn that content into a Google Doc (very important).

  • Streams the Google Doc link once it is created (disable this if not required).

  • Saves agent interactions and tool-call data at checkpoints within the State.
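The final instruction can be sketched as follows (build_report_prompt is a hypothetical helper; only the GOOGLEDOCS_CREATE_DOCUMENT_MARKDOWN tool name comes from the post):

```python
def build_report_prompt(answers):
    """Assemble the instruction asking the assistant to compile an HTML
    report and save it via the Google Docs tool."""
    sections = "\n".join(
        f"<h2>{question}</h2>\n<p>{answer}</p>"
        for question, answer in answers.items()
    )
    return (
        "Compile the following research findings into a structured HTML "
        "report, then call the GOOGLEDOCS_CREATE_DOCUMENT_MARKDOWN tool "
        "to save it as a Google Doc and return the document link.\n\n"
        + sections
    )
```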

To wrap up, we can optionally ask follow-up questions with context preserved. If the agent workflow fails, the user can trigger the tool call explicitly.

This ensures a continuous, contextual dialogue after the report is completed.

With this, the code is now complete, so it’s time to test it.

Run The Code

To run the code, open the terminal and type:
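Presumably the Ollama server command (it will bind to the OLLAMA_HOST address set earlier):

```shell
ollama serve
```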

You will see output showing the server running:

If you face any issue, ensure the OLLAMA_HOST env variable is defined! (For details, refer to the workspace setup section)

Now open a new terminal, activate the environment (.venv) and start the agent with
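With the environment active:

```shell
python deep_research_agent.py
```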

In a few seconds, you will be prompted to enter your research topic and domain, and the process will start. Ignore warnings, if any.

Once the research doc is generated, you will get a link to check the file.

If the Google Docs tool execution fails (due to an ambiguous prompt), you will be prompted to enter text to perform the tool call explicitly (a sanity check).

Here is a demo of me using it 👇

As you can see, the link is generated at the end. You can even inspect the logs to see how the Composio Search and Google Docs tools are called as needed!

Final Thoughts

Building your own local research assistant might sound complex, but as demonstrated, it’s not.

Combining LangGraph, Ollama, and Composio gives you a powerful, flexible, and private system you control.

This project is just the beginning. You can keep improving it by:

  • Adding more tools

  • Customising the research flow to your needs

  • Adding MCP support

and much more. Check out the Composio documentation for building better agents.
