TL;DR
Deep research is the hottest AI feature right now, and every LLM provider is adding it to their product. But those models are proprietary, so you have less freedom.
Check out this guide on using Composio's SaaS apps with LangGraph to build deep research agents.
So, I built a local deep research agent. Here's what I used:
LangGraph for orchestrating the agentic workflow
DuckDuckGo search tool for searching content online
Ollama for locally hosting the model
Qwen 3 as the model
How It Works
Local Deep Researcher is inspired by IterDRAG.
We will be using the IterDRAG approach to build our agent. Here is a simple diagram to help you understand the flow in terms of LangGraph.

In this approach, we decompose a query into sub-queries, retrieve documents for each one, answer the first sub-query, and then build on that answer by retrieving documents for the next sub-query.
In LangGraph, nodes (rectangular blocks) represent execution steps, while edges (arrows) represent the flow. I hope the diagram is straightforward.
However, I will explain more details in the respective sections.
The Code
Now that you have understood how our project will work, let’s get building.
For simplicity, I have divided this section into three subparts: Workspace Setup, Main Code, and Running the Program. If you are a seasoned developer, you can skip to the main code; otherwise, kindly follow the workspace setup.
Workspace Setup
Let’s start setting up the workspace to ensure our code runs in an isolated environment.
1. Define Folder Structure (optional)
Head to the terminal and type the following commands one by one.
Once executed, verify that your folder structure looks like this:
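A minimal setup sketch, assuming the project lives in a local-deep-researcher folder and uses the files described next (the folder name is illustrative):

```shell
# Create the project root and the files the article uses later
mkdir -p local-deep-researcher
cd local-deep-researcher
touch agent.py .env pyproject.toml
ls -a
```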
Now that we have the required files in the root, it’s time to fill them out. For simplicity, you can click on each of the filenames and copy the code:
.env – stores all the environment variables and secrets.
pyproject.toml – acts as a project config file to standardise and simplify the configuration of Python projects.
requirements.txt – alternatively, you can create a requirements.txt file to store all the dependencies (in that case, you must create it yourself).
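Here is a sketch of what pyproject.toml might contain (the dependency list is inferred from the libraries used later in this article; adjust names and versions to your setup):

```toml
[project]
name = "local-deep-researcher"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "langgraph",
    "langchain-core",
    "langchain-ollama",
    "langsmith",
    "duckduckgo-search",
    "httpx",
    "markdownify",
    "python-dotenv",
]
```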
2. Virtual Environment Setup
Next, head to the root folder in your terminal and type:
Ensure the output shows:
This activates the virtual environment.
Next do:
This command looks for a pyproject.toml file and installs everything listed under its dependencies section.
The -e flag means editable mode, which links the project directory to the environment (rather than copying files into the site-packages folder), making edited modules instantly available without re-installing. Pretty handy.
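Putting the steps together, the setup commands look roughly like this (POSIX shell; on Windows, use .venv\Scripts\activate instead):

```shell
# Create the isolated environment in the project root
python3 -m venv .venv
# Activate it (your prompt should now show the environment name)
. .venv/bin/activate
# Then install the project and its dependencies in editable mode,
# from the folder containing pyproject.toml:
# pip install -e .
```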
With this, our working environment is set up. Time to write the main code.
Writing Main Code (agent.py)
Assuming you are clear on the logic in "How It Works", let's start by loading all the required libraries.
1. Load All Required Libraries
The project uses various libraries for web interaction (httpx, markdownify), LLM agent orchestration (langchain_ollama, langchain_core, langsmith, langgraph), and structured data handling (json, dataclasses, typing, operator).
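Based on that list, the import block looks roughly like this (grouped by purpose; it assumes the third-party packages are installed):

```python
# Standard library: structured data handling
import json
import operator
from dataclasses import dataclass, field
from typing import Annotated, TypedDict

# Web interaction
import httpx
from markdownify import markdownify

# LLM and agent orchestration
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_ollama import ChatOllama
from langgraph.graph import StateGraph, START, END
from langsmith import traceable

# Search tool
from duckduckgo_search import DDGS
```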
Next, let’s define the local LLM to use.
2. Define Local LLM
I will use the qwen3:8b-q4_k_m model for the demo, i.e., Qwen 3 with 8 billion parameters, quantised to 4 bits for low GPU usage. You are free to use your own.
Paste the following code:
The code imports the ChatOllama class, loads the qwen3 model, creates an instance of it, sets a couple of parameters (temperature and format), and activates JSON mode.
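A sketch matching that description (assuming langchain-ollama is installed and the model has been pulled in Ollama):

```python
from langchain_ollama import ChatOllama

local_llm = "qwen3:8b-q4_k_m"

# Plain instance for free-form text generation (e.g. summaries)
llm = ChatOllama(model=local_llm, temperature=0)
# Second instance with JSON mode activated, for structured outputs
llm_json_mode = ChatOllama(model=local_llm, temperature=0, format="json")
```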
Next, let’s define the states
3. Define States
In LangGraph, states are structures/objects that preserve all the information during the agent's lifetime. Think of them as memories that store every interaction and are updated based on the agent's actions.
Let's define three states:
SummaryState – stores all the interaction data of the agent
SummaryStateInput – stores the user-input data; in this case, research_topic, the only part exposed to the user
SummaryStateOutput – stores the agent's output
Here is how to define it:
A lot is going on here, but let me simplify a bit:
The SummaryState class has fields that track the entire research and summarisation workflow: research_topic and search_query capture the focus of the work; web_research_results and sources_gathered are lists that store accumulated findings and source URLs; research_loop_count tracks how many iterative steps the research has gone through; and final_summary holds the completed summary report.
Using @dataclass(kw_only=True) enforces keyword-only initialisation, improving clarity and reducing errors during object construction.
Then, SummaryStateInput, a TypedDict, defines what the user needs to provide to kick off the research process: currently, just a single field, research_topic.
Finally, SummaryStateOutput, also a TypedDict, holds the result of the process, specifically the final_summary.
Next, let’s add all the required prompts.
4. Define Prompts
Prompts define how the model should act and are often considered the most essential part of any AI application.
As a side note, be specific, explicit and detailed when writing your prompt to generate the best possible output.
For our application, we will define three prompts:
Query Writer Prompt – generates a search query from the given research topic; kind of an entry point.
Summarise Prompt – summarises all the text fetched by the web research node, for the agent to ponder next.
Reflection Prompt – reflects on the summary, finds gaps, and generates a follow-up query for the next web search.
Here are the prompts I have used; feel free to modify them if you like:
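The following are illustrative stand-ins in the same spirit as the prompts the agent needs; rewrite them as you see fit:

```python
# Illustrative prompts; the {research_topic} placeholder is filled in at run time.
query_writer_prompt = """Your goal is to generate a targeted web search query.

Topic: {research_topic}

Return a JSON object with two keys: "query" and "rationale"."""

summarizer_prompt = """Generate a high-quality summary of the provided web search results.
If an existing summary is supplied, extend it with the new information
instead of repeating what is already there."""

reflection_prompt = """You are reflecting on a summary about {research_topic}.
Identify a knowledge gap, then generate a follow-up web search query to fill it.
Return a JSON object with two keys: "knowledge_gap" and "follow_up_query"."""
```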
The query_writer_prompt passes instructions to the model as a JSON dict, and the same goes for the reflection prompt.
All this continues in a loop till a specific condition is met. (spoiler alert 😉)
Next, let’s add the nodes.
5. Add Nodes & Build Graph
In LangGraph, each node executes its task and passes the result to the next node. For our use case, we will define six nodes:
generate_query – generates the search query using query_writer_prompt, calls the model, and stores the result as JSON using the llm_json_mode instance defined earlier.
web_research – performs the web search using any search tool/API/MCP, formats the data in a human-readable format, stores the sources in the sources_gathered attribute of SummaryState, and increments the research_loop_count attribute.
summarize_sources – invokes the LLM to update or create a summary by combining the existing summary (if any) with the latest research.
reflect_on_summary – reflects on the generated summary and creates a follow-up query using the LLM in JSON mode.
route_research – routes the flow back to web research with the follow-up query until the iteration limit is reached or the output is satisfactory; once it is, routes to the finalize_summary node.
finalize_summary – combines the current summary with all the gathered sources and creates a final summary in Markdown format.
Here is the code for the same:
Now, let’s combine all of the nodes and edges to create a graph that resembles the app’s flow.
Notice the use of add_conditional_edges; it adds the loop-back logic in route_research.
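Assuming the node functions and state classes from the previous steps are in scope, the wiring looks roughly like this (a sketch, not the exact original code):

```python
from langgraph.graph import StateGraph, START, END

builder = StateGraph(SummaryState, input=SummaryStateInput, output=SummaryStateOutput)
builder.add_node("generate_query", generate_query)
builder.add_node("web_research", web_research)
builder.add_node("summarize_sources", summarize_sources)
builder.add_node("reflect_on_summary", reflect_on_summary)
builder.add_node("finalize_summary", finalize_summary)

builder.add_edge(START, "generate_query")
builder.add_edge("generate_query", "web_research")
builder.add_edge("web_research", "summarize_sources")
builder.add_edge("summarize_sources", "reflect_on_summary")
# route_research returns the name of the next node, creating the loop-back
builder.add_conditional_edges("reflect_on_summary", route_research)
builder.add_edge("finalize_summary", END)

graph = builder.compile()
```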
6. Invoking the Agent & Post-Processing the Response
Finally, add the following code:
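A minimal sketch of the invocation, assuming the compiled graph from the previous step is named graph (the research topic is hardcoded here):

```python
# Kick off the agent on a hardcoded topic and print the Markdown summary.
result = graph.invoke(SummaryStateInput(research_topic="Benefits of Paneer"))
print(result["final_summary"])
```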
Now let’s test our local Qwen3 deep researcher.
Run The Program
Open your terminal and type:
In another terminal type (within the env):
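Assuming the standard Ollama workflow, the two terminals run something like:

```shell
# terminal 1 – host the model locally (assumes Ollama is installed)
ollama serve

# terminal 2 – inside the activated virtual environment
ollama pull qwen3:8b-q4_k_m   # first run only, to download the model
python agent.py
```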
As we have hardcoded the research topic, please wait for the execution to finish and the final summary to appear!
I got this as a response to a given query (Benefits of Paneer).

It included key benefits and considerations, balancing the best of both worlds. It also included all the relevant sources, like Google's and OpenAI's deep research tools do.
However, as a user, all this feels limited without a UI. So, let's add a UI component to the project (optional).
Extras – Add UI (optional)
To keep things simple, I will use LangSmith for the job (it’s not a frontend, but it works). LangSmith allows me to test the agent in a pleasant UI environment with built-in evaluation and testing. One library to handle them all.
Adding a UI component requires code to be modular and adds a couple more configuration files.
I have added everything to the Project GitHub Repo for simplicity. However, you are free to build your own UI.
Here is an overview of the changes:
All states, prompts and graph code are moved to state.py, prompts.py, and graph.py, respectively, in the ollama_deep_researcher folder.
New files such as __init__.py, configuration.py (holds the LangSmith UI configuration) and utils.py (helper utility functions) are added for configuration and utilities; if you are using the agent standalone, you don't need them.
A Readme.md file is added for reproducibility.
Make sure you follow the instructions and get your repo set up.
Now let's set up LangGraph.
Open your terminal and run:
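A plausible sequence of commands matching those steps (the CLI package name is an assumption on my part):

```shell
ollama serve &                          # host the model locally
pip install -e .                        # project dependencies, editable mode
pip install -U "langgraph-cli[inmem]"   # LangGraph CLI with the in-memory server
langgraph dev                           # start the local LangGraph server and UI
```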
We ran Ollama to serve the model on a specific port, installed all dependencies in development mode, installed the LangGraph CLI, and started the server.
If all is done successfully, you will see the following output:

Go to the green URL, and you can interact with the agent by adding your research topic in the field. You can tweak the settings to see how they affect behaviour.
Here is a video of me using the researcher 👇
I hope you also got a similar output to mine, or even better. To get a more detailed output, increase the Research Depth parameter. To modify options and add extra ones, edit the configuration.py file.
We have concluded this comprehensive article, but here are my final thoughts on the project.
Final Thoughts
Building this project will give you a solid foundation in professionally building agents with LangChain & LangGraph.
To keep things simple, I have kept this version to the core features. But feel free to make it your own by adding a frontend, plugging in RAG, exploring multimodal inputs, or connecting with MCP tools using Composio Toolset / MCP (coming up next 😉) & others.