Building OpenClaw from scratch without the security issues
OpenClaw launched with great fanfare, and I was curious whether you could truly "vibe code" the entire project on your own, especially since the original creator built it with Codex. We're in the era of "build it yourself instead of setting it up" and I wanted to take that philosophy a step further by recreating it from scratch.
This is the story of how I rebuilt OpenClaw using modern coding agent SDKs, tackled integration challenges across multiple messaging platforms, and deployed it securely in production, all while avoiding the security pitfalls of the original.
Research & Planning
The first thing I did was use GPT Pro mode to research the entire codebase and explain all the features and tools used. The Pro model excels at these broad tasks that require processing large amounts of information in a single shot. It gave me a detailed product spec on how OpenClaw works and what it uses for each functionality.
I decided to use coding agent SDKs because they represent the first real use cases people have had with LLMs beyond writing. Claude provides the Claude Agent SDK, and OpenCode provides a similar SDK. These SDKs natively provide access to tools like read, write, bash, edit, and support for skills and MCP (Model Context Protocol).
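To make that concrete, here's a minimal sketch of a single agent turn using the Claude Agent SDK. The option names and message shapes are from my reading of the SDK and may differ slightly between versions; the prompt and workspace path are purely illustrative.

```typescript
// Minimal sketch of invoking the Claude Agent SDK (assumes @anthropic-ai/claude-agent-sdk).
import { query } from "@anthropic-ai/claude-agent-sdk";

async function runTurn(prompt: string) {
  // query() streams messages while the agent reads files, runs bash, edits, and so on.
  for await (const message of query({
    prompt,
    options: {
      allowedTools: ["Read", "Write", "Edit", "Bash"], // core built-in tools
      cwd: "/home/claw/workspace",                      // illustrative workspace path
    },
  })) {
    if (message.type === "result" && message.subtype === "success") {
      console.log(message.result); // final answer once the agent finishes
    }
  }
}

runTurn("Summarize the unread messages in the inbox folder.").catch(console.error);
```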
Architecture Overview
I wanted to set up two modes:
Terminal mode: For direct interaction and development
Gateway mode: For 24/7 operation, listening to WhatsApp, Telegram, Signal, iMessage, and other messaging apps
The gateway architecture is what makes OpenClaw powerful: it runs continuously in the background, monitoring multiple communication channels and responding autonomously.
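In practice the split is just a mode switch at the entry point. A trivial sketch, where startTerminal and startGateway are hypothetical stand-ins for the two run loops:

```typescript
// Hypothetical entry point: choose terminal or gateway mode from a CLI argument.
async function startTerminal(): Promise<void> {
  // Interactive development loop (stubbed here).
  console.log("terminal mode");
}

async function startGateway(): Promise<void> {
  // Long-running process that starts the WhatsApp/Telegram/iMessage listeners (stubbed here).
  console.log("gateway mode: listening for messages");
}

const mode = process.argv[2] ?? "terminal";
(mode === "gateway" ? startGateway() : startTerminal()).catch(console.error);
```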
Messaging Platform Integrations
WhatsApp
WhatsApp integration uses a library called Baileys to establish a WhatsApp Web connection. Here's how it works, with a short sketch after the steps:
Baileys connects to WhatsApp Web's WebSocket
When a message arrives, WhatsApp's server pushes it via WebSocket
Baileys emits a messages.upsert event with type 'notify'
The agent can then process and respond to the message
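Roughly, the listener side looks like this. This is a sketch assuming @whiskeysockets/baileys; handleWithAgent is a hypothetical hand-off to the agent, not part of the library.

```typescript
// Sketch of the WhatsApp listener using Baileys (assumes @whiskeysockets/baileys).
import makeWASocket, { useMultiFileAuthState } from "@whiskeysockets/baileys";

async function startWhatsApp() {
  // Persist the WhatsApp Web session so the QR code isn't re-scanned on every restart.
  const { state, saveCreds } = await useMultiFileAuthState("./auth");
  const sock = makeWASocket({ auth: state });
  sock.ev.on("creds.update", saveCreds);

  // Baileys emits messages.upsert when WhatsApp pushes new messages over the WebSocket.
  sock.ev.on("messages.upsert", async ({ messages, type }) => {
    if (type !== "notify") return; // only react to freshly delivered messages
    for (const msg of messages) {
      const text = msg.message?.conversation ?? msg.message?.extendedTextMessage?.text;
      if (!text || msg.key.fromMe) continue;
      await handleWithAgent(msg.key.remoteJid!, text); // hypothetical hand-off to the agent
    }
  });
}

// Placeholder for the agent hand-off described above.
async function handleWithAgent(chatId: string, text: string) {
  console.log(`message from ${chatId}: ${text}`);
}

startWhatsApp().catch(console.error);
```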
One challenge I encountered was creating the allowlist for WhatsApp numbers. WhatsApp doesn't use phone numbers directly in the WebSocket connection; it uses link IDs. Messages arrive with these IDs, and I needed bidirectional conversion between phone numbers and link IDs. Claude Code initially struggled with building the right mapping, but after some iteration, we got it working correctly.
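Once the conversion exists, the allowlist check itself is just a two-way lookup. A sketch; the ID formats shown are illustrative rather than exact:

```typescript
// Illustrative bidirectional map between phone-number IDs and the link IDs
// that actually arrive over the WebSocket.
class WhatsAppAllowlist {
  private phoneToLid = new Map<string, string>();
  private lidToPhone = new Map<string, string>();

  // Record both directions whenever a pairing is learned (e.g. from contact sync).
  learn(phoneId: string, linkId: string) {
    this.phoneToLid.set(phoneId, linkId);
    this.lidToPhone.set(linkId, phoneId);
  }

  // Incoming messages carry the link ID; allow them only if it maps back to an allowlisted number.
  isAllowed(senderId: string, allowedPhones: Set<string>): boolean {
    const phone = this.lidToPhone.get(senderId) ?? senderId;
    return allowedPhones.has(phone);
  }
}

// Usage: allow a single number, expressed as a phone-number ID.
const allowlist = new WhatsAppAllowlist();
allowlist.learn("15551234567@s.whatsapp.net", "123456789@lid");
console.log(allowlist.isAllowed("123456789@lid", new Set(["15551234567@s.whatsapp.net"]))); // true
```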
Telegram
Telegram was much more straightforward thanks to its Bot API. The implementation uses long polling:
Periodically calls Telegram's getUpdates API
Waits up to 30 seconds for new messages
When a message arrives, the call returns immediately and getUpdates is called again
Emits a message event for each new message
The Bot API is well-documented and significantly easier to set up than WhatsApp.
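A bare-bones version of that polling loop, calling the Bot API directly (assumes BOT_TOKEN is set in the environment; a real gateway would add error handling and backoff):

```typescript
// Minimal long-polling loop against the Telegram Bot API (sketch; no retries or backoff).
const TOKEN = process.env.BOT_TOKEN!;
const API = `https://api.telegram.org/bot${TOKEN}`;

async function pollTelegram() {
  let offset = 0;
  while (true) {
    // getUpdates holds the connection open for up to 30 seconds waiting for new messages.
    const res = await fetch(`${API}/getUpdates?timeout=30&offset=${offset}`);
    const data = (await res.json()) as { result: any[] };

    for (const update of data.result) {
      offset = update.update_id + 1; // acknowledge so the same update isn't redelivered
      const text = update.message?.text;
      if (text) {
        console.log(`telegram message: ${text}`); // hand off to the agent here
      }
    }
  }
}

pollTelegram().catch(console.error);
```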
iMessage
iMessage integration was a fascinating unlock. It uses a library called imsg, built by OpenClaw's original creator, Peter Steinberger, himself. The approach:
Reads the SQLite database where all iMessages are stored
Monitors the database using FSEvents, a kernel-level file system monitoring API on macOS
Detects new messages in real-time as they're written to the database
This gives the agent access to iMessage without requiring any official API.
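For a sense of where the data lives, here's a sketch that queries the Messages database directly with better-sqlite3. The actual integration goes through imsg and FSEvents rather than a hand-rolled query; this only illustrates the underlying storage.

```typescript
// Sketch: read recent iMessages straight from the macOS Messages SQLite database.
import Database from "better-sqlite3";
import os from "node:os";
import path from "node:path";

const dbPath = path.join(os.homedir(), "Library/Messages/chat.db");
const db = new Database(dbPath, { readonly: true });

// message.text holds the body; handle.id is the sender's phone number or email address.
const rows = db.prepare(`
  SELECT handle.id AS sender, message.text AS body
  FROM message
  JOIN handle ON message.handle_id = handle.ROWID
  WHERE message.text IS NOT NULL
  ORDER BY message.date DESC
  LIMIT 10
`).all() as { sender: string; body: string }[];

for (const row of rows) {
  console.log(`${row.sender}: ${row.body}`);
}
```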
Tools & Integrations
As they say, an agent is nothing without the tools it uses. I equipped the agent with:
Core Tools:
Read, Write, Edit (file operations)
Bash (command execution)
Glob, Grep (file searching)
TodoWrite (task management)
Skill (access to predefined workflows)
AskUserQuestion (user interaction)
Custom Tools:
Cron tools for scheduled tasks
Gateway tools for WhatsApp and Telegram communication
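As an example of the cron tools above, a scheduled task can be as small as a node-cron job that feeds a prompt back into the agent. This is a sketch; runAgentTurn is a stand-in for the SDK call shown earlier, and the schedule is arbitrary.

```typescript
// Sketch of a scheduled task using node-cron; the agent call is a placeholder.
import cron from "node-cron";

async function runAgentTurn(prompt: string) {
  console.log(`agent prompt: ${prompt}`); // would call the Agent SDK here
}

// Every morning at 8:00, ask the agent to summarize overnight messages.
cron.schedule("0 8 * * *", () => {
  runAgentTurn("Summarize all messages received since yesterday evening.")
    .catch(console.error);
});
```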
Third-Party Integrations: For secure integration with services like Slack, GitHub, Teams, and more, I used Composio. Composio lets you securely connect and use these tools in a sandbox environment while handling all the credentials and authentication.
Deployment Challenges
The Docker Setup
I created a Docker setup designed to run in the background on a DigitalOcean droplet. The goal was to make it quickly deployable with minimal setup hassles. However, I ran into several issues:
Problem 1: OOM (Out of Memory) Errors
Running on a $6/month instance with 2GB RAM, the container kept crashing. The issue? It tried installing Claude Code and OpenAI's SDK at the same time, exhausting available memory. Once I identified this, I staggered the installations and the problem was resolved.
Problem 2: Permission Mode Conflicts
The gateway uses permissionMode: 'bypassPermissions' so the agent can run autonomously without human approval for each tool call. However, Claude Code refuses to enable this when running as root, a built-in security feature.
The Solution:
I had to restructure the entire Dockerfile to use a non-root user:
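(The version below is an illustrative sketch rather than my exact file: the claw user and home-directory paths match the conventions described next, the package names assume npm-based installs, and the two CLI installs sit in separate layers because of the memory issue above.)

```dockerfile
FROM node:20-slim

# Run as an unprivileged user so Claude Code will accept bypassPermissions at runtime.
RUN useradd --create-home --shell /bin/bash claw
USER claw
WORKDIR /home/claw/app

# Keep global npm installs inside the user's home directory.
ENV NPM_CONFIG_PREFIX=/home/claw/.npm-global
ENV PATH=$PATH:/home/claw/.npm-global/bin

# Install each CLI in its own layer to avoid the memory spike from concurrent installs.
RUN npm install -g @anthropic-ai/claude-code
RUN npm install -g opencode-ai

COPY --chown=claw:claw . .
CMD ["node", "dist/gateway.js"]
```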
This cascaded into fixing:
All file paths (/root/ → /home/claw/)
Docker Compose volume mounts
CLI installation directories
Workspace permissions
The refactoring took several hours but resulted in a much more secure deployment that adheres to best practices.
Key Takeaways
Modern coding agents are incredibly capable - With proper tooling and context, they can rebuild complex systems from scratch
Security by design matters - The forced non-root user setup, while initially frustrating, led to a more secure architecture
Integration complexity varies wildly - Telegram took 30 minutes, WhatsApp took hours, iMessage required creative solutions
Resource constraints force better architecture - The 2GB RAM limitation pushed me to optimize installation and runtime behavior
Documentation is everything - Services with good APIs (like Telegram) are significantly easier to integrate than those requiring reverse engineering
What's Next
The rebuilt OpenClaw is now running in production, handling messages across multiple platforms without the security issues that plagued the original. Future improvements include:
Adding more messaging platforms (Discord, Slack DMs)
Implementing better error handling and retry logic
Creating a web dashboard for monitoring and configuration
Optimizing memory usage to run on even smaller instances
Building this from scratch was an excellent exercise in understanding how modern AI agents work in production. The combination of LLM capabilities, proper tooling, and careful architecture makes it possible to create powerful autonomous systems that were previously extremely difficult to build.