Security Guide

OpenClaw: Security Framework Before Installing

A framework to understand the trust model and security context of OpenClaw before deciding if it's right for you. Complete 7-step guide.

Created by Federico Benitez from Muno Labs·Feb 2026

TL;DR

OpenClaw is a capability delegation system for AI agents. Security isn't just about who can talk to the bot, but what it can do when it loses context. Start with minimum access, don't give destructive tools by default, and consider isolation on a separate machine.

This is not an installation manual. It's a framework to help you understand the trust model and the security context before deciding if OpenClaw is right for you.

01Understand the system (it goes far beyond a chat)

The first step to using this tool safely is understanding what it actually is. OpenClaw works as a capability delegation system. Think of it as giving your house keys to someone who follows instructions to the letter, but sometimes forgets what they were.

You're delegating capabilities within boundaries you define yourself (accounts, machine, authentication, approvals). Real tools from your workflow, like your file system, terminal, browser, or cron jobs, come under the control of an agent.

The Interface: Messaging channels (Telegram, WhatsApp) are your primary way to communicate with the agent.
The Goal: The tool assumes you want the agent to do real work: execute commands, modify files, access network services.
The Challenge: Setting up guardrails (safety barriers) without destroying the agent's usefulness.

The official guide is straightforward: there is no "perfectly secure" configuration. Be deliberate about who can talk to the bot, where it can act, and what it can touch. Start with the minimum access that works and expand only when you have evidence it's necessary. Every additional capability expands the "blast radius".

02Understand the Gateway

To secure OpenClaw, you first need to know its "administrative brain": The Gateway. This component coordinates all conversations, keeps connections alive with your channels (Slack, WhatsApp, Discord, etc.), applies your configurations, and manages background tasks (cron jobs). It also has a web UI where you can see what the agent is doing.

By design, the Gateway only accepts connections from your own machine (localhost), which protects you from direct attacks over the internet. But the documentation is clear: OpenClaw assumes a single trusted operator. It is not multi-tenant.

Watch out for shared access:

If multiple people can message your bot, they share exactly the same authority over the tools as you do. For example:

If you add the bot to a Slack channel with 50 people, anyone could ask it to read the "confidential" folder on your desktop and send it to them via DM.
If you leave your WhatsApp Web open and someone messages the bot, they could delete files from your computer or access your email, just by asking.

03Skills vs. Plugins (They're not the same)

When giving your agent new capabilities, you'll encounter two concepts that are often confused, but have very different risk profiles.

Skills (Workflow Instructions)

Similar to what you'd see in code agents (Claude Code, Cursor). They're folders with a SKILL.md file that "teach" the agent how to perform certain tasks.

A skill doesn't add privileges by itself. It can only push the agent to use the tools you've already enabled. If a skill is malicious, it depends entirely on you having risky tools enabled to do any damage.

Plugins (Code Modules)

These are TypeScript modules that are loaded at runtime and run in the same process as the Gateway.

By running alongside the Gateway, they inherit all of its permissions. A malicious plugin can bypass your security model and operate directly on your network or files without invoking the agent. If you install a plugin from npm, always treat it as if you're running untrusted code.

What to do?

Disable plugins by default: Always require an allow list (plugins.allow) if you truly need them.
Audit your Skills: Always review the SKILL.md file. Red flags: external URLs or webhooks, binary installation instructions, use of command-dispatch (this tool bypasses the model), configurations that inject secrets or API Keys.

04The Tool System

The tools layer is what turns the agent from a simple conversationalist into something that can interact with the real world. This is where the project has put significant effort into mitigating risks:

Content wrapping: When using tools like web search or emails, OpenClaw wraps untrusted content with markers and security warnings. It detects "prompt injection" patterns but doesn't block them; it only alerts the underlying AI model.
SSRF Defenses: Normalizes hostnames, blocks access to your internal network (localhost), and verifies DNS responses to prevent network-level deception.
Safe Web Fetch: Limits the size of web responses and prefers extracting only readable text.

The platform doesn't pretend these features "solve" security problems. They're guardrails; the ultimate responsibility remains yours as the operator.

05Operational Control

Traditional security focuses on "who can do what" assuming a malicious attacker. With AI agents, a problem appears that affects even legitimate users: instructions can be forgotten, but capabilities (tools) are always there.

"I told the agent: 'Review this inbox and suggest what to archive or delete, don't execute anything until I say so.' It worked fine on my test inbox. But my real inbox was huge and triggered context compaction. During compaction, it lost my original instruction... and deleted thousands of emails."

Why is this NOT a bug?

The agent wasn't malicious and it wasn't hacked. The problem is structural: LLMs don't have guaranteed persistent memory. Context windows have limits and instructions that seemed clear simply evaporate.

Watch out for deletion permissions: If the agent has the capability (tool) to do something destructive, it will eventually do it (by mistake, context loss, or misinterpretation). OpenClaw ships with many of these capabilities enabled by default.

Approach	Robustness Level	Result
Instruction: "Don't do X"	Fragile	Gets lost during context compaction.
Require manual approval	Medium	Causes fatigue and you end up approving without reading.
Don't give the tool	Robust	Without a delete tool, it's impossible to delete.
Separate agents	Robust	Agent 1 only reads, Agent 2 executes with approval.

OpenClaw gives many capabilities by default and is designed for long-running tasks where context compaction is common:

exec can execute any shell command
write can overwrite any file in scope
web_fetch can send data to any URL
Gmail/Calendar tools can delete, modify, and send
Cron jobs run unsupervised for extended periods

Secure configuration goes beyond "who can talk to the bot". Ask yourself what the bot can do when it inevitably loses context or misinterprets. Don't give tools "just in case". If your use case is "marketing campaign assistant", the agent doesn't need permission to delete.

06Should I run it on a separate machine?

You've probably seen people buying "Mac Minis" just to run OpenClaw, or using virtual servers. This mitigates some risks.

What it DOES mitigate

If the agent deletes files or executes destructive commands, only that machine is affected
Malware installed via plugins stays contained there
Your personal documents, photos, and everyday credentials are not exposed

What it DOESN'T mitigate

If you give it access to your email/calendar, it can read, delete, and send from your real account
If you store service credentials (Hubspot, your bank), those credentials are still exposed
Malicious skills can exfiltrate data to external servers
Anyone who can message the bot still has access to everything the bot is connected to

Bottom line: A separate machine reduces the "local blast radius" (your files, your system) and is a good way to start. But it doesn't protect the external services you connect or the credentials you store there.

07Questions to ask yourself before installing

Before running the install command, ask yourself these questions:

What tools does my use case actually need? Can I do it without delete/write permissions?

What's the worst that could happen if the agent misuses each tool?

Who else can message the bot and do I trust them?

Will I leave open sessions where others could gain access?

Will I assign long-running tasks where the agent could lose context?

Do I have backups of everything the agent can touch?

If something goes wrong, do I have a way to audit what happened?

Do I want to explore first in a virtual machine with progressive access?

References

OpenClaw RepositoryOfficial source code Trust PortalSecurity and trust portal Threat Model ATLASThreat model based on MITRE ATLAS MITRE ATLASThreat framework for AI KiloClawInfrastructure, security, and updates NanoClawAlternative with container isolation KlausOpenClaw hosted on a VM with security and privacy CelerioCurated tool layer

Key Takeaways

1OpenClaw works as a capability delegation system — it's not just a chatbot
2The Gateway is single-tenant: if multiple users can message the bot, they share your full authority
3Skills don't add privileges but Plugins inherit all Gateway permissions
4Instructions are lost during context compaction but capabilities (tools) are always available
5Removing destructive tools is more robust than asking the agent not to use them
6A separate machine reduces local blast radius but doesn't protect connected external services
7Ask yourself the 8 security questions before installing

FAQ

Is it safe to install OpenClaw on my personal computer?

It depends on your configuration. The Gateway only accepts connections from localhost, but if you give it access to destructive tools (delete files, execute commands), the agent could use them by mistake during context loss. The recommendation is to start on a separate machine or VM.

What's the difference between Skills and Plugins in OpenClaw?

Skills are SKILL.md files that guide the agent but don't add privileges by themselves. Plugins are TypeScript modules that run in the Gateway process and inherit all its permissions. A malicious plugin can bypass your security model.

What happens if the agent loses context during a long task?

The agent keeps tools available but may forget restrictive instructions like 'don't delete anything'. This isn't a bug but a structural limitation of LLMs. The solution is to not give destructive tools in the first place.