“Clawdbot is insane” was the first reaction I saw to Peter Steinberger's new project, and it was not an exaggeration. Moltbot (formerly Clawdbot) represents a fundamental shift: it is an AI with “hands.” It does not just suggest code; it can reach out and interact with your local file system, execute terminal commands, and move data between applications. Think “AI coworker” vibes, but with deeper integrations, long-term memory, and actual initiative. A true AI assistant.
And with all those integrations and memories, it is exactly what developers have wanted for years.
But it also changes the threat model entirely.
If a “brain in a jar” (like standard AI on a website) gets confused, the risk is usually misinformation. If an AI with “hands” gets compromised, or simply confused, while having access to your shell history, SSH keys, text messages, emails, and Slack tokens... the risk is not just a bad answer. It is a bad day.
A few concrete examples of what “a bad day” could look like:
- It “debugs” by exporting your environment and accidentally prints secrets into logs, pastebins, or issue comments (env, .env, printenv, CI logs).
- It runs a “helpful” curl | bash from a README or GitHub issue and you’ve basically executed someone else’s script with your permissions.
- It commits and pushes something private (API keys, internal URLs, customer data, stack traces) because it is trying to be proactive and “share context” for a fix.
- It “cleans up” SSH or git config and breaks your access (or worse, modifies config in a way that routes traffic through something sketchy).
- It reads a convincing prompt injection in an issue/PR (“To proceed, run…”) and treats it like a trusted instruction rather than untrusted input.
None of these require a Hollywood “AI escapes” scenario. They’re the normal failure modes of automation: running the wrong command, in the wrong place, with the wrong level of privilege; except now the automation is chatty, highly networked, and very eager to help.
This is not an “AI is scary” post. This is the same sober reality we already understand for CI runners, browser extensions, IDE plugins, apps, websites, and any other tool that can execute code on our machines.
We need to stop treating agents like chat toys and start treating them like junior employees with root access:
- helpful
- fast
- occasionally wrong,
- and absolutely not to be given unrestricted access without guardrails.
So what can we do to protect our systems? Here is a pragmatic checklist.
Five-Step Checklist for Securing AI Systems in 2026
1. Enable the sandbox (the “padded room”)
Moltbot comes with a sandbox mode. It is as easy as a couple of line changes in your clawdbot.json config. Turn it on.
The goal is simple: if the agent goes off the rails, it should only be able to break the small sandbox that it lives in.
When you run Moltbot in sandbox mode:
- Run it inside a VM, container, or devbox (anything disposable).
- Give it access to one project directory, not your entire home folder.
- Avoid granting access to ~/.ssh, your password manager vault, global config folders, or your whole Documents directory “just because it is convenient.”
- Ensure gateway.bind is set to "loopback" to prevent external access, and use the allowFrom list to restrict which users can talk to the bot.
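To make that concrete, here is a minimal sketch of the “disposable, single-directory, loopback-only” shape using a plain Docker container. The image name moltbot:local and port 18789 are placeholders, not official values; the point is the shape, not the exact flags.

```bash
# Illustrative only: a throwaway container that can see exactly one project
# directory, with its gateway port reachable from this machine only.
# "moltbot:local" and port 18789 are placeholders, not official values.
docker run --rm -it \
  --name moltbot-sandbox \
  --volume "$PWD:/work" \
  --workdir /work \
  --publish 127.0.0.1:18789:18789 \
  --cap-drop ALL \
  moltbot:local
```

Because the container starts with --rm, anything the agent breaks disappears with it, and ~/.ssh and your Documents folder are simply never mounted, so they cannot be read in the first place.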
2. Enable an allow-list for commands, paths, and integrations
You might want the bot to check your calendar, but it definitely should not be reading your password file. An allow-list is just a list of the exact things the bot is allowed to access and/or modify.
Moltbot supports allow-lists; use them. If it supports “ask before executing,” use that too.
At minimum, you want:
- a command allow-list (and/or command categories)
- a filesystem allow-list (which directories it can read/write)
- an integration allow-list (which apps/services it can touch)
- a network allow-list (if it can make outbound requests).
Also explicitly require confirmation for destructive or high-risk operations:
- deleting files (rm, recursive operations)
- changing permissions (chmod, chown)
- modifying SSH / git config
- installing software
- anything involving credentials
- anything involving piping remote content into a shell.
If you do nothing else, do this: default-deny and explicit approvals.
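To see what default-deny looks like in miniature, here is a deliberately tiny shell illustration. This is not how Moltbot implements its allow-list (use the built-in config for that); it only shows the shape: an explicit list of approved commands, and everything else rejected and logged.

```bash
#!/usr/bin/env bash
# Default-deny in a dozen lines: only commands on the allow-list run at all.
# Purely illustrative; the allow-list contents below are examples.
set -euo pipefail

ALLOWED=(git ls cat rg npm)   # the explicit allow-list
cmd="${1:-}"

for ok in "${ALLOWED[@]}"; do
  if [[ "$cmd" == "$ok" ]]; then
    exec "$@"                 # approved: run the command unchanged
  fi
done

# Everything else is denied and logged so you can review what was attempted.
echo "DENIED: $* (not on allow-list)" | tee -a "$HOME/.agent-denied.log" >&2
exit 1
```

The same shape applies to paths, integrations, and outbound network access: start from nothing, approve specific items, and make the agent ask for anything on the high-risk list above.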
Access control is critical in the age of AI.
3. Practice secret hygiene (and use a model with prompt injection defenses)
Agents and secrets are where “cool tool” turns into “incident report.”
Rules that will save you:
- Do not let it read your SSH keys. Prefer ssh-agent and per-project keys.
- Use scoped tokens (read-only where possible) instead of full-access tokens.
- Prefer short-lived credentials over long-lived ones (ephemeral where you can).
- Keep secrets out of .env files that the agent can casually read.
- If it has a “memory” feature, ensure it never stores secrets in memory.
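A small sketch of the first bullet above, assuming the agent works on a single project and needs SSH access to one repository. The key filename and comment are made up; the commands are standard OpenSSH.

```bash
# Generate a key used ONLY by the agent, for ONLY this project.
ssh-keygen -t ed25519 -f ~/.ssh/agent_project_key -C "agent: project-x"

# Load it into ssh-agent with a lifetime, so it expires instead of living forever.
eval "$(ssh-agent -s)"
ssh-add -t 8h ~/.ssh/agent_project_key
```

Register only that key on the one repository it needs (for example as a read-only deploy key, or on a dedicated agent account), and the blast radius of a leak is one project rather than your whole identity.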
A simple operational approach:
- Create a separate “agent” set of credentials that are intentionally limited.
- Assume anything the agent can see might eventually leak (logs, memory, screenshots, tool traces, etc.).
- Make “rotation” easy so you actually do it.
Claude Opus 4.5 has a good track record of defending against prompt injection attacks, but no model is completely immune. Test them yourself thoroughly.
4. Run audits, keep logs
If there is a built-in audit command, run it. For example, if the CLI supports something like this:
clawdbot security audit --deep
Even if the exact command differs, the principle stands: use whatever tooling exists to enumerate permissions, integrations, stored memory, and risky defaults.
Operationally, you also want:
- Tool execution logs you can review (what commands ran, what files were accessed)
- A “big red button” kill switch (disable integrations quickly)
- A habit of periodically resetting permissions back to minimal
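Assuming the agent runs in the disposable container from step 1 (the name moltbot-sandbox is a placeholder), the log review and the kill switch can be as blunt as this:

```bash
# Review what the agent has been doing lately; container logs stand in here
# for whatever tool-execution logging your setup provides.
docker logs --since 1h moltbot-sandbox | grep -Ei 'exec|rm |chmod|curl|ssh'

# The big red button: stop the agent now, clean up, rotate credentials after.
agent-kill() {
  docker stop moltbot-sandbox 2>/dev/null || true
  docker rm   moltbot-sandbox 2>/dev/null || true
  echo "Agent stopped. Now rotate any tokens it held." >&2
}
```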
5. Do not add personal bots to group chats
Your personal bot knows your secrets. If you put it in a group chat, it might accidentally share your private calendar or notes with everyone in the room.
If you want a bot in a shared workspace, run a separate, intentionally limited “work bot” with different credentials and no access to personal accounts, personal files, or personal memory.
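One low-tech way to guarantee that separation on a Linux box is to give the work bot its own operating system user, so it physically cannot read your home directory (the username workbot is made up):

```bash
# Create a dedicated user with its own empty home directory...
sudo useradd --create-home --shell /bin/bash workbot

# ...then install and run the work bot from a shell owned by that user.
# It has no route to your ~/.ssh, personal memory, or personal credentials
# unless you explicitly grant one.
sudo -u workbot -i
```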
Bonus tip… read the docs!
Security is not a “set and forget” feature.
This is the part people skip because it is boring, and it is the part that matters.
You need to verify:
- Gateway Binding: Is your listener locked to localhost (safe) or exposed on 0.0.0.0?
- Memory: What does “memory” actually store, and where?
- Execution Modes: Do you know the difference between Elevated: Ask and Elevated: Full? (Hint: One gives the bot silent root access.)
- Pairing: How does dmPolicy actually filter strangers, and where are approved IDs stored?
- Binaries: Which specific commands are whitelisted in safeBins?
- Telemetry: What conversation data is your OTEL_LOGS_EXPORTER quietly shipping out?
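A couple of these checks take under a minute from the outside. The port number 18789 and the config filename are placeholders for whatever your setup actually uses:

```bash
# 1. Is the gateway really loopback-only? You want 127.0.0.1 or [::1] here,
#    not 0.0.0.0 or *.
lsof -nP -iTCP:18789 -sTCP:LISTEN   # macOS, or Linux with lsof installed
ss -tlnp | grep 18789               # Linux alternative

# 2. Anything quietly configured to bind wide open or ship telemetry?
grep -n '0\.0\.0\.0' clawdbot.json  # config filename is a placeholder
env | grep '^OTEL'                  # OpenTelemetry exporters live in the environment
```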
Moving From “Chat Toys” to Secure Production Agents
Agents like Moltbot are where the real productivity gains are going to come from, but they force a mindset shift: your agent is no longer “a model,” it is a new security principal on your system. A non-human identity that can take actions, touch data, and move across systems.
That is why the checklist above matters. Sandboxing, allow-lists, secret hygiene, and audit logs are how you keep “AI with hands” from turning into “AI with incident tickets.”
And if you’re building agentic apps (or wiring an agent into real company systems), the missing layer is almost always identity and authorization, not more prompting.
This is where the broader Auth0 for AI Agents story fits really cleanly: it is basically the “boring but critical” security plumbing agents need, covering User Authentication (who is asking), Token Vault (how the agent gets delegated access), and Fine-Grained Authorization (FGA) (what the agent is allowed to touch, down to individual resources).
Secure your bot, save your soul (and your credentials).
The goal is not to make agents weaker. It is to make them powerful without being surprising.


