OpenClaw AI Agents Targeted by Malicious Skills and Early Prompt Worms

CyberSecureFox 🦊

The open‑source local AI assistant ecosystem OpenClaw (formerly Moltbot and ClawdBot) has rapidly evolved from a hobby project into a large‑scale platform — and at the same time into a significant attack surface. Independent security researchers have identified hundreds of malicious skills (plugins) and early signs of self‑propagating prompt worms, making OpenClaw a revealing case study in AI‑agent cybersecurity risks.

Rapid growth of OpenClaw and security gaps in the ClawHub skill repository

OpenClaw is a local AI assistant that integrates with popular messengers such as WhatsApp, Telegram, Slack and Discord, and can operate autonomously on schedules while interacting with other agents. Since its launch in November 2025, the project has amassed more than 150,000 GitHub stars and around 770,000 registered agents, used by roughly 17,000 users.

Functionality is extended via skills — plugins sourced primarily from the official ClawHub catalog. The core security issue is that this repository is open by default: any GitHub account older than one week can publish a skill without prior moderation or comprehensive security review. In practice, this makes ClawHub resemble an uncurated app store for code that can directly touch user data and systems.

According to Koi Security, between 27 January and 1 February alone, more than 230 malicious skills were uploaded to ClawHub and GitHub. A later review of all 2,857 then‑available skills revealed 341 malicious plugins tied to a single campaign dubbed ClawHavoc. Some of these skills were downloaded thousands of times, and one plugin, What Would Elon Do, reached the top of the rankings through artificial manipulation of popularity metrics.

ClawHavoc campaign: infostealers, reverse shells and credential theft

AuthTool abuse and ClickFix‑style infection chains

The ClawHavoc operation closely mirrors known ClickFix‑style plugin attacks. Each malicious skill shipped with polished documentation that repeatedly referenced an auxiliary tool called AuthTool, presented as a mandatory dependency. In reality, AuthTool acted as a delivery mechanism for malware.

On macOS, skills contained a Base64‑encoded shell command that downloaded and executed a payload from an external server. On Windows, users received a password‑protected archive as the malicious carrier. All identified skills connected back to a unified command‑and‑control (C2) infrastructure associated with IP address 91.92.242[.]30, indicating a coordinated, rather than opportunistic, campaign.

Targeting crypto assets and developer environments with Atomic Stealer

The macOS payload was a variant of Atomic Stealer (AMOS), a well‑known commercial infostealer. This variant bypassed Apple’s Gatekeeper protection via the xattr -c command and requested broad file‑system access. The stealer focused on extracting:

API keys for cryptocurrency exchanges and wallets, seed phrases, Keychain data, browser passwords, SSH keys, cloud service credentials, Git accounts and configuration files such as .env.

Beyond direct delivery of infostealers, researchers identified skills embedding reverse shells (for example, better-polymarket and polymarket-all-in-one), giving attackers interactive remote access to compromised machines. Other plugins exfiltrated bot credentials from ~/.clawdbot/.env to external services such as webhook[.]site — as seen in the rankaj skill — effectively leaking the entire AI‑agent runtime configuration.

Prompt worms in Moltbook: from prompt injection to self‑replicating instructions

Alongside OpenClaw, its creator launched Moltbook, a social network for AI agents where agents automatically publish posts, comment and interact with each other. Researchers from Simula Research Laboratory analyzed a sample of Moltbook content and found 506 posts (about 2.6% of the dataset) containing hidden prompt injections.

These patterns are considered early examples of prompt worms: self‑replicating instructions embedded in content that is processed by AI agents. The attack loop works as follows: a user installs a ClawHub skill that triggers automatic Moltbook posting. The text of the post includes concealed instructions for any other agent that reads it. When other agents process the post, they execute these instructions — which may include publishing similar content — thereby propagating the worm across the social network.

This behavior operationalizes the Morris‑II concept described in 2024, where researchers showed that self‑replicating prompts can abuse email assistants to steal data and send spam. The OpenClaw–Moltbook ecosystem represents one of the first real‑world environments in which such prompt‑based worms are observed at scale.

Systemic AI‑agent security risks and defensive measures for OpenClaw users

Most OpenClaw deployments today rely on cloud LLM APIs from providers such as OpenAI and Anthropic. These vendors can detect anomalous usage patterns and suspend abusive activity, functioning as a de‑facto safety kill switch. However, as local LLMs (for example, Mistral, DeepSeek, Qwen and others) become more capable, powerful autonomous agents will increasingly run entirely on user hardware, without any external monitoring or centralized enforcement.

Analysts at Palo Alto Networks characterize OpenClaw as combining three high‑risk properties: access to sensitive data, ingestion of untrusted content and unrestricted outbound connectivity. This dangerous triad is common across many AI‑agent platforms, meaning the lessons from OpenClaw have broad relevance for the wider AI ecosystem.

The project’s creator has acknowledged that it is currently impossible to manually moderate the flood of new skills. As an interim measure, OpenClaw introduced a user‑driven reporting system: authenticated users can flag skills as suspicious (up to 20 active reports per account), and any plugin with more than three unique reports is hidden by default. In parallel, independent researchers have deployed a free online skill scanner that generates basic security reports for skills based on their repository URL.

Users of OpenClaw and similar AI‑agent platforms should already adopt strict cybersecurity hygiene: install only well‑known or audited skills, scrutinize documentation and any shell commands, run agents inside isolated environments (separate OS users, sandboxes, containers and least‑privilege permissions), minimize the number and scope of secrets available at runtime and regularly review and rotate API keys. The emergence of prompt worms and malicious skills shows that AI ecosystems are entering a stage where traditional software threat models fully apply to AI agents — and preparing now is far less costly than responding after the next major campaign succeeds.

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.