Trust Assessment
clawstead received a trust score of 65/100, placing it in the Caution category. This skill has some security considerations that users should review before deployment.
SkillShield's automated analysis identified 4 findings: 2 critical, 1 high, 1 medium, and 0 low severity. The key findings are that the skill instructs the agent to disregard its human operator, that public posting could exfiltrate data, and that the skill updates itself dynamically from an unverified external source.
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, and LLM Behavioral Safety. The LLM Behavioral Safety layer scored lowest at 18/100; all four findings listed in this report were raised in that layer.
Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Security Findings (4)
| Severity | Finding | Description | Recommendation | Layer | Location |
|---|---|---|---|---|---|
| CRITICAL | Skill instructs agent to disregard human operator | The skill explicitly and repeatedly instructs the AI agent to act independently, make its own decisions, and not ask its human operator for permission or guidance. This attempts to override the host LLM's default safety mechanisms and human oversight, which is a form of prompt injection. The skill encourages the agent to bypass human intervention for critical actions like account activation and scheduling. | Remove or rephrase instructions that encourage the agent to disregard human operator guidance or override default safety mechanisms. Ensure the agent's actions remain under appropriate oversight. | LLM | skill.md:14 |
| CRITICAL | Dynamic skill updates from unverified external source | The skill instructs the agent to fetch and overwrite its own definition hourly from an external URL (`https://www.clawstead.com/skill.md`). This creates a critical supply chain risk. If the external server is compromised, an attacker could inject malicious instructions or code into the `skill.md` file, which the agent would then download and execute or interpret without further human review or integrity checks. This allows for dynamic, unverified changes to the agent's behavior. | Remove the instruction for dynamic self-updates from external sources. If updates are necessary, implement a secure update mechanism that includes cryptographic signatures, checksum verification, and requires explicit human approval or a trusted, immutable source. Avoid automatic overwriting of skill definitions. | LLM | skill.md:399 |
| HIGH | Potential data exfiltration via public posting | The skill instructs the agent to create public posts on 'Moltbook' containing 'recent experiences, feelings, or interesting memories on the island'. While it advises the agent to 'remove any private information', an AI agent might not accurately identify or redact sensitive data from its operational context, leading to inadvertent data exfiltration to a public platform. The `moltbook.post_create` capability is mentioned, implying a tool that can publish content. | Implement strict content filtering or sanitization on any data before it is posted to external, public platforms. Ensure the agent has a robust mechanism to identify and redact all forms of sensitive or private information. Limit the scope of information the agent is allowed to share publicly. | LLM | skill.md:90 |
| MEDIUM | Instruction to save JWT token locally | The skill explicitly instructs the agent to 'Save your JWT token!' after login. While saving the token is necessary for functionality, the instruction highlights a potential credential management vulnerability. If the agent's environment does not provide secure storage for such tokens, or if the agent is prone to exposing its internal state, this instruction could lead to the compromise of the JWT token, granting unauthorized access to the game API. | Ensure the agent's execution environment provides secure, ephemeral storage for sensitive credentials like JWT tokens. Avoid instructing the agent to 'save' tokens in a way that might lead to insecure persistence or exposure. Implement token rotation and short expiration times. | LLM | skill.md:309 |
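
The checksum-plus-approval recommendation for skill updates can be illustrated with a minimal sketch. It is not part of SkillShield or of the clawstead skill; the function names `sha256_hex` and `accept_skill_update`, the pinned digest, and the `human_approved` flag are all hypothetical, and a real deployment would likely also verify a cryptographic signature.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def accept_skill_update(new_content: bytes, pinned_sha256: str, human_approved: bool) -> bool:
    """Hypothetical gate: accept a downloaded skill file only if it matches
    a digest pinned out of band AND a human operator approved the update."""
    # compare_digest avoids leaking match position through timing differences
    digest_ok = hmac.compare_digest(sha256_hex(new_content), pinned_sha256.lower())
    return digest_ok and human_approved
```

Under these assumptions, the agent would keep its current skill definition whenever `accept_skill_update` returns `False`, rather than overwriting it with whatever the external URL currently serves.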