Security Audit

evaluate-presets

github.com/openclaw/skills

AI Skill

Commit 13146e6a3d46

TRUSTED

Scanned 2 months ago

Critical

Immediate action required

High

Priority fixes suggested

Medium

Best practices review

Low

Acknowledged / Tracked

Trust Assessment

evaluate-presets received a trust score of 80/100, placing it in the Mostly Trusted category. This skill has passed most security checks with only minor considerations noted.

SkillShield's automated analysis identified 2 findings: 0 critical, 1 high, 1 medium, and 0 low severity. The high-severity finding is "Potential Command Injection via Unsanitized User Input to Bash Tool"; the medium-severity finding is "Skill Requires Broad Shell Execution and Code Modification Permissions".

The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. All layers scored 70 or above, reflecting consistent security practices.

Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.

Layer Breakdown

Manifest Analysis

100%

Static Code Analysis

100%

Dependency Graph

100%

LLM Behavioral Safety

78%

Behavioral Risk Signals

Filesystem Write

1 finding

Shell Execution

2 findings

Dynamic Code

2 findings

Excessive Permissions

1 finding

Security Findings (2)

Severity: HIGH
Finding: Potential Command Injection via Unsanitized User Input to Bash Tool
Layer: LLM
Location: SKILL.md:48

The skill instructs the agent to use a 'Bash tool' to execute shell scripts (`evaluate-preset.sh`, `evaluate-all-presets.sh`) with user-provided arguments (e.g., preset name, backend). The skill's example invocation pattern shows these arguments directly concatenated into the command string. If the underlying shell scripts do not properly sanitize or quote these user-provided arguments, a malicious user could inject arbitrary shell commands. For instance, providing `my_preset; rm -rf /` as a preset name could lead to the execution of `rm -rf /` on the host system. The skill itself does not provide any sanitization mechanisms for these arguments before they are passed to the shell.

Ensure that all user-provided arguments passed to shell commands are properly sanitized and quoted within the `evaluate-preset.sh` and `evaluate-all-presets.sh` scripts. For example, use `printf %q` in bash to safely quote arguments, or ensure the scripts explicitly validate and restrict input to expected values. The skill description should also advise users to only provide trusted input.

Severity: MEDIUM
Finding: Skill Requires Broad Shell Execution and Code Modification Permissions
Layer: LLM
Location: SKILL.md:44

The skill explicitly instructs the agent to use a 'Bash tool' for arbitrary shell command execution. It also directs the agent to use `/code-task-generator` for creating code tasks and `/code-assist` for implementing them. This combination grants the agent broad permissions to execute commands, read/write files, and modify the codebase. While these permissions are necessary for the skill's intended function (evaluating and fixing code), they represent a significant security surface. A compromised agent or malicious input could leverage these broad permissions for unauthorized actions, including data manipulation or system compromise.

Implement strict access controls and sandboxing for the agent's execution environment to limit its capabilities to only what is absolutely necessary. Ensure that the agent operates with the principle of least privilege. Regularly audit the agent's actions and logs. Educate users on the risks associated with granting broad permissions to AI agents and the importance of reviewing agent-generated code changes.
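
For the command-injection finding, the "validate and restrict input to expected values" recommendation could look roughly like the sketch below. It is not taken from the audited scripts: the function names `is_safe_name` and `run_evaluation`, the allowed character set, and the backend value `local` are all assumptions made for illustration. The idea is to reject any preset or backend name containing characters outside a small allowlist, and to pass accepted values only as quoted, separate arguments.

```shell
# Hypothetical validation helpers for a script such as evaluate-preset.sh.
# The allowed character set is an assumption, not the audited skill's rule.

# Succeed only if $1 is non-empty, does not start with '-' or '.',
# and contains nothing but letters, digits, '_', '.', and '-'.
is_safe_name() {
  case "$1" in
    '' | -* | .*) return 1 ;;          # empty, option-like, or dot-leading
    *[!A-Za-z0-9_.-]*) return 1 ;;     # any character outside the allowlist
  esac
  return 0
}

# $1 = preset name, $2 = backend. Validate both before using them.
run_evaluation() {
  if ! is_safe_name "$1"; then
    echo "rejected preset name" >&2
    return 1
  fi
  if ! is_safe_name "$2"; then
    echo "rejected backend name" >&2
    return 1
  fi
  # A real script would start the evaluation here, always expanding the
  # values as quoted arguments ("$1", "$2") and never through eval.
  printf 'preset=%s backend=%s\n' "$1" "$2"
}
```

With these definitions, `run_evaluation my_preset local` prints the accepted values, while `run_evaluation 'my_preset; rm -rf /' local` returns a non-zero status before anything is executed. An allowlist is used here instead of escaping because it fails closed: input that was not anticipated is rejected, not reinterpreted.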

Embed Code

[![SkillShield](https://skillshield.io/api/v1/badge/a44aa4e8a494ce3d.svg)](https://skillshield.io/report/a44aa4e8a494ce3d)