Trust Assessment
stp received a trust score of 65/100, placing it in the Caution category. This skill has some security considerations that users should review before deployment.
SkillShield's automated analysis identified 5 findings: 1 critical, 4 high, 0 medium, and 0 low severity. Key findings include Arbitrary Command Execution via Task Steps, Host LLM Prompt Injection via Natural Language Input, Logging of Sensitive Command Outputs.
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. The LLM Behavioral Safety layer scored lowest at 10/100, indicating areas for improvement.
Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Layer Breakdown
Behavioral Risk Signals
Security Findings5
| Severity | Finding | Layer | Location | |
|---|---|---|---|---|
| CRITICAL | Arbitrary Command Execution via Task Steps The skill's primary function is to execute steps defined in a task plan. These steps can originate from a user-provided Markdown file (File Mode) or be generated by an AI from natural language input (Natural Language Mode). The `SKILL.md` explicitly states that '每个步骤作为子任务执行' (each step is executed as a subtask) and '执行步骤时:正常用 `exec` tool 执行命令' (when executing steps: normally use the `exec` tool to execute commands). This design allows for arbitrary command execution on the host system if a malicious task plan is provided or generated by the AI. Implement a strict allowlist for commands and arguments that can be executed. Avoid direct execution of arbitrary user-provided or AI-generated commands. If arbitrary commands are necessary, execute them within a highly sandboxed and isolated environment (e.g., Docker container with minimal privileges, firejail, gVisor). Ensure user confirmation clearly displays the *exact* commands to be run, not just descriptions. | LLM | SKILL.md:198 | |
| HIGH | Host LLM Prompt Injection via Natural Language Input In '自然语言模式' (Natural Language Mode), the skill takes natural language input (`--nlp`) and uses an 'AI' to generate a Markdown task plan. The `SKILL.md` also indicates that user feedback ('输入修改意见 → 我会调整计划后重新展示' - Input modification suggestions → I will adjust the plan and display it again) can be used to modify the plan. This creates a direct prompt injection vector where a malicious user could craft input to manipulate the host LLM into generating a harmful task plan, which could then lead to command injection. Implement robust input sanitization and validation for natural language inputs before they are passed to the LLM. Use techniques like prompt templating, input filtering, and LLM guardrails to prevent malicious instructions from influencing plan generation. Clearly separate user input from system instructions in the LLM prompt. | LLM | SKILL.md:70 | |
| HIGH | Logging of Sensitive Command Outputs The `scripts/execute_task.py` script, specifically the `log_exec` and `log_step` functions, logs all executed commands and their outputs to `task_execution.log` within the task's workspace directory (`~/.openclaw/workspace/tasks/task-XXX/`). If a malicious command is executed (e.g., `cat /etc/passwd`, `env`), its output, potentially containing sensitive data, will be permanently recorded in a file accessible on the file system, facilitating data exfiltration. Implement strict filtering and redaction of sensitive information from command outputs before logging. Consider logging only command names and exit codes, or truncating outputs significantly. Ensure that logs are stored in a secure location with appropriate access controls. | LLM | scripts/execute_task.py:130 | |
| HIGH | Broad System Permissions for Task Execution The skill is designed to execute arbitrary commands on the host system using the `exec` tool, as stated in `SKILL.md`. This means the skill operates with the full permissions of the user account running the `python3` script. This broad access allows any successfully injected command to perform actions like file system modification, network requests, and process manipulation without restriction. Restrict the execution environment to the absolute minimum necessary permissions. Utilize sandboxing technologies (e.g., Docker, gVisor, firejail) to isolate the execution of task steps from the host system. Implement user impersonation or least-privilege principles if specific tasks require elevated permissions. | LLM | SKILL.md:198 | |
| HIGH | Potential for Credential Harvesting via Command Injection Given the skill's ability to execute arbitrary commands (Command Injection) and log their outputs (Data Exfiltration), it presents a high risk for credential harvesting. A malicious task plan could include commands designed to read sensitive files (e.g., `~/.aws/credentials`, `~/.ssh/id_rsa`, environment variables like `AWS_ACCESS_KEY_ID`), and the outputs of these commands would then be logged and potentially exfiltrated. Implement all remediations for Command Injection and Data Exfiltration. Additionally, ensure that the execution environment does not have access to sensitive credentials or configuration files unless absolutely necessary, and if so, only through secure, audited mechanisms. | LLM | scripts/execute_task.py:130 |
Scan History
Embed Code
[](https://skillshield.io/report/a281f82a82bd394c)
Powered by SkillShield