Trust Assessment
task-orchestrator received a trust score of 65/100, placing it in the Caution category. This skill has some security considerations that users should review before deployment.
SkillShield's automated analysis identified 4 findings: 1 critical, 2 high, 1 medium, and 0 low severity. Key findings include Dynamic content used for autonomous agent prompts, Autonomous agent susceptible to command injection via prompt manipulation, Risk of data exfiltration and credential harvesting through manipulated autonomous agent.
The analysis covered 4 layers: manifest_analysis, llm_behavioral_safety, static_code_analysis, dependency_graph. The llm_behavioral_safety layer scored lowest at 33/100, indicating areas for improvement.
Last analyzed on February 11, 2026 (commit 0676c56a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Layer Breakdown
Behavioral Risk Signals
Security Findings4
| Severity | Finding | Layer | Location | |
|---|---|---|---|---|
| CRITICAL | Dynamic content used for autonomous agent prompts The skill describes constructing prompts for the `codex` agent using dynamic and potentially untrusted content. Examples include using issue descriptions (`'Fix issue #N: DESCRIPTION...'`) and error logs (`'Previous attempt failed with: $(cat error.log | tail -20)...'`) directly within the `codex` prompt. The `codex` agent is invoked with `--yolo` (autonomous mode) and further instructed to 'fix issues yourself' in the cron job prompt. This creates a critical prompt injection vulnerability where malicious content originating from issue trackers, commit messages, or error outputs could manipulate the `codex` agent to execute arbitrary commands, exfiltrate data, or perform other unauthorized actions. Implement strict sanitization and validation for all dynamic content used to construct prompts for autonomous agents. Avoid directly injecting raw external data (like error logs or issue descriptions) into agent prompts. If external data must be included, ensure it is properly escaped or passed as context rather than direct prompt text. Limit the autonomy (`--yolo`) of agents when processing potentially untrusted input, and consider a human-in-the-loop for critical actions. | Unknown | SKILL.md:89 | |
| HIGH | Autonomous agent susceptible to command injection via prompt manipulation The `codex` agent, when successfully prompt-injected (as described in SS-LLM-001), is intended to execute shell commands such as 'Run tests, commit with good message, push to origin'. A malicious prompt could instruct `codex` to execute arbitrary shell commands on the host system where it is running, potentially leading to full system compromise. While the 'Lessons Learned' section notes `codex` sandbox limitations, the overall design implies `codex` (or the orchestrator) will eventually execute commands with network access (e.g., `git push`, `gh pr create`). In addition to prompt sanitization, ensure the `codex` agent operates within a highly restricted execution environment (e.g., a container with minimal permissions, strict network egress rules, and a read-only filesystem for sensitive areas). Implement allow-listing for commands `codex` is permitted to execute. Monitor `codex`'s actions for anomalous command execution. | Unknown | SKILL.md:89 | |
| HIGH | Risk of data exfiltration and credential harvesting through manipulated autonomous agent If the `codex` agent is successfully prompt-injected, it could be instructed to read sensitive files (e.g., configuration files, environment variables, source code outside its worktree) and exfiltrate them. The skill explicitly mentions `git push` and `gh pr create` as actions `codex` performs or triggers. A malicious prompt could instruct `codex` to include sensitive data in commit messages, PR bodies, or even push it to a controlled remote repository, leading to data leakage or credential harvesting (e.g., if API keys or tokens are accessible to the agent). Implement strict sandboxing for the `codex` agent, limiting its filesystem access to only necessary work directories and preventing access to sensitive configuration or credential files. Restrict network egress to only trusted endpoints. Review and approve all `git push` and `gh pr create` operations, especially their content, before execution. Ensure credentials used for `git push` and `gh` are short-lived and scoped to minimal necessary permissions. | Unknown | SKILL.md:89 | |
| MEDIUM | Shell commands use unsanitized placeholders, risking command injection Several example shell commands use placeholders like `OWNER/REPO`, `issue-N`, `task-tN`, and `TITLE`. If an agent implementing this skill were to substitute these placeholders with untrusted input (e.g., from a malicious issue title or repository name), it could lead to command injection. For example, `git clone https://github.com/$(malicious_command)/repo.git` or `git worktree add -b fix/issue-N; malicious_command "$WORKDIR/task-tN"`. While the skill is a rubric, it demonstrates patterns that require careful sanitization in implementation. Any dynamic input used in shell commands must be strictly sanitized and validated. Use parameterized commands or shell escaping functions (e.g., `shlex.quote` in Python) to prevent injection. Avoid direct concatenation of untrusted strings into shell commands. | Unknown | SKILL.md:60 |
Scan History
Embed Code
[](https://skillshield.io/report/3eba57a7cacccd4b)
Powered by SkillShield