Security Audit
Jamkris/everything-gemini-code:skills/eval-harness
github.com/Jamkris/everything-gemini-code
Trust Assessment
Jamkris/everything-gemini-code:skills/eval-harness received a trust score of 43/100, placing it in the Untrusted category. This skill has significant security findings that require attention before use in production.
SkillShield's automated analysis identified 2 findings: 0 critical, 2 high, 0 medium, and 0 low severity. The two findings are "Potential Command Injection via Bash tool" and "Excessive Permissions: Broad tool access (Bash, Read, Write, Edit)".
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. All layers scored 70 or above, reflecting consistent security practices.
Last analyzed on March 30, 2026 (commit 6c6f43aa). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Security Findings (2)
| Severity | Finding | Layer | Location |
|---|---|---|---|
| HIGH | **Potential Command Injection via Bash tool.** The skill's manifest explicitly requests the `Bash` tool, and the `SKILL.md` provides examples of `bash` commands (`grep`, `npm test`, `npm run build`) that are likely to be executed by the agent. If user-controlled input (e.g., `feature-name` in `/eval define feature-name` or parts of the 'Code-Based Grader' commands) is directly interpolated into these `bash` commands without proper sanitization, it could lead to arbitrary command execution on the host system. **Recommendation:** Implement strict input validation and sanitization for any user-provided strings that are used in `bash` commands. Prefer using safer, parameterized execution methods or specific tool functions over direct shell interpolation where possible. Ensure that any arguments passed to `npm` commands are also properly validated. | LLM | SKILL.md:60 |
| HIGH | **Excessive Permissions: Broad tool access (Bash, Read, Write, Edit).** The skill's manifest requests `Bash`, `Read`, `Write`, and `Edit` tools. While the skill's purpose (evaluating code, running tests) may require some filesystem interaction, the combination of arbitrary command execution (`Bash`) and broad filesystem access (`Read`, `Write`, `Edit`) grants the agent extensive control over the host environment. This significantly increases the attack surface for data exfiltration, command injection, and unauthorized system modifications if the agent is compromised or misused. **Recommendation:** Review and minimize the requested tools to the absolute minimum necessary for the skill's core functionality. Consider using more granular or sandboxed tools instead of broad `Bash` access. Implement strict access controls and monitoring for actions performed by the agent using these powerful tools. | LLM | Manifest |
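
The remediation advice in the findings above (validate user-provided strings, prefer parameterized execution over shell interpolation, and restrict what can be run) could look roughly like the following Python sketch. It is not taken from the audited skill: the names `ALLOWED_COMMANDS`, `FEATURE_NAME_RE`, `validate_feature_name`, `build_argv`, and `run_check` are hypothetical, the naming rule for feature names is an assumption, and passing the name to the npm script after `--` is only one illustrative way a validated value might be used.

```python
import re
import subprocess

# Hypothetical allowlist: the only commands this sketch is willing to run.
# The entries mirror the commands the report says appear in SKILL.md.
ALLOWED_COMMANDS = {
    "test": ["npm", "test"],
    "build": ["npm", "run", "build"],
}

# Assumed naming rule: starts with a letter or digit, then letters,
# digits, '-' or '_', at most 64 characters in total.
FEATURE_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")


def validate_feature_name(name: str) -> str:
    """Return the name unchanged if it matches the rule, else raise ValueError."""
    if not FEATURE_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid feature name: {name!r}")
    return name


def build_argv(action: str, feature_name: str) -> list[str]:
    """Build an argument vector from an allowlisted action and a validated name."""
    if action not in ALLOWED_COMMANDS:
        raise ValueError(f"action not allowed: {action!r}")
    name = validate_feature_name(feature_name)
    # The name is a separate argv element, so it is never parsed by a shell.
    return [*ALLOWED_COMMANDS[action], "--", name]


def run_check(action: str, feature_name: str) -> subprocess.CompletedProcess:
    """Run the allowlisted command without a shell and capture its output."""
    argv = build_argv(action, feature_name)
    return subprocess.run(
        argv, shell=False, capture_output=True, text=True, timeout=600
    )
```

With this shape, `build_argv("test", "login-flow")` yields a plain argument list, while an input such as `"x; rm -rf /"` is rejected before any process is started. Input validation of this kind addresses only the injection finding; it does not reduce the breadth of the `Bash`, `Read`, `Write`, and `Edit` permissions themselves.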