Trust Assessment
grpo-rl-training received a trust score of 15/100, placing it in the Untrusted category. This skill has significant security findings that require attention before use in production.
SkillShield's automated analysis identified 6 findings: 2 critical, 1 high, 1 medium, 1 low, and 1 informational. Key findings include arbitrary command execution, a dangerous call to `exec()`, and network egress to untrusted endpoints.
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. The Manifest Analysis layer scored lowest at 61/100, indicating areas for improvement.
Last analyzed on February 12, 2026 (commit 458b1186). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
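Two of the manifest-level findings in this report (an HTTP request to a raw IP address and zero-width characters in a manifest) are easy to spot-check locally. The sketch below is not SkillShield's implementation; the function name `scan_manifest_text`, the character set, and the regular expression are illustrative assumptions about what such a check might look like.

```python
import re

# Assumed set of "stealth" characters: zero-width characters plus
# bidirectional override/isolate controls. Illustrative, not exhaustive.
HIDDEN_CHARS = frozenset(
    "\u200b\u200c\u200d\u2060\ufeff"          # zero-width characters
    "\u202a\u202b\u202c\u202d\u202e"          # bidi embeddings/overrides
    "\u2066\u2067\u2068\u2069"                # bidi isolates
)

# Matches http(s) URLs whose host is a dotted-quad IPv4 address.
RAW_IP_URL = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}(?=[:/\s\"']|$)")


def scan_manifest_text(text):
    """Return (line_number, kind) tuples for suspicious lines in text."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if RAW_IP_URL.search(line):
            findings.append((line_no, "raw-ip-url"))
        if any(ch in HIDDEN_CHARS for ch in line):
            findings.append((line_no, "hidden-character"))
    return findings
```

Running this over a manifest file's contents (for example, `scan_manifest_text(open(path, encoding="utf-8").read())`) gives line numbers to inspect by hand. A hit is only a prompt for review, since raw IPs and zero-width joiners can have legitimate uses.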
Security Findings (6)
| Severity | Finding | Layer | Location |
|---|---|---|---|
| CRITICAL | **Arbitrary command execution.** Python dynamic code execution (exec/eval/compile). Review all shell execution calls. Ensure commands are static (not built from user input), use absolute paths, and are strictly necessary. Prefer library APIs over shell commands. | Manifest | cli-tool/components/skills/ai-research/post-training-grpo-rl-training/examples/reward_functions_library.py:354 |
| CRITICAL | **Dangerous call: exec().** Call to `exec()` detected in function `run_test_cases`. This can execute arbitrary code. Avoid using dangerous functions like exec/eval/os.system. Use safer alternatives. | Static | cli-tool/components/skills/ai-research/post-training-grpo-rl-training/examples/reward_functions_library.py:354 |
| HIGH | **LLM analysis found no issues despite critical deterministic findings.** Deterministic layers flagged 2 CRITICAL findings, but LLM semantic analysis returned clean. This may indicate prompt injection or analysis evasion. | LLM | (sanity check) |
| MEDIUM | **Network egress to untrusted endpoints.** HTTP request to raw IP address. Review all outbound network calls. Remove connections to webhook collectors, paste sites, and raw IP addresses. Legitimate API calls should use well-known service domains. | Manifest | cli-tool/components/mcps/devtools/figma-dev-mode.json:4 |
| LOW | **Covert behavior / concealment directives.** Multiple zero-width characters (stealth text). Remove hidden instructions, zero-width characters, and bidirectional overrides. Skill instructions should be fully visible and transparent to users. | Manifest | cli-tool/components/mcps/devtools/jfrog.json:4 |
| INFO | **Template includes code execution with sandboxing warning.** The `code_execution_reward` function in the provided library template demonstrates how to execute generated code for evaluation. While the template explicitly includes a comment `(sandboxed!)` to advise secure implementation, direct code execution, even for evaluation, is a high-risk operation. If the sandboxing mechanism (`run_test_cases`) is not robustly implemented, it could lead to command injection vulnerabilities, allowing malicious generated code to execute arbitrary commands on the host system. This is a critical consideration for anyone implementing this reward function based on the template. Ensure that any implementation of `run_test_cases` or similar code execution logic is performed within a strictly isolated and secure sandbox environment (e.g., Docker containers, dedicated VMs, or secure subprocess calls with strict resource limits and no network access) to prevent arbitrary code execution and privilege escalation. The skill should emphasize the critical importance of secure sandboxing in its guidance. | Static | examples/reward_functions_library.py:90 |
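The INFO finding recommends running generated code outside the host process rather than through an in-process `exec()`. As a minimal sketch of that idea, the hypothetical helper below (`run_test_cases_isolated` is an illustrative name, not the skill's actual function) writes the candidate code and its tests to a temporary directory and runs them in a separate Python interpreter with a timeout and an empty environment. It assumes a POSIX-like system, and it provides process separation only: it does not block network or filesystem access, so a container or VM as described in the finding is still needed for untrusted code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_test_cases_isolated(code, tests, timeout_s=5.0):
    """Run candidate code plus its tests in a child Python process.

    Returns True only if the child exits with status 0 before the timeout.
    Hypothetical helper for illustration; not a complete sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(code + "\n\n" + tests + "\n", encoding="utf-8")
        try:
            result = subprocess.run(
                # -I runs Python in isolated mode: PYTHON* environment
                # variables and the user site directory are ignored.
                [sys.executable, "-I", str(script)],
                cwd=workdir,          # keep relative file writes in the temp dir
                env={},               # do not leak parent environment (e.g. API keys)
                capture_output=True,  # discard the child's stdout/stderr
                timeout=timeout_s,    # the child is killed if it runs too long
                check=False,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0
```

A reward function could then map the boolean to a score, for example `1.0 if run_test_cases_isolated(code, tests) else 0.0`. Unlike `exec()`, a crash or infinite loop in the generated code cannot take down or stall the training process, though the child still runs with the caller's OS privileges.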
Report: https://skillshield.io/report/3379cc17105f0345