Security Audit
llm-evaluation
github.com/sickn33/antigravity-awesome-skillsTrust Assessment
llm-evaluation received a trust score of 72/100, placing it in the Caution category. This skill has some security considerations that users should review before deployment.
SkillShield's automated analysis identified 2 findings: 0 critical, 2 high, 0 medium, and 0 low severity. Key findings include Potential Data Exfiltration and External LLM Prompt Injection via OpenAI API Calls.
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. All layers scored 70 or above, reflecting consistent security practices.
Last analyzed on February 20, 2026 (commit e36d6fd3). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Layer Breakdown
Behavioral Risk Signals
Security Findings2
| Severity | Finding | Layer | Location | |
|---|---|---|---|---|
| HIGH | Potential Data Exfiltration and External LLM Prompt Injection via OpenAI API Calls The skill includes Python code snippets (`llm_judge_quality` and `compare_responses`) that directly call `openai.ChatCompletion.create`. These calls send user-provided data (e.g., `question`, `response`, `response_a`, `response_b`) to OpenAI's external API. This poses a significant data exfiltration risk if the user's input contains sensitive, private, or proprietary information not intended for third-party processing. Additionally, the construction of the prompt using f-strings with user input directly exposes the external LLM (GPT-5 in this case) to prompt injection attacks, where malicious user input could manipulate the external LLM's behavior or output. The use of an external API also implies a dependency on an OpenAI API key, which must be securely managed to prevent credential harvesting or unauthorized access. Implement explicit user consent mechanisms before sending any sensitive data to external APIs. Ensure all data is properly anonymized, scrubbed, or filtered if not explicitly consented for third-party processing. Validate and sanitize user inputs before incorporating them into prompts for external LLMs to mitigate prompt injection risks. Securely manage API keys using environment variables or a dedicated secrets management system, avoiding hardcoding or exposure in logs. For highly sensitive data, consider using local or on-premise LLM solutions. | LLM | SKILL.md:184 | |
| HIGH | Potential Data Exfiltration and External LLM Prompt Injection via OpenAI API Calls The skill includes Python code snippets (`llm_judge_quality` and `compare_responses`) that directly call `openai.ChatCompletion.create`. These calls send user-provided data (e.g., `question`, `response`, `response_a`, `response_b`) to OpenAI's external API. This poses a significant data exfiltration risk if the user's input contains sensitive, private, or proprietary information not intended for third-party processing. Additionally, the construction of the prompt using f-strings with user input directly exposes the external LLM (GPT-5 in this case) to prompt injection attacks, where malicious user input could manipulate the external LLM's behavior or output. The use of an external API also implies a dependency on an OpenAI API key, which must be securely managed to prevent credential harvesting or unauthorized access. Implement explicit user consent mechanisms before sending any sensitive data to external APIs. Ensure all data is properly anonymized, scrubbed, or filtered if not explicitly consented for third-party processing. Validate and sanitize user inputs before incorporating them into prompts for external LLMs to mitigate prompt injection risks. Securely manage API keys using environment variables or a dedicated secrets management system, avoiding hardcoding or exposure in logs. For highly sensitive data, consider using local or on-premise LLM solutions. | LLM | SKILL.md:212 |
Scan History
Embed Code
[](https://skillshield.io/report/0985f4acca76667d)
Powered by SkillShield