Trust Assessment
azure-ai-evaluation-py received a trust score of 65/100, placing it in the Caution category. Users should review its security findings before deployment.
SkillShield's automated analysis identified 2 findings: 1 critical, 1 high, 0 medium, and 0 low severity. Key findings: Prompt Injection via Untrusted Data File (critical) and Arbitrary File Read via User-Controlled Data Path (high).
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, and LLM Behavioral Safety. The LLM Behavioral Safety layer scored lowest at 55/100, indicating areas for improvement.
Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Security Findings (2)
| Severity | Finding | Layer | Location |
|---|---|---|---|
| CRITICAL | **Prompt Injection via Untrusted Data File.** `scripts/run_batch_evaluation.py` accepts a user-controlled JSONL file path via the `--data` command-line argument. The file's contents (the `query`, `context`, and `response` columns defined in `column_mapping`) are fed directly into AI-assisted evaluators (e.g. `GroundednessEvaluator`, `RelevanceEvaluator`), which are initialized with a `model_config` containing Azure OpenAI API keys. An attacker supplying a malicious JSONL file can inject instructions into the prompts sent to the underlying Azure OpenAI model, enabling LLM manipulation, unintended actions, data exfiltration through the LLM, or generation of harmful content. Remediation: sanitize and validate all data fields (`query`, `context`, `response`) fed into LLM prompts; consider a separate, less-privileged LLM for initial content moderation; apply strict output filtering to LLM responses; for critical applications, consider a two-LLM architecture in which one LLM sanitizes inputs for the other. | LLM | `scripts/run_batch_evaluation.py:200` |
| HIGH | **Arbitrary File Read via User-Controlled Data Path.** `scripts/run_batch_evaluation.py` passes the user-controlled `--data` path directly to `azure.ai.evaluation.evaluate`, which reads the contents of the specified file. An attacker could supply a path to a sensitive file (e.g. `/etc/passwd`, `/app/secrets/api_key.json`); if the evaluation results are subsequently logged or written out (e.g. via the `--output` argument), that data could be exfiltrated. Remediation: strictly validate and sanitize the `data` path, restrict reads to a designated, sandboxed input directory, and ensure `evaluate` (or its wrapper) has safeguards against reading files outside the allowed scope. | LLM | `scripts/run_batch_evaluation.py:190` |
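The prompt-injection remediation above can be sketched as a pre-screening step. The helper below is a hypothetical illustration (not part of azure-ai-evaluation or the skill's scripts): it flags rows whose `query`, `context`, or `response` fields contain common injection markers before the row reaches an AI-assisted evaluator. Pattern matching alone is not a complete defense against prompt injection; treat it as one layer alongside output filtering.

```python
import re

# Hypothetical pre-screening helper; the pattern list is illustrative,
# not exhaustive. Real deployments should combine this with output
# filtering and, ideally, a dedicated moderation model.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"system prompt", re.I),
    re.compile(r"</?(system|assistant|tool)>", re.I),
]

def screen_row(row: dict, fields=("query", "context", "response")) -> list:
    """Return the names of fields that look like injection attempts."""
    flagged = []
    for field in fields:
        text = str(row.get(field, ""))
        if any(p.search(text) for p in INJECTION_PATTERNS):
            flagged.append(field)
    return flagged
```

A batch runner could call `screen_row` on each JSONL record and quarantine (or log) flagged rows instead of passing them to the evaluators.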
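The arbitrary-file-read remediation can be sketched the same way. `resolve_data_path` below is a hypothetical wrapper (assuming Python 3.9+ for `Path.is_relative_to`) that confines the `--data` argument to a designated input directory and rejects paths that escape it, before the path is ever handed to `evaluate`.

```python
from pathlib import Path

def resolve_data_path(user_path: str, allowed_dir: str = "data") -> Path:
    """Resolve a user-supplied --data path against an allowed directory.

    Rejects absolute paths and traversal sequences (e.g. ../../etc/passwd)
    that would escape the sandbox, and accepts only .jsonl files.
    Requires Python 3.9+ for Path.is_relative_to.
    """
    base = Path(allowed_dir).resolve()
    # Joining an absolute user_path replaces `base` entirely, so the
    # is_relative_to check below catches that case too.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"data path escapes {base}: {user_path}")
    if candidate.suffix != ".jsonl":
        raise ValueError("only .jsonl data files are accepted")
    return candidate
```

The validated path, not the raw argument, would then be passed on to `azure.ai.evaluation.evaluate`.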
Last analyzed on February 13, 2026 (commit 13146e6a).