Trust Assessment
pdf-to-structured received a trust score of 62/100, placing it in the Caution category. This skill has some security considerations that users should review before deployment.
SkillShield's automated analysis identified 4 findings: 0 critical, 2 high, 2 medium, and 0 low severity. Key findings include "Missing required field: name", "Unpinned Python Dependencies", and "Potential Command Injection via Tesseract Language Parameter".
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, and LLM Behavioral Safety. The LLM Behavioral Safety layer scored lowest at 63/100, indicating areas for improvement.
Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Security Findings (4)
| Severity | Finding | Layer | Location |
|---|---|---|---|
| HIGH | Unpinned Python Dependencies. The skill's installation instructions specify Python packages without pinned versions (e.g., `pip install pdfplumber pandas openpyxl`), which introduces a supply chain risk. An attacker could publish a malicious version of one of these packages, or compromise a legitimate package, leading to arbitrary code execution when the skill is installed or updated. Without pinned versions, the skill's behavior can also change unexpectedly, and vulnerabilities in newer dependency versions could be introduced automatically. Remediation: Pin all Python dependencies to specific versions (e.g., `pdfplumber==0.12.0`). Use a `requirements.txt` file with exact versions or a dependency management tool like Poetry or PDM. | LLM | SKILL.md:40 |
| HIGH | Potential Command Injection via Tesseract Language Parameter. The `ocr_scanned_pdf` function passes the `language` argument directly to `pytesseract.image_to_string(image, lang=language)`. `pytesseract` is a wrapper around the external Tesseract OCR executable. If the `language` parameter is derived from untrusted user input, an attacker could inject shell commands (e.g., `lang='eng; rm -rf /'`) if `pytesseract` or the underlying `tesseract` command-line parsing does not properly escape or sanitize the input. This risk is amplified by the unpinned `pytesseract` dependency, as older versions may be more vulnerable. Remediation: Validate the `language` parameter so that it contains only valid language codes, ideally against a whitelist of allowed codes. Ensure `pytesseract` is updated to a version that mitigates known command injection vulnerabilities (e.g., >=0.3.10) and pin its version. | LLM | SKILL.md:134 |
| MEDIUM | Missing required field: name. The 'name' field is required for claude_code skills but is missing from the frontmatter. Remediation: Add a 'name' field to the SKILL.md frontmatter. | Static | skills/datadrivenconstruction/pdf-to-structured/SKILL.md:1 |
| MEDIUM | Excessive File System Access with Untrusted Paths. Several functions, such as `extract_tables_from_pdf`, `ocr_scanned_pdf`, and `batch_extract_tables`, accept file paths (`pdf_path`, `folder_path`, `output_folder`) as arguments and perform read/write operations based on these paths. If these arguments come from untrusted user input without proper validation (e.g., path traversal checks), an attacker could specify arbitrary file system locations. This could lead to data exfiltration (reading sensitive files like `/etc/passwd`), data corruption (overwriting critical system files), or denial of service (filling up disk space in arbitrary locations). The `batch_extract_tables` function is particularly concerning because it iterates over an entire `folder_path` and writes to an `output_folder`. Remediation: Implement strict input validation and sanitization for all file path arguments. Restrict file operations to a designated, sandboxed directory. Use path normalization and canonicalization to prevent path traversal attacks (e.g., disallow `../`). If possible, pass file content directly instead of paths. | LLM | SKILL.md:199 |