Security Audit

deep-scraper

github.com/openclaw/skills

AI SkillCommit 13146e6a3d46

CRITICAL

Scanned 5 months ago

Critical

Immediate action required

High

Priority fixes suggested

Medium

Best practices review

Low

Acknowledged / Tracked

Trust Assessment

deep-scraper received a trust score of 48/100, placing it in the Untrusted category. This skill has significant security findings that require attention before use in production.

SkillShield's automated analysis identified 7 findings: 2 critical, 0 high, 3 medium, and 2 low severity. Key findings include Missing required field: name, Unpinned npm dependency version, Node lockfile missing.

The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. The LLM Behavioral Safety layer scored lowest at 31/100, indicating areas for improvement.

Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.

Layer Breakdown

Manifest Analysis

100%

Static Code Analysis

93%

Dependency Graph

91%

LLM Behavioral Safety

31%

Behavioral Risk Signals

Network Access

3 findings

Filesystem Write

1 finding

Shell Execution

3 findings

Dynamic Code

2 findings

Security Findings7

Severity	Finding	Layer	Location
CRITICAL	Untrusted Web Content as LLM Input (Prompt Injection) The skill's primary function is to scrape arbitrary web content (from `TARGET_URL`) and output it as JSON, explicitly stating it's 'optimized for LLM processing'. The `data` field (or `transcript`/`description` in `youtube_handler.js`) contains raw or lightly cleaned text from external websites. This untrusted content can contain malicious instructions or data designed to manipulate the host LLM that consumes the skill's output, leading to prompt injection attacks. The current sanitization (stripping HTML tags and whitespace) is insufficient to prevent LLM-specific prompt injection. Implement robust LLM-specific sanitization, filtering, or content moderation on the scraped `data` before outputting it. This may involve using a separate LLM to detect and neutralize malicious instructions, or employing advanced content filtering techniques to remove any text that could be interpreted as a command by the host LLM. Consider providing a 'safe' mode that only extracts specific, structured data rather than raw text.	LLM	assets/main_handler.js:80
CRITICAL	Untrusted Web Content as LLM Input (Prompt Injection) The skill's primary function is to scrape arbitrary web content (from `TARGET_URL`) and output it as JSON, explicitly stating it's 'optimized for LLM processing'. The `data` field (or `transcript`/`description` in `youtube_handler.js`) contains raw or lightly cleaned text from external websites. This untrusted content can contain malicious instructions or data designed to manipulate the host LLM that consumes the skill's output, leading to prompt injection attacks. The current sanitization (stripping HTML tags and whitespace) is insufficient to prevent LLM-specific prompt injection. Implement robust LLM-specific sanitization, filtering, or content moderation on the scraped `data` before outputting it. This may involve using a separate LLM to detect and neutralize malicious instructions, or employing advanced content filtering techniques to remove any text that could be interpreted as a command by the host LLM. Consider providing a 'safe' mode that only extracts specific, structured data rather than raw text.	LLM	assets/main_handler.js:95
MEDIUM	Missing required field: name The 'name' field is required for claude_code skills but is missing from frontmatter. Add a 'name' field to the SKILL.md frontmatter.	Static	skills/opsun/deep-scraper/SKILL.md:1
MEDIUM	Unpinned npm dependency version Dependency 'crawlee' is not pinned to an exact version ('^3.0.0'). Pin dependencies to exact versions to reduce drift and supply-chain risk.	Dependencies	skills/opsun/deep-scraper/package.json
MEDIUM	Reduced Browser Sandbox Security in Docker The Playwright crawler is configured to run with `--no-sandbox` and `--disable-setuid-sandbox` arguments. While often necessary for Playwright in Docker environments, these options disable critical security sandboxing mechanisms. Since the skill navigates to arbitrary external URLs, a successful browser exploit (e.g., through a malicious website) could more easily escape the browser process and potentially compromise the Docker container or, in severe cases, the host system. If possible, explore alternative Playwright configurations or Docker hardening techniques to enable sandboxing. If `--no-sandbox` is strictly required, ensure the Docker container runs with minimal privileges (e.g., a non-root user), strict resource limits, and is isolated from sensitive host resources. Regularly update Playwright and Chromium to patch known vulnerabilities.	LLM	assets/main_handler.js:19
LOW	Node lockfile missing package.json is present but no lockfile was found (package-lock.json, pnpm-lock.yaml, or yarn.lock). Commit a lockfile for deterministic dependency resolution.	Dependencies	skills/opsun/deep-scraper/package.json
LOW	Unpinned Dependencies The `package.json` uses caret (`^`) ranges for `crawlee` and `playwright` dependencies. This allows automatic updates to new minor or patch versions. While convenient, it introduces a slight supply chain risk as a new version could inadvertently introduce a vulnerability or breaking change without explicit review, potentially impacting the skill's security or stability. Pin dependencies to exact versions (e.g., `"crawlee": "3.x.x"`, `"playwright": "1.x.x"`) to ensure deterministic builds and prevent unexpected updates. Use a dependency vulnerability scanner (e.g., `npm audit`) regularly to identify and address known vulnerabilities in pinned versions.	LLM	package.json:19

Scan History

Embed Code

[![SkillShield](https://skillshield.io/api/v1/badge/3f9af0358265a575.svg)](https://skillshield.io/report/3f9af0358265a575)