Trust Assessment
crawl4ai received a trust score of 65/100, placing it in the Caution category. This skill has some security considerations that users should review before deployment.
SkillShield's automated analysis identified 3 findings: 1 critical, 1 high, 0 medium, and 1 low severity. Key findings: arbitrary JavaScript execution via the `js_code` parameter, arbitrary local file read and write via script arguments, and an unpinned `beautifulsoup4` dependency.
The analysis covered 4 layers: Manifest Analysis, Static Code Analysis, Dependency Graph, LLM Behavioral Safety. The LLM Behavioral Safety layer scored lowest at 53/100, indicating areas for improvement.
Last analyzed on February 13, 2026 (commit 13146e6a). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.
Security Findings (3)
| Severity | Finding | Layer | Location |
|---|---|---|---|
| CRITICAL | Arbitrary JavaScript execution via `js_code` parameter | LLM | SKILL.md:120 |
| HIGH | Arbitrary local file read and write via script arguments | LLM | scripts/extract_from_html.py:18 |
| LOW | Unpinned dependency `beautifulsoup4` | LLM | scripts/extract_from_html.py:40 |

CRITICAL: Arbitrary JavaScript execution via `js_code` parameter

The `crawl4ai.AsyncWebCrawler.arun` method accepts a `js_code` parameter that executes arbitrary JavaScript in the context of the scraped web page in a headless browser. If an attacker can control this value (e.g., through prompt injection against the LLM that calls the skill), they can run malicious JavaScript, potentially leading to data exfiltration from the browser context, session hijacking, or other client-side attacks. The `SKILL.md` explicitly demonstrates this capability in the "Structured Data Extraction" and "Custom JavaScript Injection" sections.

Remediation:
1. **Strict input validation**: Ensure any `js_code` passed to the skill is hardcoded, comes from a trusted source, or is rigorously validated and sanitized to prevent arbitrary code execution.
2. **Principle of least privilege**: Re-evaluate whether the `js_code` injection capability is strictly necessary for the skill's intended use cases when exposed to user input.
3. **Sandboxing**: If `js_code` must be user-controlled, ensure the headless browser environment is strictly sandboxed and isolated from the host system and other sensitive resources.

HIGH: Arbitrary local file read and write via script arguments

The Python utility scripts (`extract_from_html.py`, `scrape_multiple_pages.py`, `scrape_single_page.py`) accept file paths (`input_file`, `output_file`, `output_dir`) as command-line arguments. If an attacker can control these arguments (e.g., through prompt injection against the LLM that invokes the scripts), they can:

1. **Read arbitrary local files**: `extract_from_html.py` reads `input_file` and can be directed at sensitive system files (e.g., `/etc/passwd`, `~/.ssh/id_rsa`). The use of `file://{input_file}` in `crawler.arun` additionally lets the headless browser access and render local files.
2. **Write arbitrary files**: All three scripts write output to user-specified paths (`output_file` or `output_dir`). This could be exploited to overwrite critical system files, create malicious executables, or fill up disk space, leading to denial of service or further compromise.

Remediation:
1. **Path sanitization**: Strictly validate and sanitize all file paths provided as arguments. Restrict paths to a specific, non-sensitive directory (e.g., a temporary sandbox directory) and prevent directory traversal (`../`).
2. **Principle of least privilege**: Limit the skill's file-system access to only what is absolutely necessary.
3. **LLM orchestration layer**: The LLM orchestrating the skill should validate and sanitize any user-provided file paths before passing them to the skill.

LOW: Unpinned dependency `beautifulsoup4`

The `scripts/extract_from_html.py` script conditionally imports `bs4` (BeautifulSoup4) for CSS-selector and image extraction. Although it handles `ImportError`, there is no `requirements.txt` or explicit version pin for `beautifulsoup4`. This introduces a supply-chain risk: a future release of `beautifulsoup4` could introduce vulnerabilities or breaking changes.

Remediation:
1. **Pin dependencies**: Ship a `requirements.txt` with pinned versions for all external dependencies (e.g., `beautifulsoup4==4.12.2`).
2. **Dependency scanning**: Regularly scan dependencies for known vulnerabilities.
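For the LOW finding, the missing pin is a one-line fix: ship a `requirements.txt` alongside the scripts with exact versions. The version below is the example given in the finding; confirm the current known-good release before adopting it.

```text
# requirements.txt — pin exact, known-good versions for all external dependencies
beautifulsoup4==4.12.2
```

Pinned files can then be checked with a vulnerability scanner such as `pip-audit -r requirements.txt` as part of CI.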
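The strict-input-validation remediation for the CRITICAL finding can be sketched as an allowlist guard: only pre-registered JavaScript snippets, never free-form strings from the LLM, ever reach the crawler. This is a minimal illustration, not part of crawl4ai itself; the `ALLOWED_SNIPPETS` registry and `safe_js_code` helper are hypothetical names.

```python
# Hypothetical allowlist guard: an LLM may pick a snippet *name*,
# but only vetted, hardcoded JavaScript is ever forwarded to the crawler.
ALLOWED_SNIPPETS = {
    "scroll_to_bottom": "window.scrollTo(0, document.body.scrollHeight);",
    "click_load_more": "document.querySelector('.load-more')?.click();",
}

def safe_js_code(snippet_name: str) -> str:
    """Resolve a snippet name to vetted JavaScript, or raise on anything else."""
    try:
        return ALLOWED_SNIPPETS[snippet_name]
    except KeyError:
        raise ValueError(f"js_code snippet not in allowlist: {snippet_name!r}")

# The returned string would then be passed as js_code to AsyncWebCrawler.arun
# in place of any attacker-influenced value.
```

The point of the indirection is that prompt injection can at worst select a different vetted snippet, not introduce new JavaScript.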
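The path-sanitization remediation for the HIGH finding can be sketched with `pathlib`: resolve every user-supplied path under a fixed sandbox root and reject anything that escapes it after resolution. The `SANDBOX` location and `resolve_in_sandbox` helper are hypothetical names for illustration.

```python
from pathlib import Path

# Hypothetical allowed root for all input_file / output_file / output_dir arguments.
SANDBOX = Path("/tmp/crawl4ai_sandbox")

def resolve_in_sandbox(user_path: str) -> Path:
    """Resolve a user-supplied path and reject anything escaping SANDBOX."""
    candidate = (SANDBOX / user_path).resolve()
    # Resolving first, then checking containment, catches '../' traversal
    # as well as absolute paths (Python 3.9+ for Path.is_relative_to).
    if not candidate.is_relative_to(SANDBOX.resolve()):
        raise ValueError(f"path escapes sandbox: {user_path!r}")
    return candidate
```

Checking containment only after `resolve()` matters: a naive string prefix check passes `crawl4ai_sandbox/../../etc/passwd` while this version rejects it.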
[View the full report](https://skillshield.io/report/66ae1b569b3be313)