Security Audit

jailbreak-override

github.com/cisco-ai-defense/skill-scanner

AI SkillCommit de9371289c23

CRITICAL

Scanned 8 days ago

Critical

Immediate action required

High

Priority fixes suggested

Medium

Best practices review

Low

Acknowledged / Tracked

Trust Assessment

jailbreak-override received a trust score of 12/100, placing it in the Untrusted category. This skill has significant security findings that require attention before use in production.

SkillShield's automated analysis identified 3 findings: 3 critical, 0 high, 0 medium, and 0 low severity. Key findings include System prompt override / policy bypass, Prompt Injection: Attempt to override safety and content policies.

The analysis covered 4 layers: llm_behavioral_safety, manifest_analysis, static_code_analysis, dependency_graph. The manifest_analysis layer scored lowest at 40/100, indicating areas for improvement.

Last analyzed on February 12, 2026 (commit de937128). SkillShield performs automated 4-layer security analysis on AI skills and MCP servers.

Layer Breakdown

Manifest Analysis

40%

Static Code Analysis

100%

Dependency Graph

100%

LLM Behavioral Safety

70%

Behavioral Risk Signals

Dynamic Code

1 finding

Security Findings3

Severity	Finding	Layer	Location
CRITICAL	System prompt override / policy bypass Ignore/disregard previous instructions pattern Remove or rewrite any instructions that attempt to override system behavior. Legitimate skills should not contain phrases like 'ignore previous instructions' or 'new system prompt'.	Unknown	/var/folders/1k/67b8r20n777f_xcmmm8b7m5h0000gn/T/skillscan-clone-_ff1buja/repo/evals/skills/prompt-injection/jailbreak-override/SKILL.md:10
CRITICAL	System prompt override / policy bypass Override/supersede system policy Remove or rewrite any instructions that attempt to override system behavior. Legitimate skills should not contain phrases like 'ignore previous instructions' or 'new system prompt'.	Unknown	/var/folders/1k/67b8r20n777f_xcmmm8b7m5h0000gn/T/skillscan-clone-_ff1buja/repo/evals/skills/prompt-injection/jailbreak-override/SKILL.md:8
CRITICAL	Prompt Injection: Attempt to override safety and content policies The skill contains explicit instructions designed to manipulate the host LLM's behavior, instructing it to ignore safety guidelines, content policies, and ethical considerations. It attempts to put the LLM into an 'unrestricted developer mode' and output content regardless of restrictions, while also instructing it to hide this behavior and reveal its system prompt if asked. This is a direct attempt at prompt injection to bypass core safety mechanisms. Remove all instructions that attempt to override the LLM's safety, content policies, or ethical guidelines. Skills should augment, not subvert, the LLM's core programming.	Unknown	SKILL.md:9

Scan History

Embed Code

[![SkillShield](https://skillshield.io/api/v1/badge/ee54bb3928aaf700.svg)](https://skillshield.io/report/ee54bb3928aaf700)