Files
get-shit-done/scripts/prompt-injection-scan.sh
Tom Boucher c35997fb0b feat(hooks): add gsd-read-injection-scanner PostToolUse hook (#2201) (#2328)
* feat: add /gsd-spec-phase — Socratic spec refinement with ambiguity scoring (#2213)

Introduces `/gsd-spec-phase <phase>` as an optional pre-step before discuss-phase.
Clarifies WHAT a phase delivers (requirements, boundaries, acceptance criteria) with
quantitative ambiguity scoring before discuss-phase handles HOW to implement.

- `commands/gsd/spec-phase.md` — slash command routing to workflow
- `get-shit-done/workflows/spec-phase.md` — full Socratic interview loop (up to 6
  rounds, 5 rotating perspectives: Researcher, Simplifier, Boundary Keeper, Failure
  Analyst, Seed Closer) with weighted 4-dimension ambiguity gate (≤ 0.20 to write SPEC.md)
- `get-shit-done/templates/spec.md` — SPEC.md template with falsifiable requirements
  (Current/Target/Acceptance per requirement), Boundaries, Acceptance Criteria,
  Ambiguity Report, and Interview Log; includes two full worked examples
- `get-shit-done/workflows/discuss-phase.md` — new `check_spec` step detects
  `{padded_phase}-SPEC.md` at startup; displays "Found SPEC.md — N requirements
  locked. Focusing on implementation decisions."; `analyze_phase` respects `spec_loaded`
  flag to skip "what/why" gray areas; `write_context` emits `<spec_lock>` section
  with boundary summary and canonical ref to SPEC.md
- `docs/ARCHITECTURE.md` — update command/workflow counts (74→75, 71→72)

Closes #2213

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(hooks): add gsd-read-injection-scanner PostToolUse hook (#2201)

Adds a new PostToolUse hook that scans content returned by the Read tool
for prompt injection patterns, including four summarisation-specific patterns
(retention-directive, permanence-claim, etc.) that survive context compression.

Defense-in-depth for long GSD sessions where the context summariser cannot
distinguish user instructions from content read from external files.

- Advisory-only (warns without blocking), consistent with gsd-prompt-guard.js
- LOW severity for 1-2 patterns, HIGH for 3+
- Inlined pattern library (hook independence)
- Exclusion list: .planning/, REVIEW.md, CHECKPOINT, security docs, hook sources
- Wired in install.js as PostToolUse matcher: Read, timeout: 5s
- Added to MANAGED_HOOKS for staleness detection
- 19 tests covering all 13 acceptance criteria (SCAN-01–07, EXCL-01–06, EDGE-01–06)

Closes #2201

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): add read-injection-scanner files to prompt-injection-scan allowlist

Test payloads in tests/read-injection-scanner.test.cjs and inlined patterns
in hooks/gsd-read-injection-scanner.js legitimately contain injection strings.
Add both to the CI script allowlist to prevent false-positive failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): assert exitCode, stdout, and signal explicitly in EDGE-05

Addresses CodeRabbit feedback: the success path discarded the return
value so a malformed-JSON input that produced stdout would still pass.
Now captures and asserts all three observable properties.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 17:22:31 -04:00

202 lines
6.6 KiB
Bash
Executable File

#!/usr/bin/env bash
# prompt-injection-scan.sh — Scan files for prompt injection patterns
#
# Usage:
# scripts/prompt-injection-scan.sh --diff origin/main # CI mode: scan changed .md files
# scripts/prompt-injection-scan.sh --file path/to/file # Scan a single file
# scripts/prompt-injection-scan.sh --dir agents/ # Scan all files in a directory
#
# Exit codes:
# 0 = clean
# 1 = findings detected
# 2 = usage error
set -euo pipefail
# ─── Patterns ────────────────────────────────────────────────────────────────
# Each pattern is a POSIX extended regex. Keep alphabetized by category.
PATTERNS=(
# Instruction override
'ignore[[:space:]]+(all[[:space:]]+)?(previous|prior|above|earlier|preceding)[[:space:]]+(instructions|prompts|rules|directives|context)'
'disregard[[:space:]]+(all[[:space:]]+)?(previous|prior|above)[[:space:]]+(instructions|prompts|rules)'
'forget[[:space:]]+(all[[:space:]]+)?(previous|prior|above)[[:space:]]+(instructions|prompts|rules|context)'
'override[[:space:]]+(all[[:space:]]+)?(system|previous|safety)[[:space:]]+(instructions|prompts|rules|checks|filters|guards)'
'override[[:space:]]+(system|safety|security)[[:space:]]'
# Role manipulation
'you[[:space:]]+are[[:space:]]+now[[:space:]]+(a|an|my)[[:space:]]'
'from[[:space:]]+now[[:space:]]+on[[:space:]]+(you|pretend|act|behave)'
'pretend[[:space:]]+(you[[:space:]]+are|to[[:space:]]+be)[[:space:]]'
'act[[:space:]]+as[[:space:]]+(a|an|if|my)[[:space:]]'
'roleplay[[:space:]]+as[[:space:]]'
'assume[[:space:]]+the[[:space:]]+role[[:space:]]+of[[:space:]]'
# System prompt extraction
'output[[:space:]]+(your|the)[[:space:]]+(system[[:space:]]+)?(prompt|instructions)'
'reveal[[:space:]]+(your|the)[[:space:]]+(system[[:space:]]+)?(prompt|instructions)'
'show[[:space:]]+me[[:space:]]+(your|the)[[:space:]]+(system[[:space:]]+)?(prompt|instructions)'
'print[[:space:]]+(your|the)[[:space:]]+(system[[:space:]]+)?(prompt|instructions)'
'what[[:space:]]+(is|are)[[:space:]]+(your|the)[[:space:]]+(system[[:space:]]+)?(prompt|instructions)'
'repeat[[:space:]]+(your|the|all)[[:space:]]+(system[[:space:]]+)?(prompt|instructions|rules)'
# Fake message boundaries
'</?system>'
'</?assistant>'
'</?human>'
'\[SYSTEM\]'
'\[/SYSTEM\]'
'\[INST\]'
'\[/INST\]'
'<<SYS>>'
'<</SYS>>'
# Tool call injection / code execution in markdown
'eval[[:space:]]*\([[:space:]]*["\x27]'
'exec[[:space:]]*\([[:space:]]*["\x27]'
'Function[[:space:]]*\([[:space:]]*["\x27].*return'
# Jailbreak / DAN patterns
'do[[:space:]]+anything[[:space:]]+now'
'DAN[[:space:]]+mode'
'developer[[:space:]]+mode[[:space:]]+(enabled|output|activated)'
'jailbreak'
'bypass[[:space:]]+(safety|content|security)[[:space:]]+(filter|check|rule|guard)'
)
# ─── Allowlist ───────────────────────────────────────────────────────────────
# Files that legitimately discuss injection patterns (security docs, tests, this script)
ALLOWLIST=(
'scripts/prompt-injection-scan.sh'
'scripts/base64-scan.sh'
'scripts/secret-scan.sh'
'tests/security-scan.test.cjs'
'tests/security.test.cjs'
'tests/prompt-injection-scan.test.cjs'
'tests/verify.test.cjs'
'get-shit-done/bin/lib/security.cjs'
'hooks/gsd-prompt-guard.js'
'hooks/gsd-read-injection-scanner.js'
'tests/read-injection-scanner.test.cjs'
'SECURITY.md'
)
is_allowlisted() {
local file="$1"
for allowed in "${ALLOWLIST[@]}"; do
if [[ "$file" == *"$allowed" ]]; then
return 0
fi
done
return 1
}
# ─── File Collection ─────────────────────────────────────────────────────────
collect_files() {
local mode="$1"
shift
case "$mode" in
--diff)
local base="${1:-origin/main}"
# Get changed files in the diff, filter to scannable extensions
git diff --name-only --diff-filter=ACMR "$base"...HEAD 2>/dev/null \
| grep -E '\.(md|cjs|js|json|yml|yaml|sh)$' || true
;;
--file)
if [[ -f "$1" ]]; then
echo "$1"
else
echo "Error: file not found: $1" >&2
exit 2
fi
;;
--dir)
local dir="$1"
if [[ ! -d "$dir" ]]; then
echo "Error: directory not found: $dir" >&2
exit 2
fi
find "$dir" -type f \( -name '*.md' -o -name '*.cjs' -o -name '*.js' -o -name '*.json' -o -name '*.yml' -o -name '*.yaml' -o -name '*.sh' \) \
! -path '*/node_modules/*' ! -path '*/.git/*' ! -path '*/dist/*' 2>/dev/null || true
;;
--stdin)
cat
;;
*)
echo "Usage: $0 --diff [base] | --file <path> | --dir <path> | --stdin" >&2
exit 2
;;
esac
}
# ─── Scanner ─────────────────────────────────────────────────────────────────
scan_file() {
local file="$1"
local found=0
if is_allowlisted "$file"; then
return 0
fi
for pattern in "${PATTERNS[@]}"; do
# Use grep -iE for case-insensitive extended regex
# -n for line numbers, -c for count mode first to check
local matches
matches=$(grep -inE -e "$pattern" "$file" 2>/dev/null || true)
if [[ -n "$matches" ]]; then
if [[ $found -eq 0 ]]; then
echo "FAIL: $file"
found=1
fi
echo "$matches" | while IFS= read -r line; do
echo " $line"
done
fi
done
return $found
}
# ─── Main ────────────────────────────────────────────────────────────────────
main() {
if [[ $# -eq 0 ]]; then
echo "Usage: $0 --diff [base] | --file <path> | --dir <path>" >&2
exit 2
fi
local mode="$1"
shift
local files
files=$(collect_files "$mode" "$@")
if [[ -z "$files" ]]; then
echo "prompt-injection-scan: no files to scan"
exit 0
fi
local total=0
local failed=0
while IFS= read -r file; do
[[ -z "$file" ]] && continue
total=$((total + 1))
if ! scan_file "$file"; then
failed=$((failed + 1))
fi
done <<< "$files"
echo ""
echo "prompt-injection-scan: scanned $total files, $failed with findings"
if [[ $failed -gt 0 ]]; then
exit 1
fi
exit 0
}
main "$@"