- Step 9.3 now spawns gsd-entity-generator subagent instead of Task batches
- Subagent receives file paths only (preserves orchestrator context)
- Subagent reads files in fresh 200k context, writes entities to disk
- Returns statistics only (not entity contents)
- Updated context section to document subagent execution model
- Fixed slug examples to use single hyphen format (matches hook)
- Simplified Steps 9.4-9.5 for new flow
| name | description | argument-hint | allowed-tools |
|---|---|---|---|
| gsd:analyze-codebase | Scan existing codebase and populate .planning/intel/ with file index, conventions, and semantic entity files | | |
Works standalone (without /gsd:new-project) for brownfield codebases. Creates summary.md for context injection at session start. Generates entity files that capture file PURPOSE (what it does, why it exists), not just syntax.
Output: .planning/intel/index.json, conventions.json, summary.md, entities/*.md
This command performs bulk codebase scanning to bootstrap the Codebase Intelligence system.

Use for:
- Brownfield projects before /gsd:new-project
- Refreshing intel after major changes
- Standalone intel without full project setup
After initial scan, the PostToolUse hook (hooks/intel-index.js) maintains incremental updates.
Execution model (Step 9 - Entity Generation):
- Orchestrator selects files for entity generation (up to 50 based on priority)
- Spawns gsd-entity-generator subagent with file list (paths only, not contents)
- Subagent reads files in fresh 200k context, generates entities, writes to disk
- PostToolUse hook automatically syncs entities to graph.db
- Subagent returns statistics only (not entity contents)
- This preserves orchestrator context for large codebases (500+ files)
- Users can skip Step 9 if they only want the index (faster)
Step 1: Create directory structure
mkdir -p .planning/intel
Step 2: Find all indexable files
Use Glob tool with pattern: **/*.{js,ts,jsx,tsx,mjs,cjs}
Exclude directories (skip any path containing):
- node_modules
- dist
- build
- .git
- vendor
- coverage
- .next
- `__pycache__`
Filter results to remove excluded paths before processing.
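The exclusion filter can be sketched as follows. This is a minimal illustration, not the command's actual implementation: the helper names `is_excluded` and `filter_indexable` are hypothetical, and matching on whole path segments (rather than any substring) is an assumption made so that a file such as `src/build.js` is not dropped by the `build` rule.

```python
from pathlib import PurePosixPath

# Directory names taken from the exclusion list above.
EXCLUDED_DIRS = {
    "node_modules", "dist", "build", ".git",
    "vendor", "coverage", ".next", "__pycache__",
}


def is_excluded(path: str) -> bool:
    """Return True if any segment of the path is an excluded directory name."""
    return any(part in EXCLUDED_DIRS for part in PurePosixPath(path).parts)


def filter_indexable(paths: list[str]) -> list[str]:
    """Keep only the paths that do not pass through an excluded directory."""
    return [p for p in paths if not is_excluded(p)]
```

Comparing whole segments keeps the check cheap and avoids false positives on file names that merely contain an excluded word.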
Step 3: Process each file
Initialize the index structure:
{
version: 1,
updated: Date.now(),
files: {}
}
For each file found:
- Read file content using Read tool
- Extract exports using these patterns:
  - Named exports: `export\s*\{([^}]+)\}`
  - Declaration exports: `export\s+(?:const|let|var|function\*?|async\s+function|class)\s+(\w+)`
  - Default exports: `export\s+default\s+(?:function\s*\*?\s*|class\s+)?(\w+)?`
  - CommonJS object: `module\.exports\s*=\s*\{([^}]+)\}`
  - CommonJS single: `module\.exports\s*=\s*(\w+)\s*[;\n]`
  - TypeScript: `export\s+(?:type|interface)\s+(\w+)`
- Extract imports using these patterns:
  - ES6: `import\s+(?:\{[^}]*\}|\*\s+as\s+\w+|\w+)\s+from\s+['"]([^'"]+)['"]`
  - Side-effect: `import\s+['"]([^'"]+)['"]` (not preceded by 'from')
  - CommonJS: `require\s*\(\s*['"]([^'"]+)['"]\s*\)`
- Store in index:

index.files[absolutePath] = {
  exports: [],  // Array of export names
  imports: [],  // Array of import sources
  indexed: Date.now()
}
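As a rough sketch of how the extraction patterns above might be applied, the following Python covers a subset of them (declaration, TypeScript, and named exports; ES6 and CommonJS imports). The function names `extract_exports` and `extract_imports` are hypothetical, and resolving `a as b` to the exported alias `b` is an assumption, since the source does not say how aliases are recorded.

```python
import re

# Subset of the export patterns listed in Step 3.
DECLARATION_EXPORT = re.compile(
    r"export\s+(?:const|let|var|function\*?|async\s+function|class)\s+(\w+)"
)
TYPE_EXPORT = re.compile(r"export\s+(?:type|interface)\s+(\w+)")
NAMED_EXPORT = re.compile(r"export\s*\{([^}]+)\}")

# Subset of the import patterns listed in Step 3.
ES6_IMPORT = re.compile(
    r"""import\s+(?:\{[^}]*\}|\*\s+as\s+\w+|\w+)\s+from\s+['"]([^'"]+)['"]"""
)
COMMONJS_IMPORT = re.compile(r"""require\s*\(\s*['"]([^'"]+)['"]\s*\)""")


def extract_exports(source: str) -> list[str]:
    """Collect export names: declarations first, then types, then named lists."""
    names = DECLARATION_EXPORT.findall(source) + TYPE_EXPORT.findall(source)
    for group in NAMED_EXPORT.findall(source):
        for item in group.split(","):
            item = item.strip()
            if item:
                # Assumption: for "local as alias", record the alias.
                names.append(item.split()[-1])
    return names


def extract_imports(source: str) -> list[str]:
    """Collect import sources from ES6 `import ... from` and `require(...)`."""
    return ES6_IMPORT.findall(source) + COMMONJS_IMPORT.findall(source)
```

Regex extraction of this kind is approximate by design: it will also match text inside comments and strings, which is usually acceptable for a bulk index.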
Step 4: Detect conventions
Analyze the collected index for patterns.
Naming conventions (require 5+ exports, 70%+ match rate):
- camelCase: `^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)+$` or single lowercase `^[a-z][a-z0-9]*$`
- PascalCase: `^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)*$` or single `^[A-Z][a-z0-9]+$`
- snake_case: `^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$`
- SCREAMING_SNAKE: `^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+$` or single `^[A-Z][A-Z0-9]*$`
- Skip 'default' when counting (it's a keyword, not a naming convention)
Directory patterns (use lookup table):
components -> UI components
hooks -> React/custom hooks
utils, lib -> Utility functions
services -> Service layer
api, routes -> API endpoints
types -> TypeScript types
models -> Data models
tests, __tests__, test, spec -> Test files
controllers -> Controllers
middleware -> Middleware
config -> Configuration
constants -> Constants
pages -> Page components
views -> View templates
Suffix patterns (require 5+ occurrences):
.test.*, .spec.* -> Test files
.service.* -> Service layer
.controller.* -> Controllers
.model.* -> Data models
.util.*, .utils.* -> Utility functions
.helper.*, .helpers.* -> Helper functions
.config.* -> Configuration
.types.*, .type.* -> TypeScript types
.hook.*, .hooks.* -> React/custom hooks
.context.* -> React context
.store.* -> State store
.slice.* -> Redux slice
.reducer.* -> Redux reducer
.action.*, .actions.* -> Redux actions
.api.* -> API layer
.route.*, .routes.* -> Route definitions
.middleware.* -> Middleware
.schema.* -> Schema definitions
.mock.*, .mocks.* -> Mock data
.fixture.*, .fixtures.* -> Test fixtures
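Suffix detection with the 5+ occurrence threshold could be sketched as below. `SUFFIX_PURPOSES` holds only a few entries from the table above for brevity, and `detect_suffixes` is a hypothetical helper name. Keying the result by the full suffix (for example `.test.js`) is an assumption based on the conventions.json example in Step 6.

```python
from collections import Counter
from pathlib import PurePosixPath

# A few entries from the suffix table above; extend as needed.
SUFFIX_PURPOSES = {
    ".test": "Test files",
    ".spec": "Test files",
    ".service": "Service layer",
    ".controller": "Controllers",
}


def detect_suffixes(paths, min_count=5):
    """Count known compound suffixes and keep those seen at least min_count times."""
    counts = Counter()
    for path in paths:
        suffixes = PurePosixPath(path).suffixes  # e.g. ['.test', '.js']
        if len(suffixes) >= 2 and suffixes[-2] in SUFFIX_PURPOSES:
            counts["".join(suffixes[-2:])] += 1
    return {
        suffix: {
            # Strip the file extension to look up the purpose ('.test.js' -> '.test').
            "purpose": SUFFIX_PURPOSES[suffix.rsplit(".", 1)[0]],
            "count": count,
        }
        for suffix, count in counts.items()
        if count >= min_count
    }
```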
Step 5: Write index.json
Write to .planning/intel/index.json:
{
"version": 1,
"updated": 1737360330000,
"files": {
"/absolute/path/to/file.js": {
"exports": ["functionA", "ClassB"],
"imports": ["react", "./utils"],
"indexed": 1737360330000
}
}
}
Step 6: Write conventions.json
Write to .planning/intel/conventions.json:
{
"version": 1,
"updated": 1737360330000,
"naming": {
"exports": {
"dominant": "camelCase",
"count": 42,
"percentage": 85
}
},
"directories": {
"components": { "purpose": "UI components", "files": 15 },
"hooks": { "purpose": "React/custom hooks", "files": 8 }
},
"suffixes": {
".test.js": { "purpose": "Test files", "count": 12 }
}
}
Step 7: Generate summary.md
Write to .planning/intel/summary.md:
# Codebase Intelligence Summary
Last updated: [ISO timestamp]
Indexed files: [N]
## Naming Conventions
- Export naming: [case] ([percentage]% of [count] exports)
## Key Directories
- `[dir]/`: [purpose] ([N] files)
- ... (top 5)
## File Patterns
- `*[suffix]`: [purpose] ([count] files)
- ... (top 3)
Total exports: [N]
Target: < 500 tokens. Keep concise for context injection.
Step 8: Report completion
Display summary statistics:
Codebase Analysis Complete
Files indexed: [N]
Exports found: [N]
Imports found: [N]
Conventions detected:
- Naming: [dominant case] ([percentage]%)
- Directories: [list]
- Patterns: [list]
Files created:
- .planning/intel/index.json
- .planning/intel/conventions.json
- .planning/intel/summary.md
Step 9: Generate semantic entities (optional)
Generate entity files that capture semantic understanding of key files. These provide PURPOSE, not just syntax.
Skip this step if: User only wants the index, or codebase has < 10 files.
9.1 Create entities directory
mkdir -p .planning/intel/entities
9.2 Select files for entity generation
Select up to 50 files based on these criteria (in priority order):
- High-export files: 3+ exports (likely core modules)
- Hub files: Referenced by 5+ other files (via imports analysis)
- Key directories: Entry points (index.js, main.js, app.js), config files
- Structural files: Files matching convention patterns (services, controllers, models)
From the index.json, identify candidates and limit to 50 files maximum per run.
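One way to express the priority ordering is a tier function followed by a sort, as in the sketch below. The names `priority_tier` and `select_files` are hypothetical, as is the `referenced_by` map (a precomputed count of how many files import each path). Treating any file whose name contains `.config.` as a config file, and breaking ties alphabetically, are assumptions the source does not specify.

```python
from pathlib import PurePosixPath

ENTRY_POINTS = {"index.js", "main.js", "app.js"}
STRUCTURAL_DIRS = {"services", "controllers", "models"}


def priority_tier(path, entry, referenced_by):
    """Return 0-3 for the first matching criterion, or None if none match."""
    p = PurePosixPath(path)
    if len(entry["exports"]) >= 3:          # high-export files
        return 0
    if referenced_by.get(path, 0) >= 5:     # hub files
        return 1
    if p.name in ENTRY_POINTS or ".config." in p.name:  # entry points, config
        return 2
    if STRUCTURAL_DIRS & set(p.parts):      # structural directories
        return 3
    return None


def select_files(files, referenced_by, limit=50):
    """Rank candidate files by tier (then path) and cap the list at `limit`."""
    ranked = []
    for path, entry in files.items():
        tier = priority_tier(path, entry, referenced_by)
        if tier is not None:
            ranked.append((tier, path))
    ranked.sort()
    return [path for _, path in ranked[:limit]]
```

Here `files` has the same shape as the `files` object in index.json, so the function can be fed the loaded index directly.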
9.3 Spawn entity generator subagent
Spawn gsd-entity-generator with the selected file list.
Pass to subagent:
- Total file count
- Output directory: .planning/intel/entities/
- Slug convention: src/lib/db.ts -> src-lib-db (replace / with -, remove extension, lowercase)
- Entity template (include full template from agent definition)
- List of absolute file paths (one per line)
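The slug rules given in this step (remove the leading `/`, remove the file extension, replace `/` and `.` with `-`, lowercase) can be sketched as a small function. `path_to_slug` is a hypothetical name, and this is only an illustration of the convention; the PostToolUse hook remains the reference for the exact format.

```python
import posixpath
import re


def path_to_slug(path: str) -> str:
    """Convert a file path to an entity slug, e.g. src/lib/db.ts -> src-lib-db."""
    trimmed = path.lstrip("/")                 # remove leading /
    stem, _ext = posixpath.splitext(trimmed)   # remove the final file extension
    return re.sub(r"[/.]", "-", stem).lower()  # replace / and . with -, lowercase
```

Each separator maps to exactly one hyphen, which matches the single-hyphen format of the `src-lib-db` example.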
Task tool invocation:
from datetime import date

# Build file list (one absolute path per line)
file_list = "\n".join(selected_files)
today = date.today().isoformat()
Task(
prompt=f"""Generate semantic entity documentation for key codebase files.
You are a GSD entity generator. Read source files and create semantic documentation that captures PURPOSE (what/why), not just syntax.
**Parameters:**
- Files to process: {len(selected_files)}
- Output directory: .planning/intel/entities/
- Date: {today}
**Slug convention:**
- Remove leading /
- Remove file extension
- Replace / and . with -
- Lowercase everything
- Example: src/lib/db.ts -> src-lib-db
**Entity template:**
```markdown
---
path: {{absolute_path}}
type: [module|component|util|config|api|hook|service|model|test]
updated: {today}
status: active
---
# {{filename}}
## Purpose
[1-3 sentences: What does this file do? Why does it exist? What problem does it solve?]
## Exports
- `functionName(params): ReturnType` - Brief description
- `ClassName` - What this class represents
If no exports: "None"
## Dependencies
- [[internal-file-slug]] - Why needed (for internal deps)
- external-package - What it provides (for npm packages)
If no dependencies: "None"
## Used By
TBD
```

**Process:** For each file path below:
- Read file content using Read tool
- Analyze purpose, exports, dependencies
- Check if entity already exists (skip if so)
- Write entity to .planning/intel/entities/{{slug}}.md
- PostToolUse hook syncs to graph.db automatically
**Files:**
{file_list}

**Return format:** When complete, return ONLY statistics:

ENTITY GENERATION COMPLETE
Files processed: {{N}}
Entities created: {{M}}
Already existed: {{K}}
Errors: {{E}}
Entities written to: .planning/intel/entities/

Do NOT include entity contents in your response.
""",
    subagent_type="gsd-entity-generator"
)
**Wait for completion:** Task() blocks until subagent finishes.
**Parse result:** Extract entities_created count from response for final report.
### 9.4 Verify entity generation
Confirm entities were written:
```bash
ls .planning/intel/entities/*.md 2>/dev/null | wc -l
```
### 9.5 Report entity statistics
Entity Generation Complete
Entity files created: [N] (from subagent response)
Location: .planning/intel/entities/
Graph database: Updated automatically via PostToolUse hook
Next: Intel hooks will continue incremental updates as you code.
<success_criteria>
- .planning/intel/ directory created
- All JS/TS files scanned (excluding node_modules, dist, build, .git, vendor, coverage)
- index.json populated with exports and imports for each file
- conventions.json has detected patterns (naming, directories, suffixes)
- summary.md is concise (< 500 tokens)
- Statistics reported to user
- Entity files generated for key files (if Step 9 executed)
- Entity files contain Purpose section with semantic understanding
</success_criteria>