project map script | semantic parser
specs/semantic_map_design.md
# Semantic Map Generator & Validator Design

## Objective
Create a Python script (`generate_semantic_map.py`) to:
1. Scan the codebase (`backend/`, `frontend/`) for Semantic Protocol markers.
2. Generate a **Full Semantic Map** (JSON) for detailed machine processing.
3. Generate a **Compressed Project Map** (Markdown) that fits an LLM context window (~4000 tokens).
4. Generate a **Compliance Report** (Markdown) with history tracking.

## Scope
* **Languages**: Python (`.py`), Svelte (`.svelte`), JavaScript/TypeScript (`.js`, `.ts`).
* **Protocol**: Based on `semantic_protocol.md`.

## 1. Parsing Logic

The script parses files as plain text using regular expressions.

### Python Patterns
* **Anchor Start**: `r"#\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"`
* **Anchor End**: `r"#\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"`
* **Tags**: `r"#\s*@(?P<tag>[A-Z_]+):\s*(?P<value>.*)"`

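As a rough illustration of how the three Python patterns might be applied line by line, here is a minimal sketch. The sample source, the entity name `backend.auth`, and the `scan_python` helper are assumptions made for this example; they are not taken from `semantic_protocol.md`.

```python
import re

# Patterns copied from the list above.
ANCHOR_START = re.compile(r"#\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]")
ANCHOR_END = re.compile(r"#\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]")
TAG = re.compile(r"#\s*@(?P<tag>[A-Z_]+):\s*(?P<value>.*)")

# Hypothetical annotated source used only for this example.
SAMPLE = """\
# [DEF:backend.auth:Module]
# @PURPOSE: Handles user login.
# @LAYER: Service
def login():
    pass
# [/DEF:backend.auth:Module]
"""


def scan_python(source):
    """Return (line_number, kind, fields) for every marker found in the text."""
    events = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in (("start", ANCHOR_START), ("end", ANCHOR_END), ("tag", TAG)):
            match = pattern.search(line)
            if match:
                events.append((lineno, kind, match.groupdict()))
                break  # at most one marker per line in this sketch
    return events


events = scan_python(SAMPLE)
for event in events:
    print(event)
```

Recording the line number with each event is what later allows the Full Map to carry line ranges and the validator to report where an anchor was left open.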
### Svelte/JS Patterns
* **HTML Anchor Start**: `r"<!--\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]\s*-->"`
* **HTML Anchor End**: `r"<!--\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]\s*-->"`
* **JS Anchor Start**: `r"//\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"`
* **JS Anchor End**: `r"//\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"`
* **HTML Tags**: `r"@(?P<tag>[A-Z_]+):\s*(?P<value>.*)"` (inside comments)
* **JSDoc Tags**: `r"\*\s*@(?P<tag>[a-z]+)\s+(?P<value>.*)"`
* **Relation Tag**: `r"//\s*@RELATION:\s*(?P<type>\w+)\s*->\s*(?P<target>.*)"`

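A Svelte file can mix HTML-style and JS-style markers, so both anchor patterns need to run over the same text. The sketch below shows one possible way to do that and to extract `@RELATION` tags. The sample component, the `Function` entity type, and the helper names `anchor_starts` and `find_relations` are assumptions for illustration only.

```python
import re

# Patterns copied from the list above.
HTML_ANCHOR_START = re.compile(r"<!--\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]\s*-->")
JS_ANCHOR_START = re.compile(r"//\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]")
RELATION = re.compile(r"//\s*@RELATION:\s*(?P<type>\w+)\s*->\s*(?P<target>.*)")

# Hypothetical Svelte component used only for this example.
SVELTE_SAMPLE = """\
<!-- [DEF:LoginForm:Component] -->
<script>
  // [DEF:submit:Function]
  // @RELATION: DEPENDS_ON -> backend.auth
  function submit() {}
  // [/DEF:submit:Function]
</script>
<!-- [/DEF:LoginForm:Component] -->
"""


def anchor_starts(source):
    """Collect (name, type) pairs from both HTML and JS start anchors."""
    found = []
    for pattern in (HTML_ANCHOR_START, JS_ANCHOR_START):
        found += [(m["name"], m["type"]) for m in pattern.finditer(source)]
    return found


def find_relations(source):
    """Return one dict per @RELATION tag with its type and target."""
    return [m.groupdict() for m in RELATION.finditer(source)]


print(anchor_starts(SVELTE_SAMPLE))
print(find_relations(SVELTE_SAMPLE))
```

Because `.` does not match a newline by default, the trailing `.*` in the relation pattern stops at the end of the comment line, so the target is captured without further trimming in this sample.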
## 2. Output Files & Structure

### A. Full Map (`semantics/semantic_map.json`)
Detailed hierarchical JSON containing all metadata, line numbers, and relations.
```json
{
  "project_root": "/home/user/ss-tools",
  "generated_at": "ISO-DATE",
  "modules": [ ... ]
}
```

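The top-level keys above could be assembled roughly as follows. The design does not specify the shape of each entry in `modules`, so every field inside `example_module` is a hypothetical placeholder, as is the `build_full_map` helper.

```python
import json
from datetime import datetime, timezone


def build_full_map(project_root, modules):
    """Assemble the three top-level keys named in the design."""
    return {
        "project_root": project_root,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "modules": modules,
    }


# Hypothetical module entry: the field names here are assumptions.
example_module = {
    "name": "backend.auth",
    "type": "Module",
    "file": "backend/auth.py",
    "start_line": 1,
    "end_line": 6,
    "tags": {"PURPOSE": "Handles user login.", "LAYER": "Service"},
    "relations": [],
    "children": [],
}

full_map = build_full_map("/home/user/ss-tools", [example_module])
serialized = json.dumps(full_map, indent=2)
print(serialized)
```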
### B. Compressed Map (`specs/project_map.md`)
Optimized for "Full Attention" (max ~4000 tokens).
* **Format**: Markdown Tree.
* **Content**:
    * List of high-level Modules/Components.
    * Only critical tags: `@PURPOSE`, `@LAYER`.
    * List of public Functions/Actions (names only, or brief summary).
    * Key Relations (`DEPENDS_ON`).
    * *Omit*: Internal implementation details, line numbers, minor tags.

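One way to keep the compressed map near the token budget is to estimate tokens from character count and stop adding modules once the estimate is exceeded. The sketch below assumes roughly four characters per token, which is only a heuristic, and uses a hypothetical module dictionary shape (`name`, `type`, `tags`, `functions`, `relations`).

```python
def estimate_tokens(text):
    """Rough heuristic (assumption): about four characters per token."""
    return len(text) // 4


def render_project_map(modules, max_tokens=4000):
    """Render a Markdown tree, keeping only @PURPOSE, @LAYER, names, DEPENDS_ON."""
    lines = ["# Project Map"]
    for module in modules:
        block = [f"- **{module['name']}** ({module['type']})"]
        for tag in ("PURPOSE", "LAYER"):
            if tag in module["tags"]:
                block.append(f"  - @{tag}: {module['tags'][tag]}")
        for func in module.get("functions", []):
            block.append(f"  - fn: {func}")
        for rel in module.get("relations", []):
            if rel["type"] == "DEPENDS_ON":
                block.append(f"  - DEPENDS_ON -> {rel['target']}")
        # Stop before the estimated size passes the budget.
        if estimate_tokens("\n".join(lines + block)) > max_tokens:
            lines.append("- ... (truncated to fit token budget)")
            break
        lines.extend(block)
    return "\n".join(lines)


# Hypothetical input; the AUTHOR tag stands in for a "minor tag" to be omitted.
sample_modules = [{
    "name": "backend.auth",
    "type": "Module",
    "tags": {"PURPOSE": "Handles user login.", "LAYER": "Service", "AUTHOR": "x"},
    "functions": ["login"],
    "relations": [{"type": "DEPENDS_ON", "target": "backend.db"}],
}]

project_map = render_project_map(sample_modules)
print(project_map)
```

A real tokenizer would give a tighter bound; the character heuristic is used here only to keep the sketch free of third-party dependencies.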
### C. Compliance Report (`semantics/reports/semantic_report_YYYYMMDD_HHMMSS.md`)
* **Metrics**: Global Coverage %, Module-level scores.
* **Errors**: List of broken anchors or missing mandatory tags.
* **History**: Each run writes a new timestamped file, so earlier reports remain available for history tracking.

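The timestamped filename can be produced with `strftime`. This is a small sketch; the `report_path` helper and its `reports_dir` parameter are names invented for the example.

```python
from datetime import datetime
from pathlib import Path


def report_path(now, reports_dir="semantics/reports"):
    """Build semantic_report_YYYYMMDD_HHMMSS.md under the reports directory."""
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(reports_dir) / f"semantic_report_{stamp}.md"


path = report_path(datetime(2024, 1, 31, 9, 5, 7))
print(path.as_posix())
```

Since the timestamp is zero-padded and ordered from year down to second, sorting the filenames alphabetically also sorts the reports chronologically.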
## 3. Validation Rules (Scoring)

### Critical Errors (0% score for entity)
1. **Unmatched Anchors**: Start tag without End tag.

### Metadata Warnings (reduce score)
1. **Missing Mandatory Tags**: `Module`/`Component` needs `@PURPOSE`, `@LAYER`.

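The two rule classes could be combined into a per-entity score along these lines. The design says missing tags reduce the score but does not say by how much, so the equal-share penalty below is an assumption, as are the `score_entity` name and its signature.

```python
# Mandatory tags per entity type, as stated in the rules above.
MANDATORY_TAGS = {
    "Module": ("PURPOSE", "LAYER"),
    "Component": ("PURPOSE", "LAYER"),
}


def score_entity(entity_type, tags, has_end_anchor):
    """Return (score, issues) with score in the range 0.0 to 1.0."""
    # Critical error: an unmatched anchor zeroes the entity's score.
    if not has_end_anchor:
        return 0.0, ["Unmatched anchor: start tag without end tag"]
    required = MANDATORY_TAGS.get(entity_type, ())
    missing = [tag for tag in required if tag not in tags]
    issues = [f"Missing mandatory tag @{tag}" for tag in missing]
    if not required:
        return 1.0, issues
    # Assumption: each missing mandatory tag removes an equal share of the score.
    return 1.0 - len(missing) / len(required), issues


print(score_entity("Module", {"PURPOSE": "Handles user login."}, True))
```

A global coverage figure could then be derived by averaging the per-entity scores, though the design leaves the exact aggregation open.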
## 4. Implementation Plan

1. **Setup**: Create `generate_semantic_map.py` and directories (`semantics/`, `semantics/reports/`).
2. **Parser**: Implement Regex patterns.
3. **Walker**: Recursive file walk ignoring standard ignore patterns.
4. **Generators**:
    * `MapGenerator`: JSON dump.
    * `ContextGenerator`: Markdown tree builder with token awareness (heuristic).
    * `ReportGenerator`: Scoring and markdown formatting.
5. **Execution**: Run the script.

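The walker step might look like the sketch below, built on `os.walk`. The plan does not list the "standard ignore patterns", so the `IGNORED_DIRS` set is an assumed example; the extensions come from the Scope section.

```python
import os

# Assumption: a typical set of directories to skip; the design does not list them.
IGNORED_DIRS = {".git", "node_modules", "__pycache__", ".venv", "dist", "build"}
# Extensions taken from the Scope section.
EXTENSIONS = {".py", ".svelte", ".js", ".ts"}


def iter_source_files(roots):
    """Yield paths of in-scope files under each root, skipping ignored directories."""
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Pruning dirnames in place stops os.walk from descending into them.
            dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
            for filename in sorted(filenames):
                if os.path.splitext(filename)[1] in EXTENSIONS:
                    yield os.path.join(dirpath, filename)


if __name__ == "__main__":
    for path in iter_source_files(["backend", "frontend"]):
        print(path)
```

Sorting directory and file names makes the traversal order deterministic, which keeps the generated maps stable between runs on an unchanged tree.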