Semantic Map Generator & Validator Design

Objective

Create a Python script (generate_semantic_map.py) to:

  1. Scan the codebase (backend/, frontend/) for Semantic Protocol markers.
  2. Generate a Full Semantic Map (JSON) for detailed machine processing.
  3. Generate a Compressed Project Map (Markdown) sized to fit an LLM context window (~4000 tokens).
  4. Generate a Compliance Report (Markdown) with history tracking.

Scope

  • Languages: Python (.py), Svelte (.svelte), JavaScript/TypeScript (.js, .ts).
  • Protocol: Based on semantic_protocol.md.

1. Parsing Logic

The script parses each file as plain text using regular expressions.

Python Patterns

  • Anchor Start: r"#\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"
  • Anchor End: r"#\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"
  • Tags: r"#\s*@(?P<tag>[A-Z_]+):\s*(?P<value>.*)"
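
As a rough illustration of how the three Python patterns could work together, the sketch below pairs start and end anchors with a stack and attaches tags to the innermost open entity. The function name parse_python_source, the entity dictionary layout, and the sample module name and tag values are assumptions made for this example, not part of the protocol.

```python
import re

# Patterns copied from the list above.
ANCHOR_START = re.compile(r"#\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]")
ANCHOR_END = re.compile(r"#\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]")
TAG = re.compile(r"#\s*@(?P<tag>[A-Z_]+):\s*(?P<value>.*)")


def parse_python_source(text):
    """Return (entities, errors) for one Python file (hypothetical helper)."""
    entities = []  # every entity seen, in source order
    errors = []    # human-readable anchor problems
    stack = []     # entities whose end anchor has not been seen yet
    for lineno, line in enumerate(text.splitlines(), start=1):
        end = ANCHOR_END.search(line)
        if end:
            if stack and stack[-1]["name"] == end["name"]:
                stack.pop()["end_line"] = lineno
            else:
                errors.append(f"line {lineno}: unexpected end anchor for {end['name']}")
            continue
        start = ANCHOR_START.search(line)
        if start:
            entity = {
                "name": start["name"],
                "type": start["type"],
                "start_line": lineno,
                "end_line": None,
                "tags": {},
            }
            entities.append(entity)
            stack.append(entity)
            continue
        tag = TAG.search(line)
        if tag and stack:
            # Tags belong to the innermost entity that is still open.
            stack[-1]["tags"][tag["tag"]] = tag["value"].strip()
    for entity in stack:
        errors.append(f"line {entity['start_line']}: anchor {entity['name']} is never closed")
    return entities, errors


# Hypothetical annotated source used only to exercise the parser.
SAMPLE = """\
# [DEF:backend.auth:Module]
# @PURPOSE: Handles login and session tokens.
# @LAYER: Service
def login(): ...
# [/DEF:backend.auth:Module]
"""

if __name__ == "__main__":
    print(parse_python_source(SAMPLE))
```

Leaving end_line as None for entities that never close keeps the information the validator needs to flag unmatched anchors later.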

Svelte/JS Patterns

  • HTML Anchor Start: r"<!--\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]\s*-->"
  • HTML Anchor End: r"<!--\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]\s*-->"
  • JS Anchor Start: r"//\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"
  • JS Anchor End: r"//\s*\[/DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]"
  • HTML Tags: r"@(?P<tag>[A-Z_]+):\s*(?P<value>.*)" (inside comments)
  • JSDoc Tags: r"\*\s*@(?P<tag>[a-z]+)\s+(?P<value>.*)"
  • Relation Tag: r"//\s*@RELATION:\s*(?P<type>\w+)\s*->\s*(?P<target>.*)"
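
A Svelte file can carry both comment styles (HTML comments in markup, // comments in the script block), so one plausible approach is to try both start patterns on every line. The sketch below does that and also extracts a relation; the helper names find_anchor_starts and parse_relation and the sample entity names are assumptions for illustration.

```python
import re

# Patterns copied from the list above.
HTML_ANCHOR_START = re.compile(r"<!--\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]\s*-->")
JS_ANCHOR_START = re.compile(r"//\s*\[DEF:(?P<name>[\w\.]+):(?P<type>\w+)\]")
RELATION = re.compile(r"//\s*@RELATION:\s*(?P<type>\w+)\s*->\s*(?P<target>.*)")


def find_anchor_starts(line):
    """Return (name, type) for every HTML- or JS-style start anchor on a line."""
    found = []
    for pattern in (HTML_ANCHOR_START, JS_ANCHOR_START):
        for match in pattern.finditer(line):
            found.append((match["name"], match["type"]))
    return found


def parse_relation(line):
    """Return (relation_type, target) or None if the line has no relation tag."""
    match = RELATION.search(line)
    if match is None:
        return None
    return match["type"], match["target"].strip()


if __name__ == "__main__":
    # Hypothetical lines as they might appear in a .svelte file.
    print(find_anchor_starts("<!-- [DEF:LoginForm:Component] -->"))
    print(find_anchor_starts("// [DEF:submitForm:Function]"))
    print(parse_relation("// @RELATION: DEPENDS_ON -> backend.auth"))
```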

2. Output Files & Structure

A. Full Map (semantics/semantic_map.json)

Detailed hierarchical JSON containing all metadata, line numbers, and relations.

{
  "project_root": "/home/user/ss-tools",
  "generated_at": "ISO-DATE",
  "modules": [ ... ]
}
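
A minimal sketch of producing this file follows. Only the three top-level keys come from the sample above; the function names, the use of a UTC timestamp, and the shape of each module entry are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_full_map(project_root, modules):
    """Assemble the top-level structure shown above (module shape is an assumption)."""
    return {
        "project_root": str(project_root),
        # ISO 8601 timestamp; using UTC is a choice made for this sketch.
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "modules": modules,
    }


def write_full_map(full_map, out_path="semantics/semantic_map.json"):
    """Write the map as indented JSON, creating the target directory if needed."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(full_map, indent=2), encoding="utf-8")
    return out_path
```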

B. Compressed Map (specs/project_map.md)

Optimized for "Full Attention" (max ~4000 tokens).

  • Format: Markdown Tree.
  • Content:
    • List of high-level Modules/Components.
    • Only critical tags: @PURPOSE, @LAYER.
    • List of public Functions/Actions (names only, or brief summary).
    • Key Relations (DEPENDS_ON).
    • Omit: Internal implementation details, line numbers, minor tags.
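
The content rules above could be applied roughly as follows. This is a sketch under stated assumptions: the 4-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and the module dictionary keys (name, tags, functions, depends_on), the "# Project Map" heading, and the truncation notice are all hypothetical.

```python
MAX_TOKENS = 4000
CHARS_PER_TOKEN = 4  # assumption: rough average, not a real tokenizer
CRITICAL_TAGS = ("PURPOSE", "LAYER")


def estimate_tokens(text):
    """Cheap token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def build_project_map(modules, max_tokens=MAX_TOKENS):
    """Render modules as a Markdown tree, stopping before the budget is exceeded."""
    lines = ["# Project Map"]
    used = estimate_tokens(lines[0])
    for module in modules:
        block = [f"- {module['name']}"]
        for tag in CRITICAL_TAGS:  # minor tags are deliberately dropped
            if tag in module.get("tags", {}):
                block.append(f"  - @{tag}: {module['tags'][tag]}")
        if module.get("functions"):
            block.append("  - Functions: " + ", ".join(module["functions"]))
        for target in module.get("depends_on", []):
            block.append(f"  - DEPENDS_ON -> {target}")
        cost = estimate_tokens("\n".join(block))
        if used + cost > max_tokens:
            lines.append("- ... (truncated to fit the token budget)")
            break
        lines.extend(block)
        used += cost
    return "\n".join(lines)
```

Adding whole module blocks or nothing keeps the tree well-formed when the budget runs out, at the cost of possibly leaving some of the budget unused.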

C. Compliance Report (semantics/reports/semantic_report_YYYYMMDD_HHMMSS.md)

  • Metrics: Global Coverage %, Module-level scores.
  • Errors: List of broken anchors or missing mandatory tags.
  • History: Each run writes a new file with a timestamped name, so earlier reports are preserved.
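
The timestamped filename could be built like this. The function name report_path and the use of local time are assumptions; the directory and name pattern follow the heading above.

```python
from datetime import datetime
from pathlib import Path


def report_path(reports_dir="semantics/reports", now=None):
    """Return the path for a new report, e.g. semantic_report_20240102_030405.md."""
    now = now or datetime.now()  # local time; an assumption for this sketch
    return Path(reports_dir) / f"semantic_report_{now:%Y%m%d_%H%M%S}.md"
```

Because the name sorts lexicographically in chronological order, listing the directory is enough to review the report history.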

3. Validation Rules (Scoring)

Critical Errors (0% score for entity)

  1. Unmatched Anchors: A start anchor without a matching end anchor.

Metadata Warnings (Reduces score)

  1. Missing Mandatory Tags: Module/Component needs @PURPOSE, @LAYER.
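
One way the two rule classes might translate into a score is sketched below. The design does not say how much a missing tag costs, so the 0.25 penalty is an assumption, as are the function names and the entity dictionary layout (type, end_line, tags).

```python
MANDATORY_TAGS = {
    "Module": ("PURPOSE", "LAYER"),
    "Component": ("PURPOSE", "LAYER"),
}
MISSING_TAG_PENALTY = 0.25  # assumption: the design only says the score is reduced


def score_entity(entity):
    """Return (score in [0, 1], list of issues) for one parsed entity."""
    if entity.get("end_line") is None:
        # Critical error: an unmatched anchor zeroes the entity's score.
        return 0.0, ["unmatched anchor"]
    required = MANDATORY_TAGS.get(entity["type"], ())
    missing = [tag for tag in required if tag not in entity.get("tags", {})]
    score = max(0.0, 1.0 - MISSING_TAG_PENALTY * len(missing))
    return score, [f"missing @{tag}" for tag in missing]


def global_coverage(entities):
    """Average entity score as a percentage (0.0 when there are no entities)."""
    if not entities:
        return 0.0
    return 100.0 * sum(score_entity(e)[0] for e in entities) / len(entities)
```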

4. Implementation Plan

  1. Setup: Create generate_semantic_map.py and directories (semantics/, semantics/reports/).
  2. Parser: Implement Regex patterns.
  3. Walker: Recursive file walk that skips paths matching standard ignore patterns.
  4. Generators:
    • MapGenerator: JSON dump.
    • ContextGenerator: Markdown tree builder with token awareness (heuristic).
    • ReportGenerator: Scoring and markdown formatting.
  5. Execution: Run script.
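
The Walker step could look roughly like the sketch below. The extensions and the backend/ and frontend/ roots come from the Objective and Scope sections; the contents of IGNORED_DIRS are an assumption, since the design does not enumerate the "standard ignore patterns", and iter_source_files is a hypothetical name.

```python
import os
from pathlib import Path

SCAN_EXTENSIONS = {".py", ".svelte", ".js", ".ts"}
# Assumption: a typical set of directories to skip; adjust to the project.
IGNORED_DIRS = {".git", "node_modules", "__pycache__", ".venv", "dist", "build"}


def iter_source_files(root, subdirs=("backend", "frontend")):
    """Yield source files under root/subdir, skipping ignored directories."""
    root = Path(root)
    for subdir in subdirs:
        for dirpath, dirnames, filenames in os.walk(root / subdir):
            # Pruning dirnames in place stops os.walk from descending into them.
            dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
            for filename in sorted(filenames):
                path = Path(dirpath) / filename
                if path.suffix in SCAN_EXTENSIONS:
                    yield path
```

Sorting directory and file names makes the walk order deterministic, which keeps successive maps and reports easy to diff.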