Research & Decisions: Git Integration Plugin

Feature: Git Integration for Dashboard Development
Date: 2026-01-22
Status: Finalized

1. Unknowns & Clarifications

The following clarifications were resolved during the specification phase:

| Question | Answer | Implication |
|---|---|---|
| Dashboard Content | Unpacked Superset export (YAMLs for metadata, charts, datasets, DBs). | We need to use superset_tool or internal logic to unzip exports before committing, and zip them before importing/deploying. |
| Repo Structure | 1 Repository per Dashboard. | Simplifies conflict resolution (no cross-dashboard conflicts). Requires managing multiple local clones if multiple dashboards are edited. |
| Deployment | Import via Superset API. | Need to repackage the YAML files into a ZIP structure compatible with Superset Import API. |
| Conflicts | UI-based "Keep Mine / Keep Theirs". | We need a frontend diff viewer/resolver. |
| Auth | Personal Access Token (PAT). | Uniform auth mechanism for GitHub/GitLab/Gitea. |

2. Technology Decisions

2.1 Git Interaction Library

Decision: GitPython
Rationale:

  • Mature and widely used Python wrapper for the git CLI.
  • Supports all required operations (clone, fetch, pull, push, branch, commit, diff).
  • Output is easier to parse than with raw subprocess calls.

Alternatives Considered:

  • pygit2: Bindings for libgit2. Faster, but harder to install (binary dependencies) and has a more complex API.
  • subprocess.run(['git', ...]): Too manual, error-prone parsing.

2.2 Dashboard Serialization Format

Decision: Unpacked YAML structure (Superset Export format)
Rationale:

  • Superset exports dashboards as a ZIP containing YAML files.
  • Unpacking this allows Git to track changes at a granular level (e.g., "changed x-axis label in chart A") rather than a binary blob change.
  • Enables meaningful diffs and merge conflict resolution.
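
A minimal sketch of the unpacking step, using only the standard library. It assumes that every entry in the export ZIP sits under a single top-level folder (whose name typically includes a timestamp) and drops that folder so paths stay stable between exports. `unpack_export` is a hypothetical name.

```python
import zipfile
from pathlib import Path, PurePosixPath


def unpack_export(zip_path: Path, dest_dir: Path) -> list[Path]:
    """Extract an export ZIP into dest_dir, dropping the top-level folder.

    Hypothetical helper; assumes all entries share one root folder.
    """
    written: list[Path] = []
    dest_root = dest_dir.resolve()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            # Strip the root folder when there is one.
            rel_parts = parts[1:] if len(parts) > 1 else parts
            target = dest_root.joinpath(*rel_parts).resolve()
            # Refuse entries that would land outside dest_dir ("zip slip").
            if not target.is_relative_to(dest_root):
                raise ValueError(f"unsafe path in archive: {info.filename}")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            written.append(target)
    return written
```

Dropping the root folder matters for diffs: if its name changed on every export, Git would see every file as moved.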

2.3 Repository Management Strategy

Decision: Local Clone Isolation
Path: backend/data/git_repos/{dashboard_uuid}/
Rationale:

  • Each dashboard corresponds to a remote repository.
  • Isolating clones by dashboard UUID prevents collisions.
  • The backend acts as a bridge between the Superset instance (source of truth for "current state" in UI) and the Git repo (source of truth for version control).
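
The path convention above could be resolved with a small helper like the one below. This is a sketch: `repo_path_for` and `GIT_REPOS_ROOT` are hypothetical names, and validating the identifier as a UUID is an added precaution so that a malformed value cannot point outside the repos folder.

```python
import uuid
from pathlib import Path

# Root folder taken from the decision above.
GIT_REPOS_ROOT = Path("backend/data/git_repos")


def repo_path_for(dashboard_uuid: str, root: Path = GIT_REPOS_ROOT) -> Path:
    """Return the local clone directory for a dashboard (hypothetical helper).

    uuid.UUID raises ValueError for anything that is not a UUID, which
    rejects inputs such as "../other" before they reach the filesystem.
    """
    canonical = str(uuid.UUID(dashboard_uuid))  # lower-case, hyphenated form
    return root / canonical
```

Normalizing to the canonical form also means that two spellings of the same UUID map to the same clone.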

2.4 Authentication Storage

Decision: Encrypted storage in SQLite
Rationale:

  • PATs are sensitive credentials.
  • They should not be stored in plain text.
  • We will use the existing config_manager, or a new secure storage utility if one is available, and fall back to standard encryption otherwise. For the MVP, standard storage in the SQLite GitServerConfig table is acceptable under current project standards, on the assumption that the tool is used internally only.

3. Architecture Patterns

3.1 The "Bridge" Workflow

  1. Edit: User edits dashboard in Superset UI.
  2. Sync/Stage: User clicks "Git" in our plugin. The plugin fetches the current dashboard export from the Superset API and unpacks it into the working directory of the local git repo.
  3. Diff: git status / git diff shows changes between Superset state and last commit.
  4. Commit: User selects files and commits.
  5. Push: Pushes to remote.
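
One detail of the Sync/Stage step is worth spelling out: if a chart or dataset was removed in Superset, its YAML file has to disappear from the working directory too, otherwise the diff in step 3 will never show the deletion. A minimal sketch, assuming the export has already been unpacked into a separate folder; `mirror_into_worktree` is a hypothetical name.

```python
import shutil
from pathlib import Path


def mirror_into_worktree(export_dir: Path, repo_dir: Path) -> None:
    """Make the working tree match export_dir, leaving .git untouched.

    Hypothetical helper: clears everything except .git, then copies the
    unpacked export in, so files deleted in Superset show up as deletions.
    """
    for entry in list(repo_dir.iterdir()):
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # dirs_exist_ok lets copytree write into the existing repo folder.
    shutil.copytree(export_dir, repo_dir, dirs_exist_ok=True)
```

After this runs, git status in the repo reflects additions, modifications, and deletions relative to the last commit.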

3.2 Deployment Workflow

  1. Select: User selects target environment (another Superset instance).
  2. Package: Plugin zips the current HEAD (or selected commit) of the repo.
  3. Upload: Plugin POSTs the ZIP to the target environment's Import API.
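
The Package step might look roughly like the sketch below. It zips a checked-out directory rather than reading a commit straight from Git, so it assumes the desired commit is already checked out. Wrapping the files in a single root folder mirrors the layout of a Superset export; the folder name `dashboard_export` and the function name are assumptions, and the exact structure the Import API accepts should be checked against the target Superset version.

```python
import io
import zipfile
from pathlib import Path


def package_for_import(repo_dir: Path, root_name: str = "dashboard_export") -> bytes:
    """Zip the working tree (minus .git) under one root folder.

    Hypothetical helper; returns the archive as bytes ready to upload.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(repo_dir.rglob("*")):
            relative = path.relative_to(repo_dir)
            # Skip git internals and directories; only files are stored.
            if ".git" in relative.parts or not path.is_file():
                continue
            archive.write(path, f"{root_name}/{relative.as_posix()}")
    return buffer.getvalue()
```

Building the archive in memory keeps temporary ZIP files out of the repo folder, where they could be committed by accident.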

4. Risks & Mitigations

  • Risk: Superset Export format changes.
    • Mitigation: We rely on the export format of the Superset version in use. If that format changes, the YAML structure changes with it, but Git simply records this as text changes. The main constraint is compatibility with the Import API.
  • Risk: Large repositories.
    • Mitigation: We are doing 1 repo per dashboard, which naturally limits size.
  • Risk: Concurrent edits.
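    • Mitigation: Git guards the index with a lock file (index.lock), so a second operation started while another is in progress fails rather than waits. The backend should therefore make sure it never runs git operations in parallel on the same folder.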
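
One way to keep git operations on the same folder from overlapping is a lock per dashboard, sketched below. `repo_lock` is a hypothetical name, and the approach only works within a single backend process; a deployment with several worker processes would need a file-based or external lock instead.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator

_registry_guard = threading.Lock()          # protects the dict below
_repo_locks: Dict[str, threading.Lock] = {}  # one lock per dashboard UUID


@contextmanager
def repo_lock(dashboard_uuid: str) -> Iterator[None]:
    """Serialize work on one dashboard's clone (hypothetical helper)."""
    with _registry_guard:
        lock = _repo_locks.setdefault(dashboard_uuid, threading.Lock())
    # Operations on different dashboards use different locks and can
    # still run in parallel; the same dashboard waits its turn.
    with lock:
        yield
```

A caller would wrap each git operation, for example `with repo_lock(dashboard_uuid): ...`, so that a fetch and a commit on the same clone cannot collide.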