Research & Decisions: Git Integration Plugin
Feature: Git Integration for Dashboard Development
Date: 2026-01-22
Status: Finalized
1. Unknowns & Clarifications
The following clarifications were resolved during the specification phase:
| Question | Answer | Implication |
|---|---|---|
| Dashboard Content | Unpacked Superset export (YAMLs for metadata, charts, datasets, DBs). | We need to use superset_tool or internal logic to unzip exports before committing, and zip them before importing/deploying. |
| Repo Structure | 1 Repository per Dashboard. | Simplifies conflict resolution (no cross-dashboard conflicts). Requires managing multiple local clones if multiple dashboards are edited. |
| Deployment | Import via Superset API. | Need to repackage the YAML files into a ZIP structure compatible with Superset Import API. |
| Conflicts | UI-based "Keep Mine / Keep Theirs". | We need a frontend diff viewer/resolver. |
| Auth | Personal Access Token (PAT). | Uniform auth mechanism for GitHub/GitLab/Gitea. |
2. Technology Decisions
2.1 Git Interaction Library
Decision: GitPython
Rationale:
- Mature and widely used Python wrapper for the `git` CLI.
- Supports all required operations (clone, fetch, pull, push, branch, commit, diff).
- Easier to handle output parsing compared to raw `subprocess` calls.
Alternatives Considered:
- `pygit2`: Bindings for `libgit2`. Faster but harder to install (binary dependencies) and more complex API.
- `subprocess.run(['git', ...])`: Too manual, error-prone parsing.
2.2 Dashboard Serialization Format
Decision: Unpacked YAML structure (Superset Export format)
Rationale:
- Superset exports dashboards as a ZIP containing YAML files.
- Unpacking this allows Git to track changes at a granular level (e.g., "changed x-axis label in chart A") rather than a binary blob change.
- Enables meaningful diffs and merge conflict resolution.
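A minimal sketch of the unpacking step, assuming the export arrives as ZIP bytes and that every entry sits under a single top-level folder whose name may differ between exports. The folder is dropped so that file paths stay stable from commit to commit. `unpack_export` is a hypothetical helper name, not part of Superset or `superset_tool`.

```python
import io
import zipfile
from pathlib import Path, PurePosixPath


def unpack_export(zip_bytes: bytes, repo_dir: Path) -> list[str]:
    """Extract an export ZIP into repo_dir and return the relative paths written."""
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            entry = PurePosixPath(info.filename)
            # Assumption: entries look like "<root_folder>/charts/a.yaml".
            # Strip the root folder; keep the name as-is if there is none.
            parts = entry.parts[1:] if len(entry.parts) > 1 else entry.parts
            # Refuse entries that could escape the repository directory.
            if entry.is_absolute() or ".." in parts:
                raise ValueError(f"unsafe path in archive: {info.filename}")
            target = repo_dir.joinpath(*parts)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            written.append("/".join(parts))
    return sorted(written)
```

This sketch only writes files. A real sync would also need to delete files that are no longer present in the export, otherwise removed charts would linger in the repository.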
2.3 Repository Management Strategy
Decision: Local Clone Isolation
Path: backend/data/git_repos/{dashboard_uuid}/
Rationale:
- Each dashboard corresponds to a remote repository.
- Isolating clones by dashboard UUID prevents collisions.
- The backend acts as a bridge between the Superset instance (source of truth for "current state" in UI) and the Git repo (source of truth for version control).
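One way to derive the clone path, sketched under the assumption that dashboard identifiers are standard UUID strings. Parsing the value with `uuid.UUID` both normalizes it and rejects anything that could be used for path traversal. `repo_path_for` and `REPOS_ROOT` are illustrative names.

```python
import uuid
from pathlib import Path

# Root folder taken from the decision above.
REPOS_ROOT = Path("backend/data/git_repos")


def repo_path_for(dashboard_uuid: str, root: Path = REPOS_ROOT) -> Path:
    """Return the local clone directory for a dashboard.

    uuid.UUID raises ValueError for anything that is not a UUID
    (for example "../other"), so the result always stays under root.
    """
    canonical = str(uuid.UUID(dashboard_uuid))  # lowercase, hyphenated form
    return root / canonical
```

Normalizing to the canonical form means that the same dashboard never ends up with two clones that differ only in letter case or formatting of the identifier.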
2.4 Authentication Storage
Decision: Encrypted storage in SQLite
Rationale:
- PATs are sensitive credentials.
- They should not be stored in plain text.
- We will use the existing `config_manager` or a new secure storage utility if available, or standard encryption if not. (For MVP, standard storage in SQLite `GitServerConfig` table is acceptable per current project standards, assuming internal tool usage).
3. Architecture Patterns
3.1 The "Bridge" Workflow
- Edit: User edits dashboard in Superset UI.
- Sync/Stage: User clicks "Git" in our plugin. Plugin fetches current dashboard export from Superset API, unpacks it to the local git repo working directory.
- Diff: `git status` / `git diff` shows changes between Superset state and last commit.
- Commit: User selects files and commits.
- Push: Pushes to remote.
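The Diff and Commit steps could look roughly like this with GitPython. This is a sketch, not the plugin's actual code: `changed_paths` and `commit_selected` are hypothetical helpers, and the repository is assumed to already exist at `repo_dir` with the fresh export unpacked into it.

```python
def _open_repo(repo_dir):
    # GitPython is imported lazily so this module still loads
    # in environments where the dependency is not installed.
    from git import Repo

    return Repo(repo_dir)


def changed_paths(repo_dir) -> list[str]:
    """List files that differ from the index, plus untracked files."""
    repo = _open_repo(repo_dir)
    # index.diff(None) compares the index against the working tree.
    modified = {d.a_path for d in repo.index.diff(None)}
    return sorted(modified | set(repo.untracked_files))


def commit_selected(repo_dir, paths: list[str], message: str) -> str:
    """Stage only the user-selected paths, commit them, return the commit SHA."""
    repo = _open_repo(repo_dir)
    # "git add -- <paths>" also stages deletions of tracked files.
    repo.git.add("--", *paths)
    return repo.index.commit(message).hexsha
```

Pushing would follow with something like `repo.remote("origin").push()`, with the remote name and the PAT-based credentials depending on how the clone was configured.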
3.2 Deployment Workflow
- Select: User selects target environment (another Superset instance).
- Package: Plugin zips the current `HEAD` (or selected commit) of the repo.
- Upload: Plugin POSTs the ZIP to the target environment's Import API.
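The Package step can be sketched with the standard library alone. Two assumptions are made here: the working tree is clean and checked out at the commit to deploy (so zipping the directory is equivalent to zipping that commit), and the target expects all files under one top-level folder, mirroring the export layout. `package_repo` and the default `root_name` are illustrative.

```python
import io
import zipfile
from pathlib import Path


def package_repo(repo_dir: Path, root_name: str = "dashboard_export") -> bytes:
    """Zip the working tree of repo_dir (without .git) under a single root folder."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(repo_dir.rglob("*")):
            rel = path.relative_to(repo_dir)
            # Skip directories and everything inside the git metadata folder.
            if path.is_dir() or rel.parts[0] == ".git":
                continue
            archive.write(path, f"{root_name}/{rel.as_posix()}")
    return buffer.getvalue()
```

The returned bytes are what the Upload step would send as the file part of a multipart POST. The upload itself is left out because the endpoint, form fields, and authentication depend on the target Superset version.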
4. Risks & Mitigations
- Risk: Superset Export format changes.
- Mitigation: We rely on the Superset version's export format. If it changes, the YAML structure changes, but Git will just see it as text changes. The Import API compatibility is the main constraint.
- Risk: Large repositories.
- Mitigation: We are doing 1 repo per dashboard, which naturally limits size.
- Risk: Concurrent edits.
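- Mitigation: Git guards its index with a lock file (`.git/index.lock`), so a second operation in the same clone fails instead of waiting. Our backend should therefore serialize git operations per repository folder rather than run them in parallel.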
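A minimal sketch of per-repository serialization, assuming the backend runs git operations from threads inside a single process. `repo_lock` is a hypothetical helper keyed by dashboard UUID.

```python
import threading
from contextlib import contextmanager

# Protects the registry itself; held only while looking up or creating a lock.
_registry_guard = threading.Lock()
_repo_locks: dict[str, threading.Lock] = {}


@contextmanager
def repo_lock(dashboard_uuid: str):
    """Allow only one git operation at a time per dashboard repository."""
    with _registry_guard:
        lock = _repo_locks.setdefault(dashboard_uuid, threading.Lock())
    with lock:
        yield
```

Usage would wrap each operation, for example `with repo_lock(dashboard_uuid): ...` around a fetch or commit. Operations on different dashboards still run in parallel. If the backend is deployed with several worker processes, an in-memory lock is not enough and a file-based or database-backed lock would be needed instead.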