- Add SQLite database integration for environments and mappings
- Update TaskManager to support pausing tasks (AWAITING_MAPPING)
- Modify MigrationPlugin to detect missing mappings and wait for resolution
- Add frontend UI for handling missing mappings interactively
- Create dedicated migration routes and API endpoints
- Update .gitignore and project documentation
Research: Migration Process and UI Redesign
Decision: Fuzzy Matching Algorithm
- Choice: `RapidFuzz` library with `fuzz.token_sort_ratio`.
- Rationale: `RapidFuzz` is significantly faster than `FuzzyWuzzy` and provides robust string similarity metrics. `token_sort_ratio` is ideal for database names because it ignores word order and is less sensitive to prefixes like "Dev-" or "Prod-".
- Alternatives considered:
- `Levenshtein`: Too sensitive to string length and prefixes.
- `Jaro-Winkler`: Good for short strings but less effective for multi-word names with different orders.
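A minimal sketch of how name-based suggestions could work with `token_sort_ratio`. The helper names `name_similarity` and `suggest_mappings` and the threshold of 50 are assumptions for illustration, not part of the project. If RapidFuzz is not installed, the sketch falls back to a rough standard-library stand-in built on `difflib`, whose scores may differ slightly.

```python
import re
from difflib import SequenceMatcher

try:
    from rapidfuzz import fuzz, utils

    def name_similarity(a: str, b: str) -> float:
        # default_process lowercases and turns punctuation into spaces,
        # so "Dev-Sales" is compared as the tokens "dev" and "sales".
        return fuzz.token_sort_ratio(a, b, processor=utils.default_process)

except ImportError:

    def name_similarity(a: str, b: str) -> float:
        # Stdlib approximation (assumption): normalise, sort tokens, compare.
        def norm(s: str) -> str:
            tokens = re.sub(r"[^a-z0-9]+", " ", s.lower()).split()
            return " ".join(sorted(tokens))

        return 100.0 * SequenceMatcher(None, norm(a), norm(b)).ratio()


def suggest_mappings(source_names, target_names, threshold=50.0):
    """Map each source name to (best target name, score), or None if too weak."""
    suggestions = {}
    for src in source_names:
        scored = [(name_similarity(src, tgt), tgt) for tgt in target_names]
        best_score, best_name = max(scored, default=(0.0, None))
        suggestions[src] = (best_name, best_score) if best_score >= threshold else None
    return suggestions


if __name__ == "__main__":
    print(suggest_mappings(["Dev-Sales", "Dev-Finance"],
                           ["Prod-Sales", "Prod-Finance", "Prod-HR"]))
```

Suggestions below the threshold come back as `None`, which is the case the interactive mapping UI would need to resolve by hand. The threshold itself would need tuning against real database names.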
Decision: Asset Interception Strategy
- Choice: ZIP-based transformation during migration.
- Rationale: Superset's native export/import format is a ZIP archive containing YAML definitions. Intercepting this archive allows for precise modification of database references (UUIDs) before they reach the target environment.
- Implementation:
- Export dashboard/dataset from source (ZIP).
- Extract ZIP to a temporary directory.
- Iterate through `datasets/*.yaml` files.
- Replace `database_uuid` values based on the mapping table.
- Re-package the ZIP.
- Import to target.
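The extract, rewrite, and re-package steps can be sketched with the standard library alone. This is an illustration under assumptions: `remap_database_uuids` is a hypothetical helper, the UUID is rewritten with a line-based regex rather than a YAML parser, and any `.yaml` file under a `datasets` directory is treated as a dataset definition. The export and import calls against Superset are out of scope here.

```python
import re
import tempfile
import zipfile
from pathlib import Path

# Matches a line such as "database_uuid: 1234...-..." with optional quotes.
UUID_LINE = re.compile(
    r"^([ \t]*database_uuid:[ \t]*)(['\"]?)([0-9a-fA-F-]{36})\2[ \t]*$",
    re.MULTILINE,
)


def remap_database_uuids(src_zip, dst_zip, uuid_map):
    """Copy src_zip to dst_zip, rewriting database_uuid values in dataset YAML.

    uuid_map maps source database UUIDs to target database UUIDs.
    Returns the number of values replaced; unmapped UUIDs are left untouched.
    """
    replaced = 0

    def swap(match):
        nonlocal replaced
        new_uuid = uuid_map.get(match.group(3))
        if new_uuid is None:
            return match.group(0)
        replaced += 1
        return f"{match.group(1)}{match.group(2)}{new_uuid}{match.group(2)}"

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        with zipfile.ZipFile(src_zip) as archive:
            archive.extractall(root)

        # Only touch YAML files that live somewhere under a "datasets" folder.
        for path in root.rglob("*.yaml"):
            if "datasets" not in path.relative_to(root).parts:
                continue
            text = path.read_text(encoding="utf-8")
            path.write_text(UUID_LINE.sub(swap, text), encoding="utf-8")

        # Re-package everything with the same relative paths.
        with zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as out:
            for path in sorted(root.rglob("*")):
                if path.is_file():
                    out.write(path, path.relative_to(root).as_posix())

    return replaced
```

A return value lower than the number of dataset files is one possible signal that a mapping is missing and the task should pause for user input.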
Decision: Database Mapping Persistence
- Choice: SQLite with SQLAlchemy/SQLModel.
- Rationale: SQLite is lightweight, requires no separate server, and is perfect for storing local configuration and mappings. It aligns with the project's existing stack.
- Schema:
- `Environment`: `id`, `name`, `url`, `credentials_id`.
- `DatabaseMapping`: `id`, `source_env_id`, `target_env_id`, `source_db_uuid`, `target_db_uuid`, `source_db_name`, `target_db_name`.
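To make the schema concrete, here is a sketch of the two tables using the standard-library `sqlite3` module instead of SQLAlchemy/SQLModel, so it runs without extra dependencies. The column types, the foreign keys, the uniqueness constraint, and the helpers `open_store` and `uuid_map` are assumptions; only the table and column names come from the schema above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS environment (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    url            TEXT NOT NULL,
    credentials_id TEXT
);
CREATE TABLE IF NOT EXISTS database_mapping (
    id             INTEGER PRIMARY KEY,
    source_env_id  INTEGER NOT NULL REFERENCES environment(id),
    target_env_id  INTEGER NOT NULL REFERENCES environment(id),
    source_db_uuid TEXT NOT NULL,
    target_db_uuid TEXT NOT NULL,
    source_db_name TEXT,
    target_db_name TEXT,
    -- Assumption: one target per source database and environment pair.
    UNIQUE (source_env_id, target_env_id, source_db_uuid)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the SQLite file and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def uuid_map(conn, source_env_id, target_env_id):
    """Return {source_db_uuid: target_db_uuid} for one environment pair."""
    rows = conn.execute(
        "SELECT source_db_uuid, target_db_uuid FROM database_mapping "
        "WHERE source_env_id = ? AND target_env_id = ?",
        (source_env_id, target_env_id),
    )
    return dict(rows.fetchall())
```

The dictionary returned by `uuid_map` is the shape a UUID-rewriting step would consume during migration.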
Decision: Superset API Integration
- Choice: Extend existing `SupersetClient`.
- Rationale: `SupersetClient` already handles authentication, network requests, and basic CRUD for dashboards/datasets. Adding environment-specific fetching and database listing is a natural extension.
- New Endpoints to use:
- `GET /api/v1/database/`: List all databases.
- `GET /api/v1/database/{id}`: Get detailed database config.
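A sketch of how database listing might be layered on top of an authenticated client. The internals of `SupersetClient` are not shown in this note, so the example takes a hypothetical `get_json(path)` callable that performs an authenticated GET and returns the decoded JSON body. It also assumes the list response carries `count` and `result` keys, that each entry exposes `id`, `database_name`, and `uuid`, and that paging uses the Rison-style `q` parameter.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode


def list_databases(get_json: Callable[[str], dict], page_size: int = 100) -> List[Dict]:
    """Collect id, name, and uuid for every database in one environment.

    get_json is a hypothetical stand-in for an authenticated GET helper.
    """
    databases: List[Dict] = []
    page = 0
    while True:
        query = urlencode({"q": f"(page:{page},page_size:{page_size})"})
        payload = get_json(f"/api/v1/database/?{query}")
        batch = payload.get("result", [])
        for entry in batch:
            databases.append(
                {
                    "id": entry.get("id"),
                    "name": entry.get("database_name"),
                    "uuid": entry.get("uuid"),  # assumed to be present in the list view
                }
            )
        # Stop on an empty page or once the reported total has been collected.
        if not batch or len(databases) >= payload.get("count", 0):
            return databases
        page += 1
```

Running this against both the source and target environment would yield the two name lists that the fuzzy matcher compares. If the list view omits `uuid`, the detail endpoint `GET /api/v1/database/{id}` would be the fallback.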