feat: implement plugin architecture and application settings with Svelte UI

- Added plugin base and loader for backend extensibility
- Implemented application settings management with config persistence
- Created Svelte-based frontend with Dashboard and Settings pages
- Added API routes for plugins, tasks, and settings
- Updated documentation and specifications
- Improved project structure and developer tools
commit 2d8cae563f (parent ce703322c2)
Date: 2025-12-20 20:48:18 +03:00
98 changed files with 7894 additions and 5021 deletions
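The summary above implies a small HTTP surface for plugins, tasks, and settings. A hedged sketch of exercising it, assuming the backend runs locally; the route paths, port, and payload fields are illustrative assumptions and are not taken from this diff:

```bash
# Illustrative only: endpoint names and fields are assumptions, not confirmed by this commit
curl -s http://localhost:8000/api/plugins            # list discovered plugins
curl -s http://localhost:8000/api/settings           # read persisted application settings
curl -s -X PUT http://localhost:8000/api/settings \
     -H 'Content-Type: application/json' \
     -d '{"theme": "dark"}'                           # update a setting (hypothetical field)
curl -s http://localhost:8000/api/tasks              # list tasks
```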

.gitignore vendored Normal file → Executable file

@@ -1,18 +1,19 @@
*__pycache__*
*.ps1
keyring passwords.py
*logs*
*github*
*venv*
*git*
*tech_spec*
dashboards
# Python specific
*.pyc
dist/
*.egg-info/
# Node.js specific
node_modules/
build/
.env*
*__pycache__*
*.ps1
keyring passwords.py
*logs*
*github*
*venv*
*git*
*tech_spec*
dashboards
# Python specific
*.pyc
dist/
*.egg-info/
# Node.js specific
node_modules/
build/
.env*
config.json

.kilocode/mcp.json Normal file → Executable file

.kilocode/rules/specify-rules.md Normal file → Executable file

@@ -9,8 +9,8 @@ Auto-generated from all feature plans. Last updated: 2025-12-19
## Project Structure
```text
backend/
frontend/
backend/
frontend/
tests/
```


@@ -24,7 +24,7 @@ Identify inconsistencies, duplications, ambiguities, and underspecified items ac
### 1. Initialize Analysis Context
Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
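A minimal sketch of consuming that JSON payload in a shell step (paths are illustrative and jq is an assumed dependency; the script's full output format appears later in this diff):

```bash
# Run once from the repo root and derive the absolute paths the workflow needs
json=$(.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks)
# e.g. {"FEATURE_DIR":"/repo/specs/001-user-auth","AVAILABLE_DOCS":["research.md","tasks.md"]}

FEATURE_DIR=$(echo "$json" | jq -r '.FEATURE_DIR')
SPEC="$FEATURE_DIR/spec.md"
PLAN="$FEATURE_DIR/plan.md"
```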


@@ -33,7 +33,7 @@ You **MUST** consider the user input before proceeding (if not empty).
## Execution Steps
1. **Setup**: Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
- All file paths must be absolute.
- For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").


@@ -22,7 +22,7 @@ Note: This clarification workflow is expected to run (and be completed) BEFORE i
Execution steps:
1. Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
- `FEATURE_DIR`
- `FEATURE_SPEC`
- (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)


@@ -12,7 +12,7 @@ You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
2. **Check checklists status** (if FEATURE_DIR/checklists/ exists):
- Scan all checklist files in the checklists/ directory


@@ -20,7 +20,7 @@ You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. **Setup**: Run `.specify/scripts/powershell/setup-plan.ps1 -Json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
@@ -75,7 +75,7 @@ You **MUST** consider the user input before proceeding (if not empty).
- Output OpenAPI/GraphQL schema to `/contracts/`
3. **Agent context update**:
- Run `.specify/scripts/powershell/update-agent-context.ps1 -AgentType kilocode`
- Run `.specify/scripts/bash/update-agent-context.sh kilocode`
- These scripts detect which AI agent is in use
- Update the appropriate agent-specific context file
- Add only new technology from current plan


@@ -54,10 +54,10 @@ Given that feature description, do this:
- Find the highest number N
- Use N+1 for the new branch number
d. Run the script `.specify/scripts/powershell/create-new-feature.ps1 -Json "$ARGUMENTS"` with the calculated number and short-name:
d. Run the script `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS"` with the calculated number and short-name:
- Pass `--number N+1` and `--short-name "your-short-name"` along with the feature description
- Bash example: `.specify/scripts/powershell/create-new-feature.ps1 -Json "$ARGUMENTS" --json --number 5 --short-name "user-auth" "Add user authentication"`
- PowerShell example: `.specify/scripts/powershell/create-new-feature.ps1 -Json "$ARGUMENTS" -Json -Number 5 -ShortName "user-auth" "Add user authentication"`
- Bash example: `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS" --json --number 5 --short-name "user-auth" "Add user authentication"`
- PowerShell example: `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS" -Json -Number 5 -ShortName "user-auth" "Add user authentication"`
**IMPORTANT**:
- Check all three sources (remote branches, local branches, specs directories) to find the highest number


@@ -21,7 +21,7 @@ You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. **Setup**: Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
2. **Load design documents**: Read from FEATURE_DIR:
- **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)


@@ -13,7 +13,7 @@ You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. From the executed script, extract the path to **tasks**.
1. Get the Git remote by running:

.pylintrc Normal file → Executable file


@@ -1,68 +1,50 @@
<!--
SYNC IMPACT REPORT
Version: 1.1.0 (Svelte Support)
Changes:
- Added Svelte Component semantic markup standards.
- Updated File Structure Standards to include `.svelte` files.
- Refined File Structure Standards to distinguish between Python Modules and Svelte Components.
Templates Status:
- .specify/templates/plan-template.md: ⚠ Pending (Needs update to include Component headers in checks).
- .specify/templates/spec-template.md: ✅ Aligned.
- .specify/templates/tasks-template.md: ⚠ Pending (Needs update to include Component definition tasks).
-->
# Semantic Code Generation Constitution
# [PROJECT_NAME] Constitution
<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->
## Core Principles
### I. Causal Validity (Contracts First)
Semantic definitions (Contracts) must ALWAYS precede implementation code. Logic is downstream of definition. We define the structure and constraints (`[DEF]`, `@PRE`, `@POST`) before writing the executable logic. This ensures that the "what" and "why" govern the "how".
### [PRINCIPLE_1_NAME]
<!-- Example: I. Library-First -->
[PRINCIPLE_1_DESCRIPTION]
<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->
### II. Immutability of Architecture
Once defined, architectural decisions in the Module Header (`@LAYER`, `@INVARIANT`, `@CONSTRAINT`) are treated as immutable constraints for that module. Changes to these require an explicit refactoring step, not ad-hoc modification during implementation.
### [PRINCIPLE_2_NAME]
<!-- Example: II. CLI Interface -->
[PRINCIPLE_2_DESCRIPTION]
<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->
### III. Semantic Format Compliance
All output must strictly follow the `[DEF]` / `[/DEF]` anchor syntax with specific Metadata Tags (`@KEY`) and Graph Relations (`@RELATION`). This structure is non-negotiable as it ensures the codebase remains machine-readable, fractal-structured, and optimized for Sparse Attention navigation by AI agents.
### [PRINCIPLE_3_NAME]
<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
[PRINCIPLE_3_DESCRIPTION]
<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->
### IV. Design by Contract (DbC)
Contracts are the Source of Truth. Functions and Classes must define their purpose, specifications, and constraints (`@PRE`, `@POST`, `@THROW`) in the metadata block before implementation. Implementation must strictly satisfy these contracts.
### [PRINCIPLE_4_NAME]
<!-- Example: IV. Integration Testing -->
[PRINCIPLE_4_DESCRIPTION]
<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->
### V. Belief State Logging
Logs must define the agent's internal state for debugging and coherence checks. We use a strict format: `logger.level(f"[{ANCHOR_ID}][{STATE}] {MESSAGE} context={...}")` to track transitions between `Entry`, `Validation`, `Action`, and `Coherence` states.
### [PRINCIPLE_5_NAME]
<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
[PRINCIPLE_5_DESCRIPTION]
<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->
## File Structure Standards
## [SECTION_2_NAME]
<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->
### Python Modules
Every `.py` file must start with a Module definition header (`[DEF:module_name:Module]`) containing:
- `@SEMANTICS`: Keywords for vector search.
- `@PURPOSE`: Primary responsibility of the module.
- `@LAYER`: Architecture layer (Domain/Infra/UI).
- `@RELATION`: Dependencies.
- `@INVARIANT` & `@CONSTRAINT`: Immutable rules.
- `@PUBLIC_API`: Exported symbols.
[SECTION_2_CONTENT]
<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->
### Svelte Components
Every `.svelte` file must start with a Component definition header (`[DEF:ComponentName:Component]`) wrapped in an HTML comment `<!-- ... -->` containing:
- `@SEMANTICS`: Keywords for vector search.
- `@PURPOSE`: Primary responsibility of the component.
- `@LAYER`: Architecture layer (UI/State/Layout).
- `@RELATION`: Child components, Stores used, API calls.
- `@PROPS`: Input properties.
- `@EVENTS`: Emitted events.
- `@INVARIANT`: Immutable UI/State rules.
## [SECTION_3_NAME]
<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->
## Generation Workflow
The development process follows a strict sequence:
1. **Analyze Request**: Identify target module and graph position.
2. **Define Structure**: Generate `[DEF]` anchors and Contracts FIRST.
3. **Implement Logic**: Write code satisfying Contracts.
4. **Validate**: If logic conflicts with Contract -> Stop -> Report Error.
[SECTION_3_CONTENT]
<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->
## Governance
This Constitution establishes the "Semantic Code Generation Protocol" as the supreme law of this repository.
<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->
- **Automated Enforcement**: All code generation tools and agents must parse and validate adherence to the `[DEF]` syntax and Contract requirements.
- **Amendments**: Changes to the syntax or core principles require a formal amendment to this Constitution and a corresponding update to the constitution
- **Review**: Code reviews must verify that implementation matches the preceding contracts and that no "naked code" exists outside of semantic anchors.
- **Compliance**: Failure to adhere to the `[DEF]` / `[/DEF]` structure constitutes a build failure.
[GOVERNANCE_RULES]
<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->
**Version**: 1.1.0 | **Ratified**: 2025-12-19 | **Last Amended**: 2025-12-19
**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->


@@ -0,0 +1,166 @@
#!/usr/bin/env bash
# Consolidated prerequisite checking script
#
# This script provides unified prerequisite checking for Spec-Driven Development workflow.
# It replaces the functionality previously spread across multiple scripts.
#
# Usage: ./check-prerequisites.sh [OPTIONS]
#
# OPTIONS:
# --json Output in JSON format
# --require-tasks Require tasks.md to exist (for implementation phase)
# --include-tasks Include tasks.md in AVAILABLE_DOCS list
# --paths-only Only output path variables (no validation)
# --help, -h Show help message
#
# OUTPUTS:
# JSON mode: {"FEATURE_DIR":"...", "AVAILABLE_DOCS":["..."]}
# Text mode: FEATURE_DIR:... \n AVAILABLE_DOCS: \n ✓/✗ file.md
# Paths only: REPO_ROOT: ... \n BRANCH: ... \n FEATURE_DIR: ... etc.
set -e
# Parse command line arguments
JSON_MODE=false
REQUIRE_TASKS=false
INCLUDE_TASKS=false
PATHS_ONLY=false
for arg in "$@"; do
case "$arg" in
--json)
JSON_MODE=true
;;
--require-tasks)
REQUIRE_TASKS=true
;;
--include-tasks)
INCLUDE_TASKS=true
;;
--paths-only)
PATHS_ONLY=true
;;
--help|-h)
cat << 'EOF'
Usage: check-prerequisites.sh [OPTIONS]
Consolidated prerequisite checking for Spec-Driven Development workflow.
OPTIONS:
--json Output in JSON format
--require-tasks Require tasks.md to exist (for implementation phase)
--include-tasks Include tasks.md in AVAILABLE_DOCS list
--paths-only Only output path variables (no prerequisite validation)
--help, -h Show this help message
EXAMPLES:
# Check task prerequisites (plan.md required)
./check-prerequisites.sh --json
# Check implementation prerequisites (plan.md + tasks.md required)
./check-prerequisites.sh --json --require-tasks --include-tasks
# Get feature paths only (no validation)
./check-prerequisites.sh --paths-only
EOF
exit 0
;;
*)
echo "ERROR: Unknown option '$arg'. Use --help for usage information." >&2
exit 1
;;
esac
done
# Source common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
# Get feature paths and validate branch
eval $(get_feature_paths)
check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
# If paths-only mode, output paths and exit (support JSON + paths-only combined)
if $PATHS_ONLY; then
if $JSON_MODE; then
# Minimal JSON paths payload (no validation performed)
printf '{"REPO_ROOT":"%s","BRANCH":"%s","FEATURE_DIR":"%s","FEATURE_SPEC":"%s","IMPL_PLAN":"%s","TASKS":"%s"}\n' \
"$REPO_ROOT" "$CURRENT_BRANCH" "$FEATURE_DIR" "$FEATURE_SPEC" "$IMPL_PLAN" "$TASKS"
else
echo "REPO_ROOT: $REPO_ROOT"
echo "BRANCH: $CURRENT_BRANCH"
echo "FEATURE_DIR: $FEATURE_DIR"
echo "FEATURE_SPEC: $FEATURE_SPEC"
echo "IMPL_PLAN: $IMPL_PLAN"
echo "TASKS: $TASKS"
fi
exit 0
fi
# Validate required directories and files
if [[ ! -d "$FEATURE_DIR" ]]; then
echo "ERROR: Feature directory not found: $FEATURE_DIR" >&2
echo "Run /speckit.specify first to create the feature structure." >&2
exit 1
fi
if [[ ! -f "$IMPL_PLAN" ]]; then
echo "ERROR: plan.md not found in $FEATURE_DIR" >&2
echo "Run /speckit.plan first to create the implementation plan." >&2
exit 1
fi
# Check for tasks.md if required
if $REQUIRE_TASKS && [[ ! -f "$TASKS" ]]; then
echo "ERROR: tasks.md not found in $FEATURE_DIR" >&2
echo "Run /speckit.tasks first to create the task list." >&2
exit 1
fi
# Build list of available documents
docs=()
# Always check these optional docs
[[ -f "$RESEARCH" ]] && docs+=("research.md")
[[ -f "$DATA_MODEL" ]] && docs+=("data-model.md")
# Check contracts directory (only if it exists and has files)
if [[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]; then
docs+=("contracts/")
fi
[[ -f "$QUICKSTART" ]] && docs+=("quickstart.md")
# Include tasks.md if requested and it exists
if $INCLUDE_TASKS && [[ -f "$TASKS" ]]; then
docs+=("tasks.md")
fi
# Output results
if $JSON_MODE; then
# Build JSON array of documents
if [[ ${#docs[@]} -eq 0 ]]; then
json_docs="[]"
else
json_docs=$(printf '"%s",' "${docs[@]}")
json_docs="[${json_docs%,}]"
fi
printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s}\n' "$FEATURE_DIR" "$json_docs"
else
# Text output
echo "FEATURE_DIR:$FEATURE_DIR"
echo "AVAILABLE_DOCS:"
# Show status of each potential document
check_file "$RESEARCH" "research.md"
check_file "$DATA_MODEL" "data-model.md"
check_dir "$CONTRACTS_DIR" "contracts/"
check_file "$QUICKSTART" "quickstart.md"
if $INCLUDE_TASKS; then
check_file "$TASKS" "tasks.md"
fi
fi
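For orientation, the --paths-only mode described in the header prints the path variables without any prerequisite validation; a hedged example (path values are illustrative):

```bash
./check-prerequisites.sh --paths-only
# REPO_ROOT: /repo
# BRANCH: 001-user-auth
# FEATURE_DIR: /repo/specs/001-user-auth
# FEATURE_SPEC: /repo/specs/001-user-auth/spec.md
# IMPL_PLAN: /repo/specs/001-user-auth/plan.md
# TASKS: /repo/specs/001-user-auth/tasks.md

# Combined with --json it emits the same fields as a single JSON object
./check-prerequisites.sh --json --paths-only
```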

.specify/scripts/bash/common.sh Executable file

@@ -0,0 +1,156 @@
#!/usr/bin/env bash
# Common functions and variables for all scripts
# Get repository root, with fallback for non-git repositories
get_repo_root() {
if git rev-parse --show-toplevel >/dev/null 2>&1; then
git rev-parse --show-toplevel
else
# Fall back to script location for non-git repos
local script_dir="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
(cd "$script_dir/../../.." && pwd)
fi
}
# Get current branch, with fallback for non-git repositories
get_current_branch() {
# First check if SPECIFY_FEATURE environment variable is set
if [[ -n "${SPECIFY_FEATURE:-}" ]]; then
echo "$SPECIFY_FEATURE"
return
fi
# Then check git if available
if git rev-parse --abbrev-ref HEAD >/dev/null 2>&1; then
git rev-parse --abbrev-ref HEAD
return
fi
# For non-git repos, try to find the latest feature directory
local repo_root=$(get_repo_root)
local specs_dir="$repo_root/specs"
if [[ -d "$specs_dir" ]]; then
local latest_feature=""
local highest=0
for dir in "$specs_dir"/*; do
if [[ -d "$dir" ]]; then
local dirname=$(basename "$dir")
if [[ "$dirname" =~ ^([0-9]{3})- ]]; then
local number=${BASH_REMATCH[1]}
number=$((10#$number))
if [[ "$number" -gt "$highest" ]]; then
highest=$number
latest_feature=$dirname
fi
fi
fi
done
if [[ -n "$latest_feature" ]]; then
echo "$latest_feature"
return
fi
fi
echo "main" # Final fallback
}
# Check if we have git available
has_git() {
git rev-parse --show-toplevel >/dev/null 2>&1
}
check_feature_branch() {
local branch="$1"
local has_git_repo="$2"
# For non-git repos, we can't enforce branch naming but still provide output
if [[ "$has_git_repo" != "true" ]]; then
echo "[specify] Warning: Git repository not detected; skipped branch validation" >&2
return 0
fi
if [[ ! "$branch" =~ ^[0-9]{3}- ]]; then
echo "ERROR: Not on a feature branch. Current branch: $branch" >&2
echo "Feature branches should be named like: 001-feature-name" >&2
return 1
fi
return 0
}
get_feature_dir() { echo "$1/specs/$2"; }
# Find feature directory by numeric prefix instead of exact branch match
# This allows multiple branches to work on the same spec (e.g., 004-fix-bug, 004-add-feature)
find_feature_dir_by_prefix() {
local repo_root="$1"
local branch_name="$2"
local specs_dir="$repo_root/specs"
# Extract numeric prefix from branch (e.g., "004" from "004-whatever")
if [[ ! "$branch_name" =~ ^([0-9]{3})- ]]; then
# If branch doesn't have numeric prefix, fall back to exact match
echo "$specs_dir/$branch_name"
return
fi
local prefix="${BASH_REMATCH[1]}"
# Search for directories in specs/ that start with this prefix
local matches=()
if [[ -d "$specs_dir" ]]; then
for dir in "$specs_dir"/"$prefix"-*; do
if [[ -d "$dir" ]]; then
matches+=("$(basename "$dir")")
fi
done
fi
# Handle results
if [[ ${#matches[@]} -eq 0 ]]; then
# No match found - return the branch name path (will fail later with clear error)
echo "$specs_dir/$branch_name"
elif [[ ${#matches[@]} -eq 1 ]]; then
# Exactly one match - perfect!
echo "$specs_dir/${matches[0]}"
else
# Multiple matches - this shouldn't happen with proper naming convention
echo "ERROR: Multiple spec directories found with prefix '$prefix': ${matches[*]}" >&2
echo "Please ensure only one spec directory exists per numeric prefix." >&2
echo "$specs_dir/$branch_name" # Return something to avoid breaking the script
fi
}
get_feature_paths() {
local repo_root=$(get_repo_root)
local current_branch=$(get_current_branch)
local has_git_repo="false"
if has_git; then
has_git_repo="true"
fi
# Use prefix-based lookup to support multiple branches per spec
local feature_dir=$(find_feature_dir_by_prefix "$repo_root" "$current_branch")
cat <<EOF
REPO_ROOT='$repo_root'
CURRENT_BRANCH='$current_branch'
HAS_GIT='$has_git_repo'
FEATURE_DIR='$feature_dir'
FEATURE_SPEC='$feature_dir/spec.md'
IMPL_PLAN='$feature_dir/plan.md'
TASKS='$feature_dir/tasks.md'
RESEARCH='$feature_dir/research.md'
DATA_MODEL='$feature_dir/data-model.md'
QUICKSTART='$feature_dir/quickstart.md'
CONTRACTS_DIR='$feature_dir/contracts'
EOF
}
check_file() { [[ -f "$1" ]] && echo "  ✓ $2" || echo "  ✗ $2"; }
check_dir() { [[ -d "$1" && -n $(ls -A "$1" 2>/dev/null) ]] && echo "  ✓ $2" || echo "  ✗ $2"; }
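A minimal sketch of how the sibling scripts consume these helpers (this mirrors the pattern used by check-prerequisites.sh above; the feature name is illustrative):

```bash
#!/usr/bin/env bash
# Illustrative caller: load the shared helpers and expand the feature paths
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"

eval $(get_feature_paths)          # defines REPO_ROOT, CURRENT_BRANCH, FEATURE_DIR, ...
check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
echo "Working on $CURRENT_BRANCH in $FEATURE_DIR"

# Non-git checkouts can pin the feature explicitly instead of relying on the branch name:
#   SPECIFY_FEATURE=001-user-auth ./check-prerequisites.sh --json
```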


@@ -0,0 +1,297 @@
#!/usr/bin/env bash
set -e
JSON_MODE=false
SHORT_NAME=""
BRANCH_NUMBER=""
ARGS=()
i=1
while [ $i -le $# ]; do
arg="${!i}"
case "$arg" in
--json)
JSON_MODE=true
;;
--short-name)
if [ $((i + 1)) -gt $# ]; then
echo 'Error: --short-name requires a value' >&2
exit 1
fi
i=$((i + 1))
next_arg="${!i}"
# Check if the next argument is another option (starts with --)
if [[ "$next_arg" == --* ]]; then
echo 'Error: --short-name requires a value' >&2
exit 1
fi
SHORT_NAME="$next_arg"
;;
--number)
if [ $((i + 1)) -gt $# ]; then
echo 'Error: --number requires a value' >&2
exit 1
fi
i=$((i + 1))
next_arg="${!i}"
if [[ "$next_arg" == --* ]]; then
echo 'Error: --number requires a value' >&2
exit 1
fi
BRANCH_NUMBER="$next_arg"
;;
--help|-h)
echo "Usage: $0 [--json] [--short-name <name>] [--number N] <feature_description>"
echo ""
echo "Options:"
echo " --json Output in JSON format"
echo " --short-name <name> Provide a custom short name (2-4 words) for the branch"
echo " --number N Specify branch number manually (overrides auto-detection)"
echo " --help, -h Show this help message"
echo ""
echo "Examples:"
echo " $0 'Add user authentication system' --short-name 'user-auth'"
echo " $0 'Implement OAuth2 integration for API' --number 5"
exit 0
;;
*)
ARGS+=("$arg")
;;
esac
i=$((i + 1))
done
FEATURE_DESCRIPTION="${ARGS[*]}"
if [ -z "$FEATURE_DESCRIPTION" ]; then
echo "Usage: $0 [--json] [--short-name <name>] [--number N] <feature_description>" >&2
exit 1
fi
# Function to find the repository root by searching for existing project markers
find_repo_root() {
local dir="$1"
while [ "$dir" != "/" ]; do
if [ -d "$dir/.git" ] || [ -d "$dir/.specify" ]; then
echo "$dir"
return 0
fi
dir="$(dirname "$dir")"
done
return 1
}
# Function to get highest number from specs directory
get_highest_from_specs() {
local specs_dir="$1"
local highest=0
if [ -d "$specs_dir" ]; then
for dir in "$specs_dir"/*; do
[ -d "$dir" ] || continue
dirname=$(basename "$dir")
number=$(echo "$dirname" | grep -o '^[0-9]\+' || echo "0")
number=$((10#$number))
if [ "$number" -gt "$highest" ]; then
highest=$number
fi
done
fi
echo "$highest"
}
# Function to get highest number from git branches
get_highest_from_branches() {
local highest=0
# Get all branches (local and remote)
branches=$(git branch -a 2>/dev/null || echo "")
if [ -n "$branches" ]; then
while IFS= read -r branch; do
# Clean branch name: remove leading markers and remote prefixes
clean_branch=$(echo "$branch" | sed 's/^[* ]*//; s|^remotes/[^/]*/||')
# Extract feature number if branch matches pattern ###-*
if echo "$clean_branch" | grep -q '^[0-9]\{3\}-'; then
number=$(echo "$clean_branch" | grep -o '^[0-9]\{3\}' || echo "0")
number=$((10#$number))
if [ "$number" -gt "$highest" ]; then
highest=$number
fi
fi
done <<< "$branches"
fi
echo "$highest"
}
# Function to check existing branches (local and remote) and return next available number
check_existing_branches() {
local specs_dir="$1"
# Fetch all remotes to get latest branch info (suppress errors if no remotes)
git fetch --all --prune 2>/dev/null || true
# Get highest number from ALL branches (not just matching short name)
local highest_branch=$(get_highest_from_branches)
# Get highest number from ALL specs (not just matching short name)
local highest_spec=$(get_highest_from_specs "$specs_dir")
# Take the maximum of both
local max_num=$highest_branch
if [ "$highest_spec" -gt "$max_num" ]; then
max_num=$highest_spec
fi
# Return next number
echo $((max_num + 1))
}
# Function to clean and format a branch name
clean_branch_name() {
local name="$1"
echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/-\+/-/g' | sed 's/^-//' | sed 's/-$//'
}
# Resolve repository root. Prefer git information when available, but fall back
# to searching for repository markers so the workflow still functions in repositories that
# were initialised with --no-git.
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if git rev-parse --show-toplevel >/dev/null 2>&1; then
REPO_ROOT=$(git rev-parse --show-toplevel)
HAS_GIT=true
else
REPO_ROOT="$(find_repo_root "$SCRIPT_DIR")"
if [ -z "$REPO_ROOT" ]; then
echo "Error: Could not determine repository root. Please run this script from within the repository." >&2
exit 1
fi
HAS_GIT=false
fi
cd "$REPO_ROOT"
SPECS_DIR="$REPO_ROOT/specs"
mkdir -p "$SPECS_DIR"
# Function to generate branch name with stop word filtering and length filtering
generate_branch_name() {
local description="$1"
# Common stop words to filter out
local stop_words="^(i|a|an|the|to|for|of|in|on|at|by|with|from|is|are|was|were|be|been|being|have|has|had|do|does|did|will|would|should|could|can|may|might|must|shall|this|that|these|those|my|your|our|their|want|need|add|get|set)$"
# Convert to lowercase and split into words
local clean_name=$(echo "$description" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/ /g')
# Filter words: remove stop words and words shorter than 3 chars (unless they're uppercase acronyms in original)
local meaningful_words=()
for word in $clean_name; do
# Skip empty words
[ -z "$word" ] && continue
# Keep words that are NOT stop words AND (length >= 3 OR are potential acronyms)
if ! echo "$word" | grep -qiE "$stop_words"; then
if [ ${#word} -ge 3 ]; then
meaningful_words+=("$word")
elif echo "$description" | grep -q "\b${word^^}\b"; then
# Keep short words if they appear as uppercase in original (likely acronyms)
meaningful_words+=("$word")
fi
fi
done
# If we have meaningful words, use first 3-4 of them
if [ ${#meaningful_words[@]} -gt 0 ]; then
local max_words=3
if [ ${#meaningful_words[@]} -eq 4 ]; then max_words=4; fi
local result=""
local count=0
for word in "${meaningful_words[@]}"; do
if [ $count -ge $max_words ]; then break; fi
if [ -n "$result" ]; then result="$result-"; fi
result="$result$word"
count=$((count + 1))
done
echo "$result"
else
# Fallback to original logic if no meaningful words found
local cleaned=$(clean_branch_name "$description")
echo "$cleaned" | tr '-' '\n' | grep -v '^$' | head -3 | tr '\n' '-' | sed 's/-$//'
fi
}
# Generate branch name
if [ -n "$SHORT_NAME" ]; then
# Use provided short name, just clean it up
BRANCH_SUFFIX=$(clean_branch_name "$SHORT_NAME")
else
# Generate from description with smart filtering
BRANCH_SUFFIX=$(generate_branch_name "$FEATURE_DESCRIPTION")
fi
# Determine branch number
if [ -z "$BRANCH_NUMBER" ]; then
if [ "$HAS_GIT" = true ]; then
# Check existing branches on remotes
BRANCH_NUMBER=$(check_existing_branches "$SPECS_DIR")
else
# Fall back to local directory check
HIGHEST=$(get_highest_from_specs "$SPECS_DIR")
BRANCH_NUMBER=$((HIGHEST + 1))
fi
fi
# Force base-10 interpretation to prevent octal conversion (e.g., 010 → 8 in octal, but should be 10 in decimal)
FEATURE_NUM=$(printf "%03d" "$((10#$BRANCH_NUMBER))")
BRANCH_NAME="${FEATURE_NUM}-${BRANCH_SUFFIX}"
# GitHub enforces a 244-byte limit on branch names
# Validate and truncate if necessary
MAX_BRANCH_LENGTH=244
if [ ${#BRANCH_NAME} -gt $MAX_BRANCH_LENGTH ]; then
# Calculate how much we need to trim from suffix
# Account for: feature number (3) + hyphen (1) = 4 chars
MAX_SUFFIX_LENGTH=$((MAX_BRANCH_LENGTH - 4))
# Truncate suffix at word boundary if possible
TRUNCATED_SUFFIX=$(echo "$BRANCH_SUFFIX" | cut -c1-$MAX_SUFFIX_LENGTH)
# Remove trailing hyphen if truncation created one
TRUNCATED_SUFFIX=$(echo "$TRUNCATED_SUFFIX" | sed 's/-$//')
ORIGINAL_BRANCH_NAME="$BRANCH_NAME"
BRANCH_NAME="${FEATURE_NUM}-${TRUNCATED_SUFFIX}"
>&2 echo "[specify] Warning: Branch name exceeded GitHub's 244-byte limit"
>&2 echo "[specify] Original: $ORIGINAL_BRANCH_NAME (${#ORIGINAL_BRANCH_NAME} bytes)"
>&2 echo "[specify] Truncated to: $BRANCH_NAME (${#BRANCH_NAME} bytes)"
fi
if [ "$HAS_GIT" = true ]; then
git checkout -b "$BRANCH_NAME"
else
>&2 echo "[specify] Warning: Git repository not detected; skipped branch creation for $BRANCH_NAME"
fi
FEATURE_DIR="$SPECS_DIR/$BRANCH_NAME"
mkdir -p "$FEATURE_DIR"
TEMPLATE="$REPO_ROOT/.specify/templates/spec-template.md"
SPEC_FILE="$FEATURE_DIR/spec.md"
if [ -f "$TEMPLATE" ]; then cp "$TEMPLATE" "$SPEC_FILE"; else touch "$SPEC_FILE"; fi
# Set the SPECIFY_FEATURE environment variable for the current session
export SPECIFY_FEATURE="$BRANCH_NAME"
if $JSON_MODE; then
printf '{"BRANCH_NAME":"%s","SPEC_FILE":"%s","FEATURE_NUM":"%s"}\n' "$BRANCH_NAME" "$SPEC_FILE" "$FEATURE_NUM"
else
echo "BRANCH_NAME: $BRANCH_NAME"
echo "SPEC_FILE: $SPEC_FILE"
echo "FEATURE_NUM: $FEATURE_NUM"
echo "SPECIFY_FEATURE environment variable set to: $BRANCH_NAME"
fi
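A hedged example of the two invocation styles (the branch number, repo path, and resulting names are illustrative):

```bash
# Auto-numbered branch, suffix derived from the description after stop-word filtering
./create-new-feature.sh --json "Add user authentication"
# {"BRANCH_NAME":"005-user-authentication","SPEC_FILE":"/repo/specs/005-user-authentication/spec.md","FEATURE_NUM":"005"}

# Explicit number and short name
./create-new-feature.sh --json --number 7 --short-name "plugin loader" "Plugin loader and settings API"
# {"BRANCH_NAME":"007-plugin-loader","SPEC_FILE":"/repo/specs/007-plugin-loader/spec.md","FEATURE_NUM":"007"}
```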


@@ -0,0 +1,61 @@
#!/usr/bin/env bash
set -e
# Parse command line arguments
JSON_MODE=false
ARGS=()
for arg in "$@"; do
case "$arg" in
--json)
JSON_MODE=true
;;
--help|-h)
echo "Usage: $0 [--json]"
echo " --json Output results in JSON format"
echo " --help Show this help message"
exit 0
;;
*)
ARGS+=("$arg")
;;
esac
done
# Get script directory and load common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
# Get all paths and variables from common functions
eval $(get_feature_paths)
# Check if we're on a proper feature branch (only for git repos)
check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
# Ensure the feature directory exists
mkdir -p "$FEATURE_DIR"
# Copy plan template if it exists
TEMPLATE="$REPO_ROOT/.specify/templates/plan-template.md"
if [[ -f "$TEMPLATE" ]]; then
cp "$TEMPLATE" "$IMPL_PLAN"
echo "Copied plan template to $IMPL_PLAN"
else
echo "Warning: Plan template not found at $TEMPLATE"
# Create a basic plan file if template doesn't exist
touch "$IMPL_PLAN"
fi
# Output results
if $JSON_MODE; then
printf '{"FEATURE_SPEC":"%s","IMPL_PLAN":"%s","SPECS_DIR":"%s","BRANCH":"%s","HAS_GIT":"%s"}\n' \
"$FEATURE_SPEC" "$IMPL_PLAN" "$FEATURE_DIR" "$CURRENT_BRANCH" "$HAS_GIT"
else
echo "FEATURE_SPEC: $FEATURE_SPEC"
echo "IMPL_PLAN: $IMPL_PLAN"
echo "SPECS_DIR: $FEATURE_DIR"
echo "BRANCH: $CURRENT_BRANCH"
echo "HAS_GIT: $HAS_GIT"
fi
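Typical invocation from a feature branch (paths illustrative; note that the template-copy message is printed before the JSON payload):

```bash
./setup-plan.sh --json
# Copied plan template to /repo/specs/001-user-auth/plan.md
# {"FEATURE_SPEC":"/repo/specs/001-user-auth/spec.md","IMPL_PLAN":"/repo/specs/001-user-auth/plan.md","SPECS_DIR":"/repo/specs/001-user-auth","BRANCH":"001-user-auth","HAS_GIT":"true"}
```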


@@ -0,0 +1,799 @@
#!/usr/bin/env bash
# Update agent context files with information from plan.md
#
# This script maintains AI agent context files by parsing feature specifications
# and updating agent-specific configuration files with project information.
#
# MAIN FUNCTIONS:
# 1. Environment Validation
# - Verifies git repository structure and branch information
# - Checks for required plan.md files and templates
# - Validates file permissions and accessibility
#
# 2. Plan Data Extraction
# - Parses plan.md files to extract project metadata
# - Identifies language/version, frameworks, databases, and project types
# - Handles missing or incomplete specification data gracefully
#
# 3. Agent File Management
# - Creates new agent context files from templates when needed
# - Updates existing agent files with new project information
# - Preserves manual additions and custom configurations
# - Supports multiple AI agent formats and directory structures
#
# 4. Content Generation
# - Generates language-specific build/test commands
# - Creates appropriate project directory structures
# - Updates technology stacks and recent changes sections
# - Maintains consistent formatting and timestamps
#
# 5. Multi-Agent Support
# - Handles agent-specific file paths and naming conventions
# - Supports: Claude, Gemini, Copilot, Cursor, Qwen, opencode, Codex, Windsurf, Kilo Code, Auggie CLI, Roo Code, CodeBuddy CLI, Qoder CLI, Amp, SHAI, or Amazon Q Developer CLI
# - Can update single agents or all existing agent files
# - Creates default Claude file if no agent files exist
#
# Usage: ./update-agent-context.sh [agent_type]
# Agent types: claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|shai|q|bob|qoder
# Leave empty to update all existing agent files
set -e
# Enable strict error handling
set -u
set -o pipefail
#==============================================================================
# Configuration and Global Variables
#==============================================================================
# Get script directory and load common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
# Get all paths and variables from common functions
eval $(get_feature_paths)
NEW_PLAN="$IMPL_PLAN" # Alias for compatibility with existing code
AGENT_TYPE="${1:-}"
# Agent-specific file paths
CLAUDE_FILE="$REPO_ROOT/CLAUDE.md"
GEMINI_FILE="$REPO_ROOT/GEMINI.md"
COPILOT_FILE="$REPO_ROOT/.github/agents/copilot-instructions.md"
CURSOR_FILE="$REPO_ROOT/.cursor/rules/specify-rules.mdc"
QWEN_FILE="$REPO_ROOT/QWEN.md"
AGENTS_FILE="$REPO_ROOT/AGENTS.md"
WINDSURF_FILE="$REPO_ROOT/.windsurf/rules/specify-rules.md"
KILOCODE_FILE="$REPO_ROOT/.kilocode/rules/specify-rules.md"
AUGGIE_FILE="$REPO_ROOT/.augment/rules/specify-rules.md"
ROO_FILE="$REPO_ROOT/.roo/rules/specify-rules.md"
CODEBUDDY_FILE="$REPO_ROOT/CODEBUDDY.md"
QODER_FILE="$REPO_ROOT/QODER.md"
AMP_FILE="$REPO_ROOT/AGENTS.md"
SHAI_FILE="$REPO_ROOT/SHAI.md"
Q_FILE="$REPO_ROOT/AGENTS.md"
BOB_FILE="$REPO_ROOT/AGENTS.md"
# Template file
TEMPLATE_FILE="$REPO_ROOT/.specify/templates/agent-file-template.md"
# Global variables for parsed plan data
NEW_LANG=""
NEW_FRAMEWORK=""
NEW_DB=""
NEW_PROJECT_TYPE=""
#==============================================================================
# Utility Functions
#==============================================================================
log_info() {
echo "INFO: $1"
}
log_success() {
echo "$1"
}
log_error() {
echo "ERROR: $1" >&2
}
log_warning() {
echo "WARNING: $1" >&2
}
# Cleanup function for temporary files
cleanup() {
local exit_code=$?
rm -f /tmp/agent_update_*_$$
rm -f /tmp/manual_additions_$$
exit $exit_code
}
# Set up cleanup trap
trap cleanup EXIT INT TERM
#==============================================================================
# Validation Functions
#==============================================================================
validate_environment() {
# Check if we have a current branch/feature (git or non-git)
if [[ -z "$CURRENT_BRANCH" ]]; then
log_error "Unable to determine current feature"
if [[ "$HAS_GIT" == "true" ]]; then
log_info "Make sure you're on a feature branch"
else
log_info "Set SPECIFY_FEATURE environment variable or create a feature first"
fi
exit 1
fi
# Check if plan.md exists
if [[ ! -f "$NEW_PLAN" ]]; then
log_error "No plan.md found at $NEW_PLAN"
log_info "Make sure you're working on a feature with a corresponding spec directory"
if [[ "$HAS_GIT" != "true" ]]; then
log_info "Use: export SPECIFY_FEATURE=your-feature-name or create a new feature first"
fi
exit 1
fi
# Check if template exists (needed for new files)
if [[ ! -f "$TEMPLATE_FILE" ]]; then
log_warning "Template file not found at $TEMPLATE_FILE"
log_warning "Creating new agent files will fail"
fi
}
#==============================================================================
# Plan Parsing Functions
#==============================================================================
extract_plan_field() {
local field_pattern="$1"
local plan_file="$2"
grep "^\*\*${field_pattern}\*\*: " "$plan_file" 2>/dev/null | \
head -1 | \
sed "s|^\*\*${field_pattern}\*\*: ||" | \
sed 's/^[ \t]*//;s/[ \t]*$//' | \
grep -v "NEEDS CLARIFICATION" | \
grep -v "^N/A$" || echo ""
}
parse_plan_data() {
local plan_file="$1"
if [[ ! -f "$plan_file" ]]; then
log_error "Plan file not found: $plan_file"
return 1
fi
if [[ ! -r "$plan_file" ]]; then
log_error "Plan file is not readable: $plan_file"
return 1
fi
log_info "Parsing plan data from $plan_file"
NEW_LANG=$(extract_plan_field "Language/Version" "$plan_file")
NEW_FRAMEWORK=$(extract_plan_field "Primary Dependencies" "$plan_file")
NEW_DB=$(extract_plan_field "Storage" "$plan_file")
NEW_PROJECT_TYPE=$(extract_plan_field "Project Type" "$plan_file")
# Log what we found
if [[ -n "$NEW_LANG" ]]; then
log_info "Found language: $NEW_LANG"
else
log_warning "No language information found in plan"
fi
if [[ -n "$NEW_FRAMEWORK" ]]; then
log_info "Found framework: $NEW_FRAMEWORK"
fi
if [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]]; then
log_info "Found database: $NEW_DB"
fi
if [[ -n "$NEW_PROJECT_TYPE" ]]; then
log_info "Found project type: $NEW_PROJECT_TYPE"
fi
}
format_technology_stack() {
local lang="$1"
local framework="$2"
local parts=()
# Add non-empty parts
[[ -n "$lang" && "$lang" != "NEEDS CLARIFICATION" ]] && parts+=("$lang")
[[ -n "$framework" && "$framework" != "NEEDS CLARIFICATION" && "$framework" != "N/A" ]] && parts+=("$framework")
# Join with proper formatting
if [[ ${#parts[@]} -eq 0 ]]; then
echo ""
elif [[ ${#parts[@]} -eq 1 ]]; then
echo "${parts[0]}"
else
# Join multiple parts with " + "
local result="${parts[0]}"
for ((i=1; i<${#parts[@]}; i++)); do
result="$result + ${parts[i]}"
done
echo "$result"
fi
}
#==============================================================================
# Template and Content Generation Functions
#==============================================================================
get_project_structure() {
local project_type="$1"
if [[ "$project_type" == *"web"* ]]; then
echo "backend/\\nfrontend/\\ntests/"
else
echo "src/\\ntests/"
fi
}
get_commands_for_language() {
local lang="$1"
case "$lang" in
*"Python"*)
echo "cd src && pytest && ruff check ."
;;
*"Rust"*)
echo "cargo test && cargo clippy"
;;
*"JavaScript"*|*"TypeScript"*)
echo "npm test \\&\\& npm run lint"
;;
*)
echo "# Add commands for $lang"
;;
esac
}
get_language_conventions() {
local lang="$1"
echo "$lang: Follow standard conventions"
}
create_new_agent_file() {
local target_file="$1"
local temp_file="$2"
local project_name="$3"
local current_date="$4"
if [[ ! -f "$TEMPLATE_FILE" ]]; then
log_error "Template not found at $TEMPLATE_FILE"
return 1
fi
if [[ ! -r "$TEMPLATE_FILE" ]]; then
log_error "Template file is not readable: $TEMPLATE_FILE"
return 1
fi
log_info "Creating new agent context file from template..."
if ! cp "$TEMPLATE_FILE" "$temp_file"; then
log_error "Failed to copy template file"
return 1
fi
# Replace template placeholders
local project_structure
project_structure=$(get_project_structure "$NEW_PROJECT_TYPE")
local commands
commands=$(get_commands_for_language "$NEW_LANG")
local language_conventions
language_conventions=$(get_language_conventions "$NEW_LANG")
# Perform substitutions with error checking using safer approach
# Escape special characters for sed by using a different delimiter or escaping
local escaped_lang=$(printf '%s\n' "$NEW_LANG" | sed 's/[\[\.*^$()+{}|]/\\&/g')
local escaped_framework=$(printf '%s\n' "$NEW_FRAMEWORK" | sed 's/[\[\.*^$()+{}|]/\\&/g')
local escaped_branch=$(printf '%s\n' "$CURRENT_BRANCH" | sed 's/[\[\.*^$()+{}|]/\\&/g')
# Build technology stack and recent change strings conditionally
local tech_stack
if [[ -n "$escaped_lang" && -n "$escaped_framework" ]]; then
tech_stack="- $escaped_lang + $escaped_framework ($escaped_branch)"
elif [[ -n "$escaped_lang" ]]; then
tech_stack="- $escaped_lang ($escaped_branch)"
elif [[ -n "$escaped_framework" ]]; then
tech_stack="- $escaped_framework ($escaped_branch)"
else
tech_stack="- ($escaped_branch)"
fi
local recent_change
if [[ -n "$escaped_lang" && -n "$escaped_framework" ]]; then
recent_change="- $escaped_branch: Added $escaped_lang + $escaped_framework"
elif [[ -n "$escaped_lang" ]]; then
recent_change="- $escaped_branch: Added $escaped_lang"
elif [[ -n "$escaped_framework" ]]; then
recent_change="- $escaped_branch: Added $escaped_framework"
else
recent_change="- $escaped_branch: Added"
fi
local substitutions=(
"s|\[PROJECT NAME\]|$project_name|"
"s|\[DATE\]|$current_date|"
"s|\[EXTRACTED FROM ALL PLAN.MD FILES\]|$tech_stack|"
"s|\[ACTUAL STRUCTURE FROM PLANS\]|$project_structure|g"
"s|\[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES\]|$commands|"
"s|\[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE\]|$language_conventions|"
"s|\[LAST 3 FEATURES AND WHAT THEY ADDED\]|$recent_change|"
)
for substitution in "${substitutions[@]}"; do
if ! sed -i.bak -e "$substitution" "$temp_file"; then
log_error "Failed to perform substitution: $substitution"
rm -f "$temp_file" "$temp_file.bak"
return 1
fi
done
# Convert \n sequences to actual newlines
newline=$(printf '\n')
sed -i.bak2 "s/\\\\n/${newline}/g" "$temp_file"
# Clean up backup files
rm -f "$temp_file.bak" "$temp_file.bak2"
return 0
}
update_existing_agent_file() {
local target_file="$1"
local current_date="$2"
log_info "Updating existing agent context file..."
# Use a single temporary file for atomic update
local temp_file
temp_file=$(mktemp) || {
log_error "Failed to create temporary file"
return 1
}
# Process the file in one pass
local tech_stack=$(format_technology_stack "$NEW_LANG" "$NEW_FRAMEWORK")
local new_tech_entries=()
local new_change_entry=""
# Prepare new technology entries
if [[ -n "$tech_stack" ]] && ! grep -q "$tech_stack" "$target_file"; then
new_tech_entries+=("- $tech_stack ($CURRENT_BRANCH)")
fi
if [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]] && [[ "$NEW_DB" != "NEEDS CLARIFICATION" ]] && ! grep -q "$NEW_DB" "$target_file"; then
new_tech_entries+=("- $NEW_DB ($CURRENT_BRANCH)")
fi
# Prepare new change entry
if [[ -n "$tech_stack" ]]; then
new_change_entry="- $CURRENT_BRANCH: Added $tech_stack"
elif [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]] && [[ "$NEW_DB" != "NEEDS CLARIFICATION" ]]; then
new_change_entry="- $CURRENT_BRANCH: Added $NEW_DB"
fi
# Check if sections exist in the file
local has_active_technologies=0
local has_recent_changes=0
if grep -q "^## Active Technologies" "$target_file" 2>/dev/null; then
has_active_technologies=1
fi
if grep -q "^## Recent Changes" "$target_file" 2>/dev/null; then
has_recent_changes=1
fi
# Process file line by line
local in_tech_section=false
local in_changes_section=false
local tech_entries_added=false
local changes_entries_added=false
local existing_changes_count=0
local file_ended=false
while IFS= read -r line || [[ -n "$line" ]]; do
# Handle Active Technologies section
if [[ "$line" == "## Active Technologies" ]]; then
echo "$line" >> "$temp_file"
in_tech_section=true
continue
elif [[ $in_tech_section == true ]] && [[ "$line" =~ ^##[[:space:]] ]]; then
# Add new tech entries before closing the section
if [[ $tech_entries_added == false ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
tech_entries_added=true
fi
echo "$line" >> "$temp_file"
in_tech_section=false
continue
elif [[ $in_tech_section == true ]] && [[ -z "$line" ]]; then
# Add new tech entries before empty line in tech section
if [[ $tech_entries_added == false ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
tech_entries_added=true
fi
echo "$line" >> "$temp_file"
continue
fi
# Handle Recent Changes section
if [[ "$line" == "## Recent Changes" ]]; then
echo "$line" >> "$temp_file"
# Add new change entry right after the heading
if [[ -n "$new_change_entry" ]]; then
echo "$new_change_entry" >> "$temp_file"
fi
in_changes_section=true
changes_entries_added=true
continue
elif [[ $in_changes_section == true ]] && [[ "$line" =~ ^##[[:space:]] ]]; then
echo "$line" >> "$temp_file"
in_changes_section=false
continue
elif [[ $in_changes_section == true ]] && [[ "$line" == "- "* ]]; then
# Keep only first 2 existing changes
if [[ $existing_changes_count -lt 2 ]]; then
echo "$line" >> "$temp_file"
((existing_changes_count++))
fi
continue
fi
# Update timestamp
if [[ "$line" =~ \*\*Last\ updated\*\*:.*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ]]; then
echo "$line" | sed "s/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/$current_date/" >> "$temp_file"
else
echo "$line" >> "$temp_file"
fi
done < "$target_file"
# Post-loop check: if we're still in the Active Technologies section and haven't added new entries
if [[ $in_tech_section == true ]] && [[ $tech_entries_added == false ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
tech_entries_added=true
fi
# If sections don't exist, add them at the end of the file
if [[ $has_active_technologies -eq 0 ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
echo "" >> "$temp_file"
echo "## Active Technologies" >> "$temp_file"
printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
tech_entries_added=true
fi
if [[ $has_recent_changes -eq 0 ]] && [[ -n "$new_change_entry" ]]; then
echo "" >> "$temp_file"
echo "## Recent Changes" >> "$temp_file"
echo "$new_change_entry" >> "$temp_file"
changes_entries_added=true
fi
# Move temp file to target atomically
if ! mv "$temp_file" "$target_file"; then
log_error "Failed to update target file"
rm -f "$temp_file"
return 1
fi
return 0
}
#==============================================================================
# Main Agent File Update Function
#==============================================================================
update_agent_file() {
local target_file="$1"
local agent_name="$2"
if [[ -z "$target_file" ]] || [[ -z "$agent_name" ]]; then
log_error "update_agent_file requires target_file and agent_name parameters"
return 1
fi
log_info "Updating $agent_name context file: $target_file"
local project_name
project_name=$(basename "$REPO_ROOT")
local current_date
current_date=$(date +%Y-%m-%d)
# Create directory if it doesn't exist
local target_dir
target_dir=$(dirname "$target_file")
if [[ ! -d "$target_dir" ]]; then
if ! mkdir -p "$target_dir"; then
log_error "Failed to create directory: $target_dir"
return 1
fi
fi
if [[ ! -f "$target_file" ]]; then
# Create new file from template
local temp_file
temp_file=$(mktemp) || {
log_error "Failed to create temporary file"
return 1
}
if create_new_agent_file "$target_file" "$temp_file" "$project_name" "$current_date"; then
if mv "$temp_file" "$target_file"; then
log_success "Created new $agent_name context file"
else
log_error "Failed to move temporary file to $target_file"
rm -f "$temp_file"
return 1
fi
else
log_error "Failed to create new agent file"
rm -f "$temp_file"
return 1
fi
else
# Update existing file
if [[ ! -r "$target_file" ]]; then
log_error "Cannot read existing file: $target_file"
return 1
fi
if [[ ! -w "$target_file" ]]; then
log_error "Cannot write to existing file: $target_file"
return 1
fi
if update_existing_agent_file "$target_file" "$current_date"; then
log_success "Updated existing $agent_name context file"
else
log_error "Failed to update existing agent file"
return 1
fi
fi
return 0
}
#==============================================================================
# Agent Selection and Processing
#==============================================================================
update_specific_agent() {
local agent_type="$1"
case "$agent_type" in
claude)
update_agent_file "$CLAUDE_FILE" "Claude Code"
;;
gemini)
update_agent_file "$GEMINI_FILE" "Gemini CLI"
;;
copilot)
update_agent_file "$COPILOT_FILE" "GitHub Copilot"
;;
cursor-agent)
update_agent_file "$CURSOR_FILE" "Cursor IDE"
;;
qwen)
update_agent_file "$QWEN_FILE" "Qwen Code"
;;
opencode)
update_agent_file "$AGENTS_FILE" "opencode"
;;
codex)
update_agent_file "$AGENTS_FILE" "Codex CLI"
;;
windsurf)
update_agent_file "$WINDSURF_FILE" "Windsurf"
;;
kilocode)
update_agent_file "$KILOCODE_FILE" "Kilo Code"
;;
auggie)
update_agent_file "$AUGGIE_FILE" "Auggie CLI"
;;
roo)
update_agent_file "$ROO_FILE" "Roo Code"
;;
codebuddy)
update_agent_file "$CODEBUDDY_FILE" "CodeBuddy CLI"
;;
qoder)
update_agent_file "$QODER_FILE" "Qoder CLI"
;;
amp)
update_agent_file "$AMP_FILE" "Amp"
;;
shai)
update_agent_file "$SHAI_FILE" "SHAI"
;;
q)
update_agent_file "$Q_FILE" "Amazon Q Developer CLI"
;;
bob)
update_agent_file "$BOB_FILE" "IBM Bob"
;;
*)
log_error "Unknown agent type '$agent_type'"
log_error "Expected: claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|roo|amp|shai|q|bob|qoder"
exit 1
;;
esac
}
update_all_existing_agents() {
local found_agent=false
# Check each possible agent file and update if it exists
if [[ -f "$CLAUDE_FILE" ]]; then
update_agent_file "$CLAUDE_FILE" "Claude Code"
found_agent=true
fi
if [[ -f "$GEMINI_FILE" ]]; then
update_agent_file "$GEMINI_FILE" "Gemini CLI"
found_agent=true
fi
if [[ -f "$COPILOT_FILE" ]]; then
update_agent_file "$COPILOT_FILE" "GitHub Copilot"
found_agent=true
fi
if [[ -f "$CURSOR_FILE" ]]; then
update_agent_file "$CURSOR_FILE" "Cursor IDE"
found_agent=true
fi
if [[ -f "$QWEN_FILE" ]]; then
update_agent_file "$QWEN_FILE" "Qwen Code"
found_agent=true
fi
if [[ -f "$AGENTS_FILE" ]]; then
update_agent_file "$AGENTS_FILE" "Codex/opencode"
found_agent=true
fi
if [[ -f "$WINDSURF_FILE" ]]; then
update_agent_file "$WINDSURF_FILE" "Windsurf"
found_agent=true
fi
if [[ -f "$KILOCODE_FILE" ]]; then
update_agent_file "$KILOCODE_FILE" "Kilo Code"
found_agent=true
fi
if [[ -f "$AUGGIE_FILE" ]]; then
update_agent_file "$AUGGIE_FILE" "Auggie CLI"
found_agent=true
fi
if [[ -f "$ROO_FILE" ]]; then
update_agent_file "$ROO_FILE" "Roo Code"
found_agent=true
fi
if [[ -f "$CODEBUDDY_FILE" ]]; then
update_agent_file "$CODEBUDDY_FILE" "CodeBuddy CLI"
found_agent=true
fi
if [[ -f "$SHAI_FILE" ]]; then
update_agent_file "$SHAI_FILE" "SHAI"
found_agent=true
fi
if [[ -f "$QODER_FILE" ]]; then
update_agent_file "$QODER_FILE" "Qoder CLI"
found_agent=true
fi
if [[ -f "$Q_FILE" ]]; then
update_agent_file "$Q_FILE" "Amazon Q Developer CLI"
found_agent=true
fi
if [[ -f "$BOB_FILE" ]]; then
update_agent_file "$BOB_FILE" "IBM Bob"
found_agent=true
fi
# If no agent files exist, create a default Claude file
if [[ "$found_agent" == false ]]; then
log_info "No existing agent files found, creating default Claude file..."
update_agent_file "$CLAUDE_FILE" "Claude Code"
fi
}
print_summary() {
echo
log_info "Summary of changes:"
if [[ -n "$NEW_LANG" ]]; then
echo " - Added language: $NEW_LANG"
fi
if [[ -n "$NEW_FRAMEWORK" ]]; then
echo " - Added framework: $NEW_FRAMEWORK"
fi
if [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]]; then
echo " - Added database: $NEW_DB"
fi
echo
log_info "Usage: $0 [claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|codebuddy|shai|q|bob|qoder]"
}
#==============================================================================
# Main Execution
#==============================================================================
main() {
# Validate environment before proceeding
validate_environment
log_info "=== Updating agent context files for feature $CURRENT_BRANCH ==="
# Parse the plan file to extract project information
if ! parse_plan_data "$NEW_PLAN"; then
log_error "Failed to parse plan data"
exit 1
fi
# Process based on agent type argument
local success=true
if [[ -z "$AGENT_TYPE" ]]; then
# No specific agent provided - update all existing agent files
log_info "No agent specified, updating all existing agent files..."
if ! update_all_existing_agents; then
success=false
fi
else
# Specific agent provided - update only that agent
log_info "Updating specific agent: $AGENT_TYPE"
if ! update_specific_agent "$AGENT_TYPE"; then
success=false
fi
fi
# Print summary
print_summary
if [[ "$success" == true ]]; then
log_success "Agent context update completed successfully"
exit 0
else
log_error "Agent context update completed with errors"
exit 1
fi
}
# Execute main function if script is run directly
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
main "$@"
fi
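Hedged usage examples matching the plan command change earlier in this diff (the feature name, plan contents, and log lines are illustrative):

```bash
# Update only the Kilo Code context file, as the /speckit.plan step now instructs
./update-agent-context.sh kilocode
# INFO: === Updating agent context files for feature 001-user-auth ===
# INFO: Parsing plan data from /repo/specs/001-user-auth/plan.md
# INFO: Found language: Python 3.12
# ...

# With no argument, every agent context file that already exists is refreshed
./update-agent-context.sh
```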


@@ -31,10 +31,7 @@
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- [ ] **Causal Validity**: Do all planned modules/components have defined Contracts (inputs/outputs/props/events) before implementation logic?
- [ ] **Immutability**: Are architectural layers and constraints defined in Module/Component Headers?
- [ ] **Format Compliance**: Does the plan ensure all code will be wrapped in `[DEF]` anchors?
- [ ] **Belief State**: Is logging planned to follow the `Entry` -> `Validation` -> `Action` -> `Coherence` state transition model?
[Gates determined based on constitution file]
## Project Structure


@@ -95,12 +95,6 @@
- **FR-006**: System MUST authenticate users via [NEEDS CLARIFICATION: auth method not specified - email/password, SSO, OAuth?]
- **FR-007**: System MUST retain user data for [NEEDS CLARIFICATION: retention period not specified]
### System Invariants (Constitution Check)
*Define immutable constraints that will become `@INVARIANT` or `@CONSTRAINT` tags in Module Headers.*
- **INV-001**: [e.g., "No direct database access from UI layer"]
- **INV-002**: [e.g., "All financial calculations must use Decimal type"]
### Key Entities *(include if feature involves data)*
- **[Entity 1]**: [What it represents, key attributes without implementation]


@@ -88,14 +88,12 @@ Examples of foundational tasks (adjust based on your project):
### Implementation for User Story 1
- [ ] T012 [P] [US1] Define [Entity1] Module Header & Contracts in src/models/[entity1].py
- [ ] T013 [P] [US1] Implement [Entity1] logic satisfying contracts
- [ ] T014 [P] [US1] Define [Service] Module Header & Contracts in src/services/[service].py
- [ ] T015 [US1] Implement [Service] logic satisfying contracts (depends on T012)
- [ ] T016 [US1] Define [endpoint] Contracts & Logic in src/[location]/[file].py
- [ ] T017 [US1] Define [Component] Header (Props/Events) in frontend/src/components/[Component].svelte
- [ ] T018 [US1] Implement [Component] logic satisfying contracts
- [ ] T019 [US1] Verify `[DEF]` syntax and Belief State logging compliance
- [ ] T012 [P] [US1] Create [Entity1] model in src/models/[entity1].py
- [ ] T013 [P] [US1] Create [Entity2] model in src/models/[entity2].py
- [ ] T014 [US1] Implement [Service] in src/services/[service].py (depends on T012, T013)
- [ ] T015 [US1] Implement [endpoint/feature] in src/[location]/[file].py
- [ ] T016 [US1] Add validation and error handling
- [ ] T017 [US1] Add logging for user story 1 operations
**Checkpoint**: At this point, User Story 1 should be fully functional and testable independently
@@ -109,16 +107,15 @@ Examples of foundational tasks (adjust based on your project):
### Tests for User Story 2 (OPTIONAL - only if tests requested) ⚠️
- [ ] T020 [P] [US2] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T021 [P] [US2] Integration test for [user journey] in tests/integration/test_[name].py
- [ ] T018 [P] [US2] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T019 [P] [US2] Integration test for [user journey] in tests/integration/test_[name].py
### Implementation for User Story 2
- [ ] T022 [P] [US2] Define [Entity] Module Header & Contracts in src/models/[entity].py
- [ ] T023 [P] [US2] Implement [Entity] logic satisfying contracts
- [ ] T024 [US2] Define [Service] Module Header & Contracts in src/services/[service].py
- [ ] T025 [US2] Implement [Service] logic satisfying contracts
- [ ] T026 [US2] Define [Component] Header & Logic in frontend/src/components/[Component].svelte
- [ ] T020 [P] [US2] Create [Entity] model in src/models/[entity].py
- [ ] T021 [US2] Implement [Service] in src/services/[service].py
- [ ] T022 [US2] Implement [endpoint/feature] in src/[location]/[file].py
- [ ] T023 [US2] Integrate with User Story 1 components (if needed)
**Checkpoint**: At this point, User Stories 1 AND 2 should both work independently
@@ -132,15 +129,14 @@ Examples of foundational tasks (adjust based on your project):
### Tests for User Story 3 (OPTIONAL - only if tests requested) ⚠️
- [ ] T027 [P] [US3] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T028 [P] [US3] Integration test for [user journey] in tests/integration/test_[name].py
- [ ] T024 [P] [US3] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T025 [P] [US3] Integration test for [user journey] in tests/integration/test_[name].py
### Implementation for User Story 3
- [ ] T029 [P] [US3] Define [Entity] Module Header & Contracts in src/models/[entity].py
- [ ] T030 [US3] Define [Service] Module Header & Contracts in src/services/[service].py
- [ ] T031 [US3] Implement logic for [Entity] and [Service] satisfying contracts
- [ ] T032 [US3] Define [Component] Header & Logic in frontend/src/components/[Component].svelte
- [ ] T026 [P] [US3] Create [Entity] model in src/models/[entity].py
- [ ] T027 [US3] Implement [Service] in src/services/[service].py
- [ ] T028 [US3] Implement [endpoint/feature] in src/[location]/[file].py
**Checkpoint**: All user stories should now be independently functional
@@ -183,10 +179,9 @@ Examples of foundational tasks (adjust based on your project):
### Within Each User Story
- Tests (if included) MUST be written and FAIL before implementation
- Module/Component Headers & Contracts BEFORE Implementation (Causal Validity)
- Models before services
- Services before endpoints
- Components before Pages
- Core implementation before integration
- Story complete before moving to next priority
### Parallel Opportunities
@@ -207,9 +202,9 @@ Examples of foundational tasks (adjust based on your project):
Task: "Contract test for [endpoint] in tests/contract/test_[name].py"
Task: "Integration test for [user journey] in tests/integration/test_[name].py"
# Launch all contract definitions for User Story 1 together:
Task: "Define [Entity1] Module Header & Contracts in src/models/[entity1].py"
Task: "Define [Entity2] Module Header & Contracts in src/models/[entity2].py"
# Launch all models for User Story 1 together:
Task: "Create [Entity1] model in src/models/[entity1].py"
Task: "Create [Entity2] model in src/models/[entity2].py"
```
---

212
README.md Normal file → Executable file

@@ -1,106 +1,106 @@
# Superset Automation Tools
## Overview
This repository contains Python scripts and a library (`superset_tool`) for automating Apache Superset tasks, such as:
- **Backup**: Exporting all dashboards from a Superset instance to local storage.
- **Migration**: Transferring and transforming dashboards between different Superset environments (e.g., Development, Sandbox, Production).
## Project Structure
- `backup_script.py`: The main script for scheduled backups of Superset dashboards.
- `migration_script.py`: The main script for migrating specific dashboards between environments, including overriding database connections.
- `search_script.py`: A script for searching data across all datasets available on the server.
- `run_mapper.py`: A CLI script for mapping dataset metadata.
- `superset_tool/`:
- `client.py`: A Python client for interacting with the Superset API.
- `exceptions.py`: Custom exception classes for structured error handling.
- `models.py`: Pydantic models for validating configuration data.
- `utils/`:
- `fileio.py`: File-system utilities (archive handling, YAML parsing).
- `logger.py`: Logger configuration for consistent logging across the project.
- `network.py`: An HTTP client for network requests with authentication and retry handling.
- `init_clients.py`: A utility for initializing Superset clients for different environments.
- `dataset_mapper.py`: Dataset metadata mapping logic.
## Setup
### Requirements
- Python 3.9+
- `pip` for package management.
- `keyring` for secure password storage.
### Installation
1. **Clone the repository:**
```bash
git clone https://prod.gitlab.dwh.rusal.com/dwh_bi/superset-tools.git
cd superset-tools
```
2. **Install the dependencies:**
```bash
pip install -r requirements.txt
```
(You may need to create a `requirements.txt` containing `pydantic`, `requests`, `keyring`, `PyYAML`, `urllib3`.)
3. **Configure passwords:**
Use `keyring` to store the passwords of the Superset API users.
```python
import keyring
keyring.set_password("system", "dev migrate", "migrate_user password")
keyring.set_password("system", "prod migrate", "migrate_user password")
keyring.set_password("system", "sandbox migrate", "migrate_user password")
```
## Usage
### Backup script (`backup_script.py`)
To back up dashboards from the configured Superset environments:
```bash
python backup_script.py
```
Backups are saved to `P:\Superset\010 Бекапы\` by default. Logs are stored in `P:\Superset\010 Бекапы\Logs`.
### Migration script (`migration_script.py`)
To migrate a specific dashboard:
```bash
python migration_script.py
```
### Search script (`search_script.py`)
To search for text patterns in Superset dataset metadata:
```bash
python search_script.py
```
The script uses regular expressions to search dataset fields such as SQL queries. Results are written to the log and printed to the console.
### Metadata mapping script (`run_mapper.py`)
To update dataset metadata (e.g., verbose names) in Superset:
```bash
python run_mapper.py --source <source_type> --dataset-id <dataset_id> [--table-name <table_name>] [--table-schema <table_schema>] [--excel-path <path_to_excel>] [--env <environment>]
```
If you use an XLSX file, it must contain two columns: column_name | verbose_name.
Parameters:
- `--source`: Data source ('postgres', 'excel', or 'both').
- `--dataset-id`: ID of the dataset to update.
- `--table-name`: Table name for PostgreSQL.
- `--table-schema`: Table schema for PostgreSQL.
- `--excel-path`: Path to the Excel file.
- `--env`: Superset environment ('dev', 'prod', etc.).
Usage examples:
```bash
python run_mapper.py --source postgres --dataset-id 123 --table-name account_debt --table-schema dm_view --env dev
python run_mapper.py --source=excel --dataset-id=286 --excel-path=H:\dev\ss-tools\286_map.xlsx --env=dev
```
## Logging
Logs are written to a file in the `Logs` directory (e.g., `P:\Superset\010 Бекапы\Logs` for backups) and printed to the console. The default logging level is `INFO`.
## Development and Contributing
- Follow the **Semantic Code Generation Protocol** (see `semantic_protocol.md`):
- All definitions are wrapped in `[DEF]...[/DEF]`.
- Contracts (`@PRE`, `@POST`) are defined BEFORE the implementation.
- Strict typing and immutability of architectural decisions.
- Follow the project Constitution (`.specify/memory/constitution.md`).
- Use `Pydantic` models for data validation.
- Implement comprehensive error handling with custom exceptions.
# Superset Automation Tools
## Overview
This repository contains Python scripts and a library (`superset_tool`) for automating Apache Superset tasks, such as:
- **Backup**: Exporting all dashboards from a Superset instance to local storage.
- **Migration**: Transferring and transforming dashboards between different Superset environments (e.g., Development, Sandbox, Production).
## Project Structure
- `backup_script.py`: The main script for scheduled backups of Superset dashboards.
- `migration_script.py`: The main script for migrating specific dashboards between environments, including overriding database connections.
- `search_script.py`: A script for searching data across all datasets available on the server.
- `run_mapper.py`: A CLI script for mapping dataset metadata.
- `superset_tool/`:
- `client.py`: A Python client for interacting with the Superset API.
- `exceptions.py`: Custom exception classes for structured error handling.
- `models.py`: Pydantic models for validating configuration data.
- `utils/`:
- `fileio.py`: File-system utilities (archive handling, YAML parsing).
- `logger.py`: Logger configuration for consistent logging across the project.
- `network.py`: An HTTP client for network requests with authentication and retry handling.
- `init_clients.py`: A utility for initializing Superset clients for different environments.
- `dataset_mapper.py`: Dataset metadata mapping logic.
## Setup
### Requirements
- Python 3.9+
- `pip` for package management.
- `keyring` for secure password storage.
### Installation
1. **Clone the repository:**
```bash
git clone https://prod.gitlab.dwh.rusal.com/dwh_bi/superset-tools.git
cd superset-tools
```
2. **Install the dependencies:**
```bash
pip install -r requirements.txt
```
(You may need to create a `requirements.txt` containing `pydantic`, `requests`, `keyring`, `PyYAML`, `urllib3`.)
3. **Configure passwords:**
Use `keyring` to store the passwords of the Superset API users.
```python
import keyring
keyring.set_password("system", "dev migrate", "migrate_user password")
keyring.set_password("system", "prod migrate", "migrate_user password")
keyring.set_password("system", "sandbox migrate", "migrate_user password")
```
## Usage
### Backup script (`backup_script.py`)
To back up dashboards from the configured Superset environments:
```bash
python backup_script.py
```
Backups are saved to `P:\Superset\010 Бекапы\` by default. Logs are stored in `P:\Superset\010 Бекапы\Logs`.
### Migration script (`migration_script.py`)
To migrate a specific dashboard:
```bash
python migration_script.py
```
### Search script (`search_script.py`)
To search for text patterns in Superset dataset metadata:
```bash
python search_script.py
```
The script uses regular expressions to search dataset fields such as SQL queries. Results are written to the log and printed to the console.
### Metadata mapping script (`run_mapper.py`)
To update dataset metadata (e.g., verbose names) in Superset:
```bash
python run_mapper.py --source <source_type> --dataset-id <dataset_id> [--table-name <table_name>] [--table-schema <table_schema>] [--excel-path <path_to_excel>] [--env <environment>]
```
If you use an XLSX file, it must contain two columns: column_name | verbose_name.
Parameters:
- `--source`: Data source ('postgres', 'excel', or 'both').
- `--dataset-id`: ID of the dataset to update.
- `--table-name`: Table name for PostgreSQL.
- `--table-schema`: Table schema for PostgreSQL.
- `--excel-path`: Path to the Excel file.
- `--env`: Superset environment ('dev', 'prod', etc.).
Usage examples:
```bash
python run_mapper.py --source postgres --dataset-id 123 --table-name account_debt --table-schema dm_view --env dev
python run_mapper.py --source=excel --dataset-id=286 --excel-path=H:\dev\ss-tools\286_map.xlsx --env=dev
```
## Logging
Logs are written to a file in the `Logs` directory (e.g., `P:\Superset\010 Бекапы\Logs` for backups) and printed to the console. The default logging level is `INFO`.
## Development and Contributing
- Follow the **Semantic Code Generation Protocol** (see `semantic_protocol.md`):
- All definitions are wrapped in `[DEF]...[/DEF]`.
- Contracts (`@PRE`, `@POST`) are defined BEFORE the implementation.
- Strict typing and immutability of architectural decisions.
- Follow the project Constitution (`.specify/memory/constitution.md`).
- Use `Pydantic` models for data validation.
- Implement comprehensive error handling with custom exceptions.
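As a shape-only illustration of the protocol (a hypothetical helper, not code from this repository), a definition wraps its contract tags around the implementation and logs the `Entry` → `Action` → `Coherence` transitions:
```python
import logging

logger = logging.getLogger("superset_tools_app")

# [DEF:normalize_schema_name:Function]
# @PURPOSE: Normalizes a schema name before it is sent to the Superset API.
# @PRE: isinstance(name, str) and len(name) > 0
# @POST: result is lower-case and has no surrounding whitespace
def normalize_schema_name(name: str) -> str:
    # 1. Runtime check of @PRE
    assert isinstance(name, str) and name, "name must be a non-empty string"
    logger.info("[normalize_schema_name][Entry] Normalizing schema name")
    # 2. Logic implementation
    result = name.strip().lower()
    logger.info("[normalize_schema_name][Action] Applied strip/lower")
    # 3. Runtime check of @POST
    assert result == result.strip().lower(), "result must be normalized"
    logger.info("[normalize_schema_name][Coherence:OK] Schema name normalized")
    return result
# [/DEF:normalize_schema_name]
```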

20
backend/requirements.txt Normal file → Executable file

@@ -1,9 +1,11 @@
fastapi
uvicorn
pydantic
authlib
python-multipart
starlette
jsonschema
requests
keyring
fastapi
uvicorn
pydantic
authlib
python-multipart
starlette
jsonschema
requests
keyring
httpx
PyYAML

102
backend/src/api/auth.py Normal file → Executable file

@@ -1,52 +1,52 @@
# [DEF:AuthModule:Module]
# @SEMANTICS: auth, authentication, adfs, oauth, middleware
# @PURPOSE: Implements ADFS authentication using Authlib for FastAPI. It provides a dependency to protect endpoints.
# @LAYER: UI (API)
# @RELATION: Used by API routers to protect endpoints that require authentication.
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2AuthorizationCodeBearer
from authlib.integrations.starlette_client import OAuth
from starlette.config import Config
# Placeholder for ADFS configuration. In a real app, this would come from a secure source.
# Create an in-memory .env file
from io import StringIO
config_data = StringIO("""
ADFS_CLIENT_ID=your-client-id
ADFS_CLIENT_SECRET=your-client-secret
ADFS_SERVER_METADATA_URL=https://your-adfs-server/.well-known/openid-configuration
""")
config = Config(config_data)
oauth = OAuth(config)
oauth.register(
name='adfs',
server_metadata_url=config('ADFS_SERVER_METADATA_URL'),
client_kwargs={'scope': 'openid profile email'}
)
oauth2_scheme = OAuth2AuthorizationCodeBearer(
authorizationUrl="https://your-adfs-server/adfs/oauth2/authorize",
tokenUrl="https://your-adfs-server/adfs/oauth2/token",
)
async def get_current_user(token: str = Depends(oauth2_scheme)):
"""
Dependency to get the current user from the ADFS token.
This is a placeholder and needs to be fully implemented.
"""
# In a real implementation, you would:
# 1. Validate the token with ADFS.
# 2. Fetch user information.
# 3. Create a user object.
# For now, we'll just check if a token exists.
if not token:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Not authenticated",
headers={"WWW-Authenticate": "Bearer"},
)
# A real implementation would return a user object.
return {"placeholder_user": "user@example.com"}
# [DEF:AuthModule:Module]
# @SEMANTICS: auth, authentication, adfs, oauth, middleware
# @PURPOSE: Implements ADFS authentication using Authlib for FastAPI. It provides a dependency to protect endpoints.
# @LAYER: UI (API)
# @RELATION: Used by API routers to protect endpoints that require authentication.
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2AuthorizationCodeBearer
from authlib.integrations.starlette_client import OAuth
from starlette.config import Config
# Placeholder for ADFS configuration. In a real app, this would come from a secure source.
# Create an in-memory .env file
from io import StringIO
config_data = StringIO("""
ADFS_CLIENT_ID=your-client-id
ADFS_CLIENT_SECRET=your-client-secret
ADFS_SERVER_METADATA_URL=https://your-adfs-server/.well-known/openid-configuration
""")
config = Config(config_data)
oauth = OAuth(config)
oauth.register(
name='adfs',
server_metadata_url=config('ADFS_SERVER_METADATA_URL'),
client_kwargs={'scope': 'openid profile email'}
)
oauth2_scheme = OAuth2AuthorizationCodeBearer(
authorizationUrl="https://your-adfs-server/adfs/oauth2/authorize",
tokenUrl="https://your-adfs-server/adfs/oauth2/token",
)
async def get_current_user(token: str = Depends(oauth2_scheme)):
"""
Dependency to get the current user from the ADFS token.
This is a placeholder and needs to be fully implemented.
"""
# In a real implementation, you would:
# 1. Validate the token with ADFS.
# 2. Fetch user information.
# 3. Create a user object.
# For now, we'll just check if a token exists.
if not token:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Not authenticated",
headers={"WWW-Authenticate": "Bearer"},
)
# A real implementation would return a user object.
return {"placeholder_user": "user@example.com"}
# [/DEF]

backend/src/api/routes/__init__.py

@@ -0,0 +1 @@
from . import plugins, tasks, settings

42
backend/src/api/routes/plugins.py Normal file → Executable file

@@ -1,22 +1,22 @@
# [DEF:PluginsRouter:Module]
# @SEMANTICS: api, router, plugins, list
# @PURPOSE: Defines the FastAPI router for plugin-related endpoints, allowing clients to list available plugins.
# @LAYER: UI (API)
# @RELATION: Depends on the PluginLoader and PluginConfig. It is included by the main app.
from typing import List
from fastapi import APIRouter, Depends
from ...core.plugin_base import PluginConfig
from ...dependencies import get_plugin_loader
router = APIRouter()
@router.get("/", response_model=List[PluginConfig])
async def list_plugins(
plugin_loader = Depends(get_plugin_loader)
):
"""
Retrieve a list of all available plugins.
"""
return plugin_loader.get_all_plugin_configs()
# [DEF:PluginsRouter:Module]
# @SEMANTICS: api, router, plugins, list
# @PURPOSE: Defines the FastAPI router for plugin-related endpoints, allowing clients to list available plugins.
# @LAYER: UI (API)
# @RELATION: Depends on the PluginLoader and PluginConfig. It is included by the main app.
from typing import List
from fastapi import APIRouter, Depends
from ...core.plugin_base import PluginConfig
from ...dependencies import get_plugin_loader
router = APIRouter()
@router.get("/", response_model=List[PluginConfig])
async def list_plugins(
plugin_loader = Depends(get_plugin_loader)
):
"""
Retrieve a list of all available plugins.
"""
return plugin_loader.get_all_plugin_configs()
# [/DEF]

backend/src/api/routes/settings.py

@@ -0,0 +1,185 @@
# [DEF:SettingsRouter:Module]
#
# @SEMANTICS: settings, api, router, fastapi
# @PURPOSE: Provides API endpoints for managing application settings and Superset environments.
# @LAYER: UI (API)
# @RELATION: DEPENDS_ON -> ConfigManager
# @RELATION: DEPENDS_ON -> ConfigModels
#
# @INVARIANT: All settings changes must be persisted via ConfigManager.
# @PUBLIC_API: router
# [SECTION: IMPORTS]
from fastapi import APIRouter, Depends, HTTPException
from typing import List
from ...core.config_models import AppConfig, Environment, GlobalSettings
from ...dependencies import get_config_manager
from ...core.config_manager import ConfigManager
from ...core.logger import logger
from superset_tool.client import SupersetClient
from superset_tool.models import SupersetConfig
import os
# [/SECTION]
router = APIRouter()
# [DEF:get_settings:Function]
# @PURPOSE: Retrieves all application settings.
# @RETURN: AppConfig - The current configuration.
@router.get("/", response_model=AppConfig)
async def get_settings(config_manager: ConfigManager = Depends(get_config_manager)):
logger.info("[get_settings][Entry] Fetching all settings")
config = config_manager.get_config().copy(deep=True)
# Mask passwords
for env in config.environments:
if env.password:
env.password = "********"
return config
# [/DEF:get_settings]
# [DEF:update_global_settings:Function]
# @PURPOSE: Updates global application settings.
# @PARAM: settings (GlobalSettings) - The new global settings.
# @RETURN: GlobalSettings - The updated settings.
@router.patch("/global", response_model=GlobalSettings)
async def update_global_settings(
settings: GlobalSettings,
config_manager: ConfigManager = Depends(get_config_manager)
):
logger.info("[update_global_settings][Entry] Updating global settings")
config_manager.update_global_settings(settings)
return settings
# [/DEF:update_global_settings]
# [DEF:get_environments:Function]
# @PURPOSE: Lists all configured Superset environments.
# @RETURN: List[Environment] - List of environments.
@router.get("/environments", response_model=List[Environment])
async def get_environments(config_manager: ConfigManager = Depends(get_config_manager)):
logger.info("[get_environments][Entry] Fetching environments")
return config_manager.get_environments()
# [/DEF:get_environments]
# [DEF:add_environment:Function]
# @PURPOSE: Adds a new Superset environment.
# @PARAM: env (Environment) - The environment to add.
# @RETURN: Environment - The added environment.
@router.post("/environments", response_model=Environment)
async def add_environment(
env: Environment,
config_manager: ConfigManager = Depends(get_config_manager)
):
logger.info(f"[add_environment][Entry] Adding environment {env.id}")
config_manager.add_environment(env)
return env
# [/DEF:add_environment]
# [DEF:update_environment:Function]
# @PURPOSE: Updates an existing Superset environment.
# @PARAM: id (str) - The ID of the environment to update.
# @PARAM: env (Environment) - The updated environment data.
# @RETURN: Environment - The updated environment.
@router.put("/environments/{id}", response_model=Environment)
async def update_environment(
id: str,
env: Environment,
config_manager: ConfigManager = Depends(get_config_manager)
):
logger.info(f"[update_environment][Entry] Updating environment {id}")
if config_manager.update_environment(id, env):
return env
raise HTTPException(status_code=404, detail=f"Environment {id} not found")
# [/DEF:update_environment]
# [DEF:delete_environment:Function]
# @PURPOSE: Deletes a Superset environment.
# @PARAM: id (str) - The ID of the environment to delete.
@router.delete("/environments/{id}")
async def delete_environment(
id: str,
config_manager: ConfigManager = Depends(get_config_manager)
):
logger.info(f"[delete_environment][Entry] Deleting environment {id}")
config_manager.delete_environment(id)
return {"message": f"Environment {id} deleted"}
# [/DEF:delete_environment]
# [DEF:test_environment_connection:Function]
# @PURPOSE: Tests the connection to a Superset environment.
# @PARAM: id (str) - The ID of the environment to test.
# @RETURN: dict - Success message or error.
@router.post("/environments/{id}/test")
async def test_environment_connection(
id: str,
config_manager: ConfigManager = Depends(get_config_manager)
):
logger.info(f"[test_environment_connection][Entry] Testing environment {id}")
# Find environment
env = next((e for e in config_manager.get_environments() if e.id == id), None)
if not env:
raise HTTPException(status_code=404, detail=f"Environment {id} not found")
try:
# Create SupersetConfig
# Note: SupersetConfig expects 'auth' dict with specific keys
superset_config = SupersetConfig(
env=env.name,
base_url=env.url,
auth={
"provider": "db", # Defaulting to db for now
"username": env.username,
"password": env.password,
"refresh": "true"
}
)
# Initialize client (this will trigger authentication)
client = SupersetClient(config=superset_config)
# Try a simple request to verify
client.get_dashboards(query={"page_size": 1})
logger.info(f"[test_environment_connection][Coherence:OK] Connection successful for {id}")
return {"status": "success", "message": "Connection successful"}
except Exception as e:
logger.error(f"[test_environment_connection][Coherence:Failed] Connection failed for {id}: {e}")
return {"status": "error", "message": str(e)}
# [/DEF:test_environment_connection]
# [DEF:validate_backup_path:Function]
# @PURPOSE: Validates if a backup path exists and is writable.
# @PARAM: path (str) - The path to validate.
# @RETURN: dict - Validation result.
@router.post("/validate-path")
async def validate_backup_path(path_data: dict):
path = path_data.get("path")
if not path:
raise HTTPException(status_code=400, detail="Path is required")
logger.info(f"[validate_backup_path][Entry] Validating path: {path}")
p = os.path.abspath(path)
exists = os.path.exists(p)
writable = os.access(p, os.W_OK) if exists else os.access(os.path.dirname(p), os.W_OK)
if not exists:
# Try to create it
try:
os.makedirs(p, exist_ok=True)
exists = True
writable = os.access(p, os.W_OK)
logger.info(f"[validate_backup_path][Action] Created directory: {p}")
except Exception as e:
logger.error(f"[validate_backup_path][Coherence:Failed] Failed to create directory: {e}")
return {"status": "error", "message": f"Path does not exist and could not be created: {e}"}
if not writable:
logger.warning(f"[validate_backup_path][Coherence:Failed] Path not writable: {p}")
return {"status": "error", "message": "Path is not writable"}
logger.info(f"[validate_backup_path][Coherence:OK] Path valid: {p}")
return {"status": "success", "message": "Path is valid and writable"}
# [/DEF:validate_backup_path]
# [/DEF:SettingsRouter]
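Combined with the `prefix="/settings"` registration in app.py, the router can be exercised end to end. The snippet that follows is a rough usage sketch rather than project code: it assumes a backend running locally on port 8000, uses `httpx` (already listed in backend/requirements.txt), and the environment values are made up.
```python
import httpx

BASE = "http://localhost:8000/settings"

# Register a Superset environment (field names follow the Environment model).
env = {
    "id": "dev",
    "name": "Development",
    "url": "https://superset-dev.example.com",
    "username": "migrate_user",
    "password": "secret",
    "is_default": True,
}
httpx.post(f"{BASE}/environments", json=env).raise_for_status()

# Ask the backend to verify the stored credentials against Superset.
result = httpx.post(f"{BASE}/environments/dev/test").json()
print(result["status"], result["message"])

# Passwords come back masked when the full configuration is read.
config = httpx.get(f"{BASE}/").json()
print(config["environments"][0]["password"])  # "********"
```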

112
backend/src/api/routes/tasks.py Normal file → Executable file

@@ -1,57 +1,57 @@
# [DEF:TasksRouter:Module]
# @SEMANTICS: api, router, tasks, create, list, get
# @PURPOSE: Defines the FastAPI router for task-related endpoints, allowing clients to create, list, and get the status of tasks.
# @LAYER: UI (API)
# @RELATION: Depends on the TaskManager. It is included by the main app.
from typing import List, Dict, Any
from fastapi import APIRouter, Depends, HTTPException, status
from pydantic import BaseModel
from ...core.task_manager import TaskManager, Task
from ...dependencies import get_task_manager
router = APIRouter()
class CreateTaskRequest(BaseModel):
plugin_id: str
params: Dict[str, Any]
@router.post("/", response_model=Task, status_code=status.HTTP_201_CREATED)
async def create_task(
request: CreateTaskRequest,
task_manager: TaskManager = Depends(get_task_manager)
):
"""
Create and start a new task for a given plugin.
"""
try:
task = await task_manager.create_task(
plugin_id=request.plugin_id,
params=request.params
)
return task
except ValueError as e:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
@router.get("/", response_model=List[Task])
async def list_tasks(
task_manager: TaskManager = Depends(get_task_manager)
):
"""
Retrieve a list of all tasks.
"""
return task_manager.get_all_tasks()
@router.get("/{task_id}", response_model=Task)
async def get_task(
task_id: str,
task_manager: TaskManager = Depends(get_task_manager)
):
"""
Retrieve the details of a specific task.
"""
task = task_manager.get_task(task_id)
if not task:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Task not found")
return task
# [DEF:TasksRouter:Module]
# @SEMANTICS: api, router, tasks, create, list, get
# @PURPOSE: Defines the FastAPI router for task-related endpoints, allowing clients to create, list, and get the status of tasks.
# @LAYER: UI (API)
# @RELATION: Depends on the TaskManager. It is included by the main app.
from typing import List, Dict, Any
from fastapi import APIRouter, Depends, HTTPException, status
from pydantic import BaseModel
from ...core.task_manager import TaskManager, Task
from ...dependencies import get_task_manager
router = APIRouter()
class CreateTaskRequest(BaseModel):
plugin_id: str
params: Dict[str, Any]
@router.post("/", response_model=Task, status_code=status.HTTP_201_CREATED)
async def create_task(
request: CreateTaskRequest,
task_manager: TaskManager = Depends(get_task_manager)
):
"""
Create and start a new task for a given plugin.
"""
try:
task = await task_manager.create_task(
plugin_id=request.plugin_id,
params=request.params
)
return task
except ValueError as e:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
@router.get("/", response_model=List[Task])
async def list_tasks(
task_manager: TaskManager = Depends(get_task_manager)
):
"""
Retrieve a list of all tasks.
"""
return task_manager.get_all_tasks()
@router.get("/{task_id}", response_model=Task)
async def get_task(
task_id: str,
task_manager: TaskManager = Depends(get_task_manager)
):
"""
Retrieve the details of a specific task.
"""
task = task_manager.get_task(task_id)
if not task:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Task not found")
return task
# [/DEF]
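A rough client-side sketch of the task lifecycle (not project code): it assumes a backend running locally, uses an illustrative plugin id, and guesses the `id` and `status` fields of the `Task` model, which is defined in task_manager.py outside this excerpt.
```python
import time
import httpx

BASE = "http://localhost:8000/tasks"

# Create a task for a plugin; the plugin id is purely illustrative.
task = httpx.post(f"{BASE}/", json={"plugin_id": "dashboard_backup", "params": {}}).json()

# Poll the task until it reaches what we assume is a terminal status.
while task.get("status") not in ("completed", "failed"):
    time.sleep(1)
    task = httpx.get(f"{BASE}/{task['id']}").json()
print(task)
```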

155
backend/src/app.py Normal file → Executable file

@@ -1,77 +1,78 @@
# [DEF:AppModule:Module]
# @SEMANTICS: app, main, entrypoint, fastapi
# @PURPOSE: The main entry point for the FastAPI application. It initializes the app, configures CORS, sets up dependencies, includes API routers, and defines the WebSocket endpoint for log streaming.
# @LAYER: UI (API)
# @RELATION: Depends on the dependency module and API route modules.
import sys
from pathlib import Path
# Add project root to sys.path to allow importing superset_tool
# Assuming app.py is in backend/src/
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.append(str(project_root))
from fastapi import FastAPI, WebSocket, WebSocketDisconnect, Depends
from fastapi.middleware.cors import CORSMiddleware
import asyncio
from .dependencies import get_task_manager
from .core.logger import logger
from .api.routes import plugins, tasks
# [DEF:App:Global]
# @SEMANTICS: app, fastapi, instance
# @PURPOSE: The global FastAPI application instance.
app = FastAPI(
title="Superset Tools API",
description="API for managing Superset automation tools and plugins.",
version="1.0.0",
)
# Configure CORS
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Adjust this in production
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Include API routes
app.include_router(plugins.router, prefix="/plugins", tags=["Plugins"])
app.include_router(tasks.router, prefix="/tasks", tags=["Tasks"])
# [DEF:WebSocketEndpoint:Endpoint]
# @SEMANTICS: websocket, logs, streaming, real-time
# @PURPOSE: Provides a WebSocket endpoint for clients to connect to and receive real-time log entries for a specific task.
@app.websocket("/ws/logs/{task_id}")
async def websocket_endpoint(websocket: WebSocket, task_id: str, task_manager=Depends(get_task_manager)):
await websocket.accept()
logger.info(f"WebSocket connection established for task {task_id}")
try:
# Send initial logs if any
initial_logs = task_manager.get_task_logs(task_id)
for log_entry in initial_logs:
await websocket.send_json(log_entry.dict())
# Keep connection alive, ideally stream new logs as they come
# This part requires a more sophisticated log streaming mechanism (e.g., queues, pub/sub)
# For now, it will just keep the connection open and send initial logs.
while True:
await asyncio.sleep(1) # Keep connection alive, send heartbeat or check for new logs
# In a real system, new logs would be pushed here
except WebSocketDisconnect:
logger.info(f"WebSocket connection disconnected for task {task_id}")
except Exception as e:
logger.error(f"WebSocket error for task {task_id}: {e}")
# [/DEF]
# [DEF:RootEndpoint:Endpoint]
# @SEMANTICS: root, healthcheck
# @PURPOSE: A simple root endpoint to confirm that the API is running.
@app.get("/")
async def read_root():
return {"message": "Superset Tools API is running"}
# [/DEF]
# [DEF:AppModule:Module]
# @SEMANTICS: app, main, entrypoint, fastapi
# @PURPOSE: The main entry point for the FastAPI application. It initializes the app, configures CORS, sets up dependencies, includes API routers, and defines the WebSocket endpoint for log streaming.
# @LAYER: UI (API)
# @RELATION: Depends on the dependency module and API route modules.
import sys
from pathlib import Path
# Add project root to sys.path to allow importing superset_tool
# Assuming app.py is in backend/src/
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.append(str(project_root))
from fastapi import FastAPI, WebSocket, WebSocketDisconnect, Depends
from fastapi.middleware.cors import CORSMiddleware
import asyncio
from .dependencies import get_task_manager
from .core.logger import logger
from .api.routes import plugins, tasks, settings
# [DEF:App:Global]
# @SEMANTICS: app, fastapi, instance
# @PURPOSE: The global FastAPI application instance.
app = FastAPI(
title="Superset Tools API",
description="API for managing Superset automation tools and plugins.",
version="1.0.0",
)
# Configure CORS
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Adjust this in production
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Include API routes
app.include_router(plugins.router, prefix="/plugins", tags=["Plugins"])
app.include_router(tasks.router, prefix="/tasks", tags=["Tasks"])
app.include_router(settings.router, prefix="/settings", tags=["Settings"])
# [DEF:WebSocketEndpoint:Endpoint]
# @SEMANTICS: websocket, logs, streaming, real-time
# @PURPOSE: Provides a WebSocket endpoint for clients to connect to and receive real-time log entries for a specific task.
@app.websocket("/ws/logs/{task_id}")
async def websocket_endpoint(websocket: WebSocket, task_id: str, task_manager=Depends(get_task_manager)):
await websocket.accept()
logger.info(f"WebSocket connection established for task {task_id}")
try:
# Send initial logs if any
initial_logs = task_manager.get_task_logs(task_id)
for log_entry in initial_logs:
await websocket.send_json(log_entry.dict())
# Keep connection alive, ideally stream new logs as they come
# This part requires a more sophisticated log streaming mechanism (e.g., queues, pub/sub)
# For now, it will just keep the connection open and send initial logs.
while True:
await asyncio.sleep(1) # Keep connection alive, send heartbeat or check for new logs
# In a real system, new logs would be pushed here
except WebSocketDisconnect:
logger.info(f"WebSocket connection disconnected for task {task_id}")
except Exception as e:
logger.error(f"WebSocket error for task {task_id}: {e}")
# [/DEF]
# [DEF:RootEndpoint:Endpoint]
# @SEMANTICS: root, healthcheck
# @PURPOSE: A simple root endpoint to confirm that the API is running.
@app.get("/")
async def read_root():
return {"message": "Superset Tools API is running"}
# [/DEF]
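The WebSocket loop above only replays the buffered entries and then sleeps; the comments point at a queue or pub/sub mechanism as the missing piece. Below is a sketch of how the endpoint body could be reshaped around a per-task `asyncio.Queue`. It is illustrative only: it reuses `app`, `Depends`, `WebSocket`, and `get_task_manager` from this module, and the `subscribe`/`unsubscribe` methods are hypothetical and do not exist on TaskManager in this commit.
```python
@app.websocket("/ws/logs/{task_id}")
async def websocket_endpoint(websocket: WebSocket, task_id: str, task_manager=Depends(get_task_manager)):
    await websocket.accept()
    queue: asyncio.Queue = task_manager.subscribe(task_id)  # hypothetical: returns a per-task queue
    try:
        # Replay anything that was logged before the client connected.
        for log_entry in task_manager.get_task_logs(task_id):
            await websocket.send_json(log_entry.dict())
        # Then block on the queue and push entries as the task manager publishes them.
        while True:
            log_entry = await queue.get()
            await websocket.send_json(log_entry.dict())
    except WebSocketDisconnect:
        logger.info(f"WebSocket connection disconnected for task {task_id}")
    finally:
        task_manager.unsubscribe(task_id, queue)  # hypothetical: stop publishing to this client
```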

backend/src/core/config_manager.py

@@ -0,0 +1,205 @@
# [DEF:ConfigManagerModule:Module]
#
# @SEMANTICS: config, manager, persistence, json
# @PURPOSE: Manages application configuration, including loading/saving to JSON and CRUD for environments.
# @LAYER: Core
# @RELATION: DEPENDS_ON -> ConfigModels
# @RELATION: CALLS -> logger
# @RELATION: WRITES_TO -> config.json
#
# @INVARIANT: Configuration must always be valid according to AppConfig model.
# @PUBLIC_API: ConfigManager
# [SECTION: IMPORTS]
import json
import os
from pathlib import Path
from typing import Optional, List
from .config_models import AppConfig, Environment, GlobalSettings
from .logger import logger
# [/SECTION]
# [DEF:ConfigManager:Class]
# @PURPOSE: A class to handle application configuration persistence and management.
# @RELATION: WRITES_TO -> config.json
class ConfigManager:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the ConfigManager.
# @PRE: isinstance(config_path, str) and len(config_path) > 0
# @POST: self.config is an instance of AppConfig
# @PARAM: config_path (str) - Path to the configuration file.
def __init__(self, config_path: str = "config.json"):
# 1. Runtime check of @PRE
assert isinstance(config_path, str) and config_path, "config_path must be a non-empty string"
logger.info(f"[ConfigManager][Entry] Initializing with {config_path}")
# 2. Logic implementation
self.config_path = Path(config_path)
self.config: AppConfig = self._load_config()
# 3. Runtime check of @POST
assert isinstance(self.config, AppConfig), "self.config must be an instance of AppConfig"
logger.info(f"[ConfigManager][Exit] Initialized")
# [/DEF:__init__]
# [DEF:_load_config:Function]
# @PURPOSE: Loads the configuration from disk or creates a default one.
# @POST: isinstance(return, AppConfig)
# @RETURN: AppConfig - The loaded or default configuration.
def _load_config(self) -> AppConfig:
logger.debug(f"[_load_config][Entry] Loading from {self.config_path}")
if not self.config_path.exists():
logger.info(f"[_load_config][Action] Config file not found. Creating default.")
default_config = AppConfig(
environments=[],
settings=GlobalSettings(backup_path="backups")
)
self._save_config_to_disk(default_config)
return default_config
try:
with open(self.config_path, "r") as f:
data = json.load(f)
config = AppConfig(**data)
logger.info(f"[_load_config][Coherence:OK] Configuration loaded")
return config
except Exception as e:
logger.error(f"[_load_config][Coherence:Failed] Error loading config: {e}")
return AppConfig(
environments=[],
settings=GlobalSettings(backup_path="backups")
)
# [/DEF:_load_config]
# [DEF:_save_config_to_disk:Function]
# @PURPOSE: Saves the provided configuration object to disk.
# @PRE: isinstance(config, AppConfig)
# @PARAM: config (AppConfig) - The configuration to save.
def _save_config_to_disk(self, config: AppConfig):
logger.debug(f"[_save_config_to_disk][Entry] Saving to {self.config_path}")
# 1. Runtime check of @PRE
assert isinstance(config, AppConfig), "config must be an instance of AppConfig"
# 2. Logic implementation
try:
with open(self.config_path, "w") as f:
json.dump(config.dict(), f, indent=4)
logger.info(f"[_save_config_to_disk][Action] Configuration saved")
except Exception as e:
logger.error(f"[_save_config_to_disk][Coherence:Failed] Failed to save: {e}")
# [/DEF:_save_config_to_disk]
# [DEF:save:Function]
# @PURPOSE: Saves the current configuration state to disk.
def save(self):
self._save_config_to_disk(self.config)
# [/DEF:save]
# [DEF:get_config:Function]
# @PURPOSE: Returns the current configuration.
# @RETURN: AppConfig - The current configuration.
def get_config(self) -> AppConfig:
return self.config
# [/DEF:get_config]
# [DEF:update_global_settings:Function]
# @PURPOSE: Updates the global settings and persists the change.
# @PRE: isinstance(settings, GlobalSettings)
# @PARAM: settings (GlobalSettings) - The new global settings.
def update_global_settings(self, settings: GlobalSettings):
logger.info(f"[update_global_settings][Entry] Updating settings")
# 1. Runtime check of @PRE
assert isinstance(settings, GlobalSettings), "settings must be an instance of GlobalSettings"
# 2. Logic implementation
self.config.settings = settings
self.save()
logger.info(f"[update_global_settings][Exit] Settings updated")
# [/DEF:update_global_settings]
# [DEF:get_environments:Function]
# @PURPOSE: Returns the list of configured environments.
# @RETURN: List[Environment] - List of environments.
def get_environments(self) -> List[Environment]:
return self.config.environments
# [/DEF:get_environments]
# [DEF:add_environment:Function]
# @PURPOSE: Adds a new environment to the configuration.
# @PRE: isinstance(env, Environment)
# @PARAM: env (Environment) - The environment to add.
def add_environment(self, env: Environment):
logger.info(f"[add_environment][Entry] Adding environment {env.id}")
# 1. Runtime check of @PRE
assert isinstance(env, Environment), "env must be an instance of Environment"
# 2. Logic implementation
# Check for duplicate ID and remove if exists
self.config.environments = [e for e in self.config.environments if e.id != env.id]
self.config.environments.append(env)
self.save()
logger.info(f"[add_environment][Exit] Environment added")
# [/DEF:add_environment]
# [DEF:update_environment:Function]
# @PURPOSE: Updates an existing environment.
# @PRE: isinstance(env_id, str) and len(env_id) > 0 and isinstance(updated_env, Environment)
# @PARAM: env_id (str) - The ID of the environment to update.
# @PARAM: updated_env (Environment) - The updated environment data.
# @RETURN: bool - True if updated, False otherwise.
def update_environment(self, env_id: str, updated_env: Environment) -> bool:
logger.info(f"[update_environment][Entry] Updating {env_id}")
# 1. Runtime check of @PRE
assert env_id and isinstance(env_id, str), "env_id must be a non-empty string"
assert isinstance(updated_env, Environment), "updated_env must be an instance of Environment"
# 2. Logic implementation
for i, env in enumerate(self.config.environments):
if env.id == env_id:
# If password is masked, keep the old one
if updated_env.password == "********":
updated_env.password = env.password
self.config.environments[i] = updated_env
self.save()
logger.info(f"[update_environment][Coherence:OK] Updated {env_id}")
return True
logger.warning(f"[update_environment][Coherence:Failed] Environment {env_id} not found")
return False
# [/DEF:update_environment]
# [DEF:delete_environment:Function]
# @PURPOSE: Deletes an environment by ID.
# @PRE: isinstance(env_id, str) and len(env_id) > 0
# @PARAM: env_id (str) - The ID of the environment to delete.
def delete_environment(self, env_id: str):
logger.info(f"[delete_environment][Entry] Deleting {env_id}")
# 1. Runtime check of @PRE
assert env_id and isinstance(env_id, str), "env_id must be a non-empty string"
# 2. Logic implementation
original_count = len(self.config.environments)
self.config.environments = [e for e in self.config.environments if e.id != env_id]
if len(self.config.environments) < original_count:
self.save()
logger.info(f"[delete_environment][Action] Deleted {env_id}")
else:
logger.warning(f"[delete_environment][Coherence:Failed] Environment {env_id} not found")
# [/DEF:delete_environment]
# [/DEF:ConfigManager]
# [/DEF:ConfigManagerModule]
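A usage sketch (not part of the commit) driving ConfigManager directly; the import path assumes the snippet runs from the repository root so that `backend.src` resolves as a package, and all values are made up.
```python
from backend.src.core.config_manager import ConfigManager
from backend.src.core.config_models import Environment, GlobalSettings

manager = ConfigManager(config_path="config.json")

manager.add_environment(Environment(
    id="dev",
    name="Development",
    url="https://superset-dev.example.com",
    username="migrate_user",
    password="secret",
))
manager.update_global_settings(GlobalSettings(backup_path="backups", default_environment_id="dev"))

# When the UI sends the masked placeholder back, the stored secret is preserved.
masked = Environment(
    id="dev",
    name="Development (renamed)",
    url="https://superset-dev.example.com",
    username="migrate_user",
    password="********",
)
manager.update_environment("dev", masked)
assert manager.get_environments()[0].password == "secret"
```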

backend/src/core/config_models.py

@@ -0,0 +1,36 @@
# [DEF:ConfigModels:Module]
# @SEMANTICS: config, models, pydantic
# @PURPOSE: Defines the data models for application configuration using Pydantic.
# @LAYER: Core
# @RELATION: READS_FROM -> config.json
# @RELATION: USED_BY -> ConfigManager
from pydantic import BaseModel, Field
from typing import List, Optional
# [DEF:Environment:DataClass]
# @PURPOSE: Represents a Superset environment configuration.
class Environment(BaseModel):
id: str
name: str
url: str
username: str
password: str # Will be masked in UI
is_default: bool = False
# [/DEF:Environment]
# [DEF:GlobalSettings:DataClass]
# @PURPOSE: Represents global application settings.
class GlobalSettings(BaseModel):
backup_path: str
default_environment_id: Optional[str] = None
# [/DEF:GlobalSettings]
# [DEF:AppConfig:DataClass]
# @PURPOSE: The root configuration model containing all application settings.
class AppConfig(BaseModel):
environments: List[Environment] = []
settings: GlobalSettings
# [/DEF:AppConfig]
# [/DEF:ConfigModels]
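For reference, the config.json document that ConfigManager persists for these models has the shape produced by this illustrative snippet (all values are made up):
```python
from backend.src.core.config_models import AppConfig, Environment, GlobalSettings

config = AppConfig(
    environments=[
        Environment(id="dev", name="Development", url="https://superset-dev.example.com",
                    username="migrate_user", password="secret", is_default=True),
    ],
    settings=GlobalSettings(backup_path="backups", default_environment_id="dev"),
)
print(config.json(indent=4))
# {
#     "environments": [{"id": "dev", ..., "is_default": true}],
#     "settings": {"backup_path": "backups", "default_environment_id": "dev"}
# }
```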

182
backend/src/core/logger.py Normal file → Executable file

@@ -1,92 +1,92 @@
# [DEF:LoggerModule:Module]
# @SEMANTICS: logging, websocket, streaming, handler
# @PURPOSE: Configures the application's logging system, including a custom handler for buffering logs and streaming them over WebSockets.
# @LAYER: Core
# @RELATION: Used by the main application and other modules to log events. The WebSocketLogHandler is used by the WebSocket endpoint in app.py.
import logging
from datetime import datetime
from typing import Dict, Any, List, Optional
from collections import deque
from pydantic import BaseModel, Field
# Re-using LogEntry from task_manager for consistency
# [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic
# @PURPOSE: A Pydantic model representing a single, structured log entry. This is a re-definition for consistency, as it's also defined in task_manager.py.
class LogEntry(BaseModel):
timestamp: datetime = Field(default_factory=datetime.utcnow)
level: str
message: str
context: Optional[Dict[str, Any]] = None
# [/DEF]
# [DEF:WebSocketLogHandler:Class]
# @SEMANTICS: logging, handler, websocket, buffer
# @PURPOSE: A custom logging handler that captures log records into a buffer. It is designed to be extended for real-time log streaming over WebSockets.
class WebSocketLogHandler(logging.Handler):
"""
A logging handler that stores log records and can be extended to send them
over WebSockets.
"""
def __init__(self, capacity: int = 1000):
super().__init__()
self.log_buffer: deque[LogEntry] = deque(maxlen=capacity)
# In a real implementation, you'd have a way to manage active WebSocket connections
# e.g., self.active_connections: Set[WebSocket] = set()
def emit(self, record: logging.LogRecord):
try:
log_entry = LogEntry(
level=record.levelname,
message=self.format(record),
context={
"name": record.name,
"pathname": record.pathname,
"lineno": record.lineno,
"funcName": record.funcName,
"process": record.process,
"thread": record.thread,
}
)
self.log_buffer.append(log_entry)
# Here you would typically send the log_entry to all active WebSocket connections
# for real-time streaming to the frontend.
# Example: for ws in self.active_connections: await ws.send_json(log_entry.dict())
except Exception:
self.handleError(record)
def get_recent_logs(self) -> List[LogEntry]:
"""
Returns a list of recent log entries from the buffer.
"""
return list(self.log_buffer)
# [/DEF]
# [DEF:Logger:Global]
# @SEMANTICS: logger, global, instance
# @PURPOSE: The global logger instance for the application, configured with both a console handler and the custom WebSocket handler.
logger = logging.getLogger("superset_tools_app")
logger.setLevel(logging.INFO)
# Create a formatter
formatter = logging.Formatter(
'[%(asctime)s][%(levelname)s][%(name)s] %(message)s'
)
# Add console handler
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logger.addHandler(console_handler)
# Add WebSocket log handler
websocket_log_handler = WebSocketLogHandler()
websocket_log_handler.setFormatter(formatter)
logger.addHandler(websocket_log_handler)
# Example usage:
# logger.info("Application started", extra={"context_key": "context_value"})
# logger.error("An error occurred", exc_info=True)
# [DEF:LoggerModule:Module]
# @SEMANTICS: logging, websocket, streaming, handler
# @PURPOSE: Configures the application's logging system, including a custom handler for buffering logs and streaming them over WebSockets.
# @LAYER: Core
# @RELATION: Used by the main application and other modules to log events. The WebSocketLogHandler is used by the WebSocket endpoint in app.py.
import logging
from datetime import datetime
from typing import Dict, Any, List, Optional
from collections import deque
from pydantic import BaseModel, Field
# Re-using LogEntry from task_manager for consistency
# [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic
# @PURPOSE: A Pydantic model representing a single, structured log entry. This is a re-definition for consistency, as it's also defined in task_manager.py.
class LogEntry(BaseModel):
timestamp: datetime = Field(default_factory=datetime.utcnow)
level: str
message: str
context: Optional[Dict[str, Any]] = None
# [/DEF]
# [DEF:WebSocketLogHandler:Class]
# @SEMANTICS: logging, handler, websocket, buffer
# @PURPOSE: A custom logging handler that captures log records into a buffer. It is designed to be extended for real-time log streaming over WebSockets.
class WebSocketLogHandler(logging.Handler):
"""
A logging handler that stores log records and can be extended to send them
over WebSockets.
"""
def __init__(self, capacity: int = 1000):
super().__init__()
self.log_buffer: deque[LogEntry] = deque(maxlen=capacity)
# In a real implementation, you'd have a way to manage active WebSocket connections
# e.g., self.active_connections: Set[WebSocket] = set()
def emit(self, record: logging.LogRecord):
try:
log_entry = LogEntry(
level=record.levelname,
message=self.format(record),
context={
"name": record.name,
"pathname": record.pathname,
"lineno": record.lineno,
"funcName": record.funcName,
"process": record.process,
"thread": record.thread,
}
)
self.log_buffer.append(log_entry)
# Here you would typically send the log_entry to all active WebSocket connections
# for real-time streaming to the frontend.
# Example: for ws in self.active_connections: await ws.send_json(log_entry.dict())
except Exception:
self.handleError(record)
def get_recent_logs(self) -> List[LogEntry]:
"""
Returns a list of recent log entries from the buffer.
"""
return list(self.log_buffer)
# [/DEF]
# [DEF:Logger:Global]
# @SEMANTICS: logger, global, instance
# @PURPOSE: The global logger instance for the application, configured with both a console handler and the custom WebSocket handler.
logger = logging.getLogger("superset_tools_app")
logger.setLevel(logging.INFO)
# Create a formatter
formatter = logging.Formatter(
'[%(asctime)s][%(levelname)s][%(name)s] %(message)s'
)
# Add console handler
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logger.addHandler(console_handler)
# Add WebSocket log handler
websocket_log_handler = WebSocketLogHandler()
websocket_log_handler.setFormatter(formatter)
logger.addHandler(websocket_log_handler)
# Example usage:
# logger.info("Application started", extra={"context_key": "context_value"})
# logger.error("An error occurred", exc_info=True)
# [/DEF]

140
backend/src/core/plugin_base.py Normal file → Executable file

@@ -1,71 +1,71 @@
from abc import ABC, abstractmethod
from typing import Dict, Any
from pydantic import BaseModel, Field
# [DEF:PluginBase:Class]
# @SEMANTICS: plugin, interface, base, abstract
# @PURPOSE: Defines the abstract base class that all plugins must implement to be recognized by the system. It enforces a common structure for plugin metadata and execution.
# @LAYER: Core
# @RELATION: Used by PluginLoader to identify valid plugins.
# @INVARIANT: All plugins MUST inherit from this class.
class PluginBase(ABC):
"""
Base class for all plugins.
Plugins must inherit from this class and implement the abstract methods.
"""
@property
@abstractmethod
def id(self) -> str:
"""A unique identifier for the plugin."""
pass
@property
@abstractmethod
def name(self) -> str:
"""A human-readable name for the plugin."""
pass
@property
@abstractmethod
def description(self) -> str:
"""A brief description of what the plugin does."""
pass
@property
@abstractmethod
def version(self) -> str:
"""The version of the plugin."""
pass
@abstractmethod
def get_schema(self) -> Dict[str, Any]:
"""
Returns the JSON schema for the plugin's input parameters.
This schema will be used to generate the frontend form.
"""
pass
@abstractmethod
async def execute(self, params: Dict[str, Any]):
"""
Executes the plugin's logic.
The `params` argument will be validated against the schema returned by `get_schema()`.
"""
pass
# [/DEF]
# [DEF:PluginConfig:Class]
# @SEMANTICS: plugin, config, schema, pydantic
# @PURPOSE: A Pydantic model used to represent the validated configuration and metadata of a loaded plugin. This object is what gets exposed to the API layer.
# @LAYER: Core
# @RELATION: Instantiated by PluginLoader after validating a PluginBase instance.
class PluginConfig(BaseModel):
"""Pydantic model for plugin configuration."""
id: str = Field(..., description="Unique identifier for the plugin")
name: str = Field(..., description="Human-readable name for the plugin")
description: str = Field(..., description="Brief description of what the plugin does")
version: str = Field(..., description="Version of the plugin")
input_schema: Dict[str, Any] = Field(..., description="JSON schema for input parameters", alias="schema")
from abc import ABC, abstractmethod
from typing import Dict, Any
from pydantic import BaseModel, Field
# [DEF:PluginBase:Class]
# @SEMANTICS: plugin, interface, base, abstract
# @PURPOSE: Defines the abstract base class that all plugins must implement to be recognized by the system. It enforces a common structure for plugin metadata and execution.
# @LAYER: Core
# @RELATION: Used by PluginLoader to identify valid plugins.
# @INVARIANT: All plugins MUST inherit from this class.
class PluginBase(ABC):
"""
Base class for all plugins.
Plugins must inherit from this class and implement the abstract methods.
"""
@property
@abstractmethod
def id(self) -> str:
"""A unique identifier for the plugin."""
pass
@property
@abstractmethod
def name(self) -> str:
"""A human-readable name for the plugin."""
pass
@property
@abstractmethod
def description(self) -> str:
"""A brief description of what the plugin does."""
pass
@property
@abstractmethod
def version(self) -> str:
"""The version of the plugin."""
pass
@abstractmethod
def get_schema(self) -> Dict[str, Any]:
"""
Returns the JSON schema for the plugin's input parameters.
This schema will be used to generate the frontend form.
"""
pass
@abstractmethod
async def execute(self, params: Dict[str, Any]):
"""
Executes the plugin's logic.
The `params` argument will be validated against the schema returned by `get_schema()`.
"""
pass
# [/DEF]
# [DEF:PluginConfig:Class]
# @SEMANTICS: plugin, config, schema, pydantic
# @PURPOSE: A Pydantic model used to represent the validated configuration and metadata of a loaded plugin. This object is what gets exposed to the API layer.
# @LAYER: Core
# @RELATION: Instantiated by PluginLoader after validating a PluginBase instance.
class PluginConfig(BaseModel):
"""Pydantic model for plugin configuration."""
id: str = Field(..., description="Unique identifier for the plugin")
name: str = Field(..., description="Human-readable name for the plugin")
description: str = Field(..., description="Brief description of what the plugin does")
version: str = Field(..., description="Version of the plugin")
input_schema: Dict[str, Any] = Field(..., description="JSON schema for input parameters", alias="schema")
# [/DEF]
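A minimal plugin sketch (illustrative, not shipped in this commit) shows the surface PluginBase enforces; the relative import mirrors the `from ..core...` style that the loader's sys.path handling is designed to support for files under the plugins directory.
```python
from typing import Any, Dict
from ..core.plugin_base import PluginBase

class HelloPlugin(PluginBase):
    """A toy plugin that greets someone; it exists only to demonstrate the interface."""

    @property
    def id(self) -> str:
        return "hello"

    @property
    def name(self) -> str:
        return "Hello Plugin"

    @property
    def description(self) -> str:
        return "Prints a greeting to demonstrate the plugin contract."

    @property
    def version(self) -> str:
        return "0.1.0"

    def get_schema(self) -> Dict[str, Any]:
        # JSON schema the frontend uses to render the parameter form.
        return {
            "type": "object",
            "properties": {"who": {"type": "string", "title": "Who to greet"}},
            "required": ["who"],
        }

    async def execute(self, params: Dict[str, Any]):
        print(f"Hello, {params['who']}!")
```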

251
backend/src/core/plugin_loader.py Normal file → Executable file

@@ -1,123 +1,130 @@
import importlib.util
import os
import sys # Added this line
from typing import Dict, Type, List, Optional
from .plugin_base import PluginBase, PluginConfig
from jsonschema import validate
# [DEF:PluginLoader:Class]
# @SEMANTICS: plugin, loader, dynamic, import
# @PURPOSE: Scans a specified directory for Python modules, dynamically loads them, and registers any classes that are valid implementations of the PluginBase interface.
# @LAYER: Core
# @RELATION: Depends on PluginBase. It is used by the main application to discover and manage available plugins.
class PluginLoader:
"""
Scans a directory for Python modules, loads them, and identifies classes
that inherit from PluginBase.
"""
def __init__(self, plugin_dir: str):
self.plugin_dir = plugin_dir
self._plugins: Dict[str, PluginBase] = {}
self._plugin_configs: Dict[str, PluginConfig] = {}
self._load_plugins()
def _load_plugins(self):
"""
Scans the plugin directory, imports modules, and registers valid plugins.
"""
if not os.path.exists(self.plugin_dir):
os.makedirs(self.plugin_dir)
# Add the plugin directory's parent to sys.path to enable relative imports within plugins
# This assumes plugin_dir is something like 'backend/src/plugins'
# and we want 'backend/src' to be on the path for 'from ..core...' imports
plugin_parent_dir = os.path.abspath(os.path.join(self.plugin_dir, os.pardir))
if plugin_parent_dir not in sys.path:
sys.path.insert(0, plugin_parent_dir)
for filename in os.listdir(self.plugin_dir):
if filename.endswith(".py") and filename != "__init__.py":
module_name = filename[:-3]
file_path = os.path.join(self.plugin_dir, filename)
self._load_module(module_name, file_path)
def _load_module(self, module_name: str, file_path: str):
"""
Loads a single Python module and extracts PluginBase subclasses.
"""
package_name = f"src.plugins.{module_name}"
spec = importlib.util.spec_from_file_location(package_name, file_path)
if spec is None or spec.loader is None:
print(f"Could not load module spec for {package_name}") # Replace with proper logging
return
module = importlib.util.module_from_spec(spec)
try:
spec.loader.exec_module(module)
except Exception as e:
print(f"Error loading plugin module {module_name}: {e}") # Replace with proper logging
return
for attribute_name in dir(module):
attribute = getattr(module, attribute_name)
if (
isinstance(attribute, type)
and issubclass(attribute, PluginBase)
and attribute is not PluginBase
):
try:
plugin_instance = attribute()
self._register_plugin(plugin_instance)
except Exception as e:
print(f"Error instantiating plugin {attribute_name} in {module_name}: {e}") # Replace with proper logging
def _register_plugin(self, plugin_instance: PluginBase):
"""
Registers a valid plugin instance.
"""
plugin_id = plugin_instance.id
if plugin_id in self._plugins:
print(f"Warning: Duplicate plugin ID '{plugin_id}' found. Skipping.") # Replace with proper logging
return
try:
schema = plugin_instance.get_schema()
# Basic validation to ensure it's a dictionary
if not isinstance(schema, dict):
raise TypeError("get_schema() must return a dictionary.")
plugin_config = PluginConfig(
id=plugin_instance.id,
name=plugin_instance.name,
description=plugin_instance.description,
version=plugin_instance.version,
schema=schema,
)
# The following line is commented out because it requires a schema to be passed to validate against.
# The schema provided by the plugin is the one being validated, not the data.
# validate(instance={}, schema=schema)
self._plugins[plugin_id] = plugin_instance
self._plugin_configs[plugin_id] = plugin_config
print(f"Plugin '{plugin_instance.name}' (ID: {plugin_id}) loaded successfully.") # Replace with proper logging
except Exception as e:
print(f"Error validating plugin '{plugin_instance.name}' (ID: {plugin_id}): {e}") # Replace with proper logging
def get_plugin(self, plugin_id: str) -> Optional[PluginBase]:
"""
Returns a loaded plugin instance by its ID.
"""
return self._plugins.get(plugin_id)
def get_all_plugin_configs(self) -> List[PluginConfig]:
"""
Returns a list of all loaded plugin configurations.
"""
return list(self._plugin_configs.values())
def has_plugin(self, plugin_id: str) -> bool:
"""
Checks if a plugin with the given ID is loaded.
"""
import importlib.util
import os
import sys  # needed so the plugin directory's parent can be added to sys.path
from typing import Dict, Type, List, Optional
from .plugin_base import PluginBase, PluginConfig
from jsonschema import validate
# [DEF:PluginLoader:Class]
# @SEMANTICS: plugin, loader, dynamic, import
# @PURPOSE: Scans a specified directory for Python modules, dynamically loads them, and registers any classes that are valid implementations of the PluginBase interface.
# @LAYER: Core
# @RELATION: Depends on PluginBase. It is used by the main application to discover and manage available plugins.
class PluginLoader:
"""
Scans a directory for Python modules, loads them, and identifies classes
that inherit from PluginBase.
"""
def __init__(self, plugin_dir: str):
self.plugin_dir = plugin_dir
self._plugins: Dict[str, PluginBase] = {}
self._plugin_configs: Dict[str, PluginConfig] = {}
self._load_plugins()
def _load_plugins(self):
"""
Scans the plugin directory, imports modules, and registers valid plugins.
"""
if not os.path.exists(self.plugin_dir):
os.makedirs(self.plugin_dir)
# Add the plugin directory's parent to sys.path to enable relative imports within plugins
# This assumes plugin_dir is something like 'backend/src/plugins'
# and we want 'backend/src' to be on the path for 'from ..core...' imports
plugin_parent_dir = os.path.abspath(os.path.join(self.plugin_dir, os.pardir))
if plugin_parent_dir not in sys.path:
sys.path.insert(0, plugin_parent_dir)
for filename in os.listdir(self.plugin_dir):
if filename.endswith(".py") and filename != "__init__.py":
module_name = filename[:-3]
file_path = os.path.join(self.plugin_dir, filename)
self._load_module(module_name, file_path)
def _load_module(self, module_name: str, file_path: str):
"""
Loads a single Python module and extracts PluginBase subclasses.
"""
# Try to determine the correct package prefix based on how the app is running
if "backend.src" in __name__:
package_prefix = "backend.src.plugins"
else:
package_prefix = "src.plugins"
package_name = f"{package_prefix}.{module_name}"
# print(f"DEBUG: Loading plugin {module_name} as {package_name}")
spec = importlib.util.spec_from_file_location(package_name, file_path)
if spec is None or spec.loader is None:
print(f"Could not load module spec for {package_name}") # Replace with proper logging
return
module = importlib.util.module_from_spec(spec)
try:
spec.loader.exec_module(module)
except Exception as e:
print(f"Error loading plugin module {module_name}: {e}") # Replace with proper logging
return
for attribute_name in dir(module):
attribute = getattr(module, attribute_name)
if (
isinstance(attribute, type)
and issubclass(attribute, PluginBase)
and attribute is not PluginBase
):
try:
plugin_instance = attribute()
self._register_plugin(plugin_instance)
except Exception as e:
print(f"Error instantiating plugin {attribute_name} in {module_name}: {e}") # Replace with proper logging
def _register_plugin(self, plugin_instance: PluginBase):
"""
Registers a valid plugin instance.
"""
plugin_id = plugin_instance.id
if plugin_id in self._plugins:
print(f"Warning: Duplicate plugin ID '{plugin_id}' found. Skipping.") # Replace with proper logging
return
try:
schema = plugin_instance.get_schema()
# Basic validation to ensure it's a dictionary
if not isinstance(schema, dict):
raise TypeError("get_schema() must return a dictionary.")
plugin_config = PluginConfig(
id=plugin_instance.id,
name=plugin_instance.name,
description=plugin_instance.description,
version=plugin_instance.version,
schema=schema,
)
# The following line is commented out because it requires a schema to be passed to validate against.
# The schema provided by the plugin is the one being validated, not the data.
# validate(instance={}, schema=schema)
self._plugins[plugin_id] = plugin_instance
self._plugin_configs[plugin_id] = plugin_config
print(f"Plugin '{plugin_instance.name}' (ID: {plugin_id}) loaded successfully.") # Replace with proper logging
except Exception as e:
print(f"Error validating plugin '{plugin_instance.name}' (ID: {plugin_id}): {e}") # Replace with proper logging
def get_plugin(self, plugin_id: str) -> Optional[PluginBase]:
"""
Returns a loaded plugin instance by its ID.
"""
return self._plugins.get(plugin_id)
def get_all_plugin_configs(self) -> List[PluginConfig]:
"""
Returns a list of all loaded plugin configurations.
"""
return list(self._plugin_configs.values())
def has_plugin(self, plugin_id: str) -> bool:
"""
Checks if a plugin with the given ID is loaded.
"""
return plugin_id in self._plugins
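A minimal usage sketch for the loader; the import path and the plugin ID below are assumptions based on the other files in this commit, not code shown here.
```python
from backend.src.core.plugin_loader import PluginLoader

# Point the loader at the plugins package; it scans, imports and registers on construction.
loader = PluginLoader(plugin_dir="backend/src/plugins")

# Enumerate everything that was discovered.
for cfg in loader.get_all_plugin_configs():
    print(f"{cfg.id} v{cfg.version}: {cfg.description}")

# Look up a single plugin by ID before handing it to the task manager.
if loader.has_plugin("superset-backup"):
    print(loader.get_plugin("superset-backup").name)
```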

262
backend/src/core/task_manager.py Normal file → Executable file
View File

@@ -1,131 +1,131 @@
# [DEF:TaskManagerModule:Module]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking. It uses a thread pool to run plugins asynchronously.
# @LAYER: Core
# @RELATION: Depends on PluginLoader to get plugin instances. It is used by the API layer to create and query tasks.
import asyncio
import uuid
from datetime import datetime
from enum import Enum
from typing import Dict, Any, List, Optional
from concurrent.futures import ThreadPoolExecutor
from pydantic import BaseModel, Field
# Assuming PluginBase and PluginConfig are defined in plugin_base.py
# from .plugin_base import PluginBase, PluginConfig # Not needed here, TaskManager interacts with the PluginLoader
# [DEF:TaskStatus:Enum]
# @SEMANTICS: task, status, state, enum
# @PURPOSE: Defines the possible states a task can be in during its lifecycle.
class TaskStatus(str, Enum):
PENDING = "PENDING"
RUNNING = "RUNNING"
SUCCESS = "SUCCESS"
FAILED = "FAILED"
# [/DEF]
# [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic
# @PURPOSE: A Pydantic model representing a single, structured log entry associated with a task.
class LogEntry(BaseModel):
timestamp: datetime = Field(default_factory=datetime.utcnow)
level: str
message: str
context: Optional[Dict[str, Any]] = None
# [/DEF]
# [DEF:Task:Class]
# @SEMANTICS: task, job, execution, state, pydantic
# @PURPOSE: A Pydantic model representing a single execution instance of a plugin, including its status, parameters, and logs.
class Task(BaseModel):
id: str = Field(default_factory=lambda: str(uuid.uuid4()))
plugin_id: str
status: TaskStatus = TaskStatus.PENDING
started_at: Optional[datetime] = None
finished_at: Optional[datetime] = None
user_id: Optional[str] = None
logs: List[LogEntry] = Field(default_factory=list)
params: Dict[str, Any] = Field(default_factory=dict)
# [/DEF]
# [DEF:TaskManager:Class]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking.
class TaskManager:
"""
Manages the lifecycle of tasks, including their creation, execution, and state tracking.
"""
def __init__(self, plugin_loader):
self.plugin_loader = plugin_loader
self.tasks: Dict[str, Task] = {}
self.executor = ThreadPoolExecutor(max_workers=5) # For CPU-bound plugin execution
self.loop = asyncio.get_event_loop()
# [/DEF]
async def create_task(self, plugin_id: str, params: Dict[str, Any], user_id: Optional[str] = None) -> Task:
"""
Creates and queues a new task for execution.
"""
if not self.plugin_loader.has_plugin(plugin_id):
raise ValueError(f"Plugin with ID '{plugin_id}' not found.")
plugin = self.plugin_loader.get_plugin(plugin_id)
# Validate params against plugin schema (this will be done at a higher level, e.g., API route)
# For now, a basic check
if not isinstance(params, dict):
raise ValueError("Task parameters must be a dictionary.")
task = Task(plugin_id=plugin_id, params=params, user_id=user_id)
self.tasks[task.id] = task
self.loop.create_task(self._run_task(task.id)) # Schedule task for execution
return task
async def _run_task(self, task_id: str):
"""
Internal method to execute a task.
"""
task = self.tasks[task_id]
plugin = self.plugin_loader.get_plugin(task.plugin_id)
task.status = TaskStatus.RUNNING
task.started_at = datetime.utcnow()
task.logs.append(LogEntry(level="INFO", message=f"Task started for plugin '{plugin.name}'"))
try:
# Execute plugin in a separate thread to avoid blocking the event loop
# if the plugin's execute method is synchronous and potentially CPU-bound.
# If the plugin's execute method is already async, this can be simplified.
await self.loop.run_in_executor(
self.executor,
lambda: asyncio.run(plugin.execute(task.params)) if asyncio.iscoroutinefunction(plugin.execute) else plugin.execute(task.params)
)
task.status = TaskStatus.SUCCESS
task.logs.append(LogEntry(level="INFO", message=f"Task completed successfully for plugin '{plugin.name}'"))
except Exception as e:
task.status = TaskStatus.FAILED
task.logs.append(LogEntry(level="ERROR", message=f"Task failed: {e}", context={"error_type": type(e).__name__}))
finally:
task.finished_at = datetime.utcnow()
# In a real system, you might notify clients via WebSocket here
def get_task(self, task_id: str) -> Optional[Task]:
"""
Retrieves a task by its ID.
"""
return self.tasks.get(task_id)
def get_all_tasks(self) -> List[Task]:
"""
Retrieves all registered tasks.
"""
return list(self.tasks.values())
def get_task_logs(self, task_id: str) -> List[LogEntry]:
"""
Retrieves logs for a specific task.
"""
task = self.tasks.get(task_id)
return task.logs if task else []
# [DEF:TaskManagerModule:Module]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking. It uses a thread pool to run plugins asynchronously.
# @LAYER: Core
# @RELATION: Depends on PluginLoader to get plugin instances. It is used by the API layer to create and query tasks.
import asyncio
import uuid
from datetime import datetime
from enum import Enum
from typing import Dict, Any, List, Optional
from concurrent.futures import ThreadPoolExecutor
from pydantic import BaseModel, Field
# Assuming PluginBase and PluginConfig are defined in plugin_base.py
# from .plugin_base import PluginBase, PluginConfig # Not needed here, TaskManager interacts with the PluginLoader
# [DEF:TaskStatus:Enum]
# @SEMANTICS: task, status, state, enum
# @PURPOSE: Defines the possible states a task can be in during its lifecycle.
class TaskStatus(str, Enum):
PENDING = "PENDING"
RUNNING = "RUNNING"
SUCCESS = "SUCCESS"
FAILED = "FAILED"
# [/DEF]
# [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic
# @PURPOSE: A Pydantic model representing a single, structured log entry associated with a task.
class LogEntry(BaseModel):
timestamp: datetime = Field(default_factory=datetime.utcnow)
level: str
message: str
context: Optional[Dict[str, Any]] = None
# [/DEF]
# [DEF:Task:Class]
# @SEMANTICS: task, job, execution, state, pydantic
# @PURPOSE: A Pydantic model representing a single execution instance of a plugin, including its status, parameters, and logs.
class Task(BaseModel):
id: str = Field(default_factory=lambda: str(uuid.uuid4()))
plugin_id: str
status: TaskStatus = TaskStatus.PENDING
started_at: Optional[datetime] = None
finished_at: Optional[datetime] = None
user_id: Optional[str] = None
logs: List[LogEntry] = Field(default_factory=list)
params: Dict[str, Any] = Field(default_factory=dict)
# [/DEF]
# [DEF:TaskManager:Class]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking.
class TaskManager:
"""
Manages the lifecycle of tasks, including their creation, execution, and state tracking.
"""
def __init__(self, plugin_loader):
self.plugin_loader = plugin_loader
self.tasks: Dict[str, Task] = {}
self.executor = ThreadPoolExecutor(max_workers=5) # For CPU-bound plugin execution
self.loop = asyncio.get_event_loop()
# [/DEF]
async def create_task(self, plugin_id: str, params: Dict[str, Any], user_id: Optional[str] = None) -> Task:
"""
Creates and queues a new task for execution.
"""
if not self.plugin_loader.has_plugin(plugin_id):
raise ValueError(f"Plugin with ID '{plugin_id}' not found.")
plugin = self.plugin_loader.get_plugin(plugin_id)
# Validate params against plugin schema (this will be done at a higher level, e.g., API route)
# For now, a basic check
if not isinstance(params, dict):
raise ValueError("Task parameters must be a dictionary.")
task = Task(plugin_id=plugin_id, params=params, user_id=user_id)
self.tasks[task.id] = task
self.loop.create_task(self._run_task(task.id)) # Schedule task for execution
return task
async def _run_task(self, task_id: str):
"""
Internal method to execute a task.
"""
task = self.tasks[task_id]
plugin = self.plugin_loader.get_plugin(task.plugin_id)
task.status = TaskStatus.RUNNING
task.started_at = datetime.utcnow()
task.logs.append(LogEntry(level="INFO", message=f"Task started for plugin '{plugin.name}'"))
try:
# Execute plugin in a separate thread to avoid blocking the event loop
# if the plugin's execute method is synchronous and potentially CPU-bound.
# If the plugin's execute method is already async, this can be simplified.
await self.loop.run_in_executor(
self.executor,
lambda: asyncio.run(plugin.execute(task.params)) if asyncio.iscoroutinefunction(plugin.execute) else plugin.execute(task.params)
)
task.status = TaskStatus.SUCCESS
task.logs.append(LogEntry(level="INFO", message=f"Task completed successfully for plugin '{plugin.name}'"))
except Exception as e:
task.status = TaskStatus.FAILED
task.logs.append(LogEntry(level="ERROR", message=f"Task failed: {e}", context={"error_type": type(e).__name__}))
finally:
task.finished_at = datetime.utcnow()
# In a real system, you might notify clients via WebSocket here
def get_task(self, task_id: str) -> Optional[Task]:
"""
Retrieves a task by its ID.
"""
return self.tasks.get(task_id)
def get_all_tasks(self) -> List[Task]:
"""
Retrieves all registered tasks.
"""
return list(self.tasks.values())
def get_task_logs(self, task_id: str) -> List[LogEntry]:
"""
Retrieves logs for a specific task.
"""
task = self.tasks.get(task_id)
return task.logs if task else []
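A sketch of driving the manager directly from an asyncio entry point; the plugin ID and parameters are illustrative, and in the application this flow is triggered by the API layer instead.
```python
import asyncio
from backend.src.core.plugin_loader import PluginLoader
from backend.src.core.task_manager import TaskManager, TaskStatus

async def main() -> None:
    manager = TaskManager(PluginLoader(plugin_dir="backend/src/plugins"))

    # create_task validates the plugin ID, stores the Task and schedules _run_task in the background.
    task = await manager.create_task(
        plugin_id="superset-backup",
        params={"env": "dev", "backup_path": "/tmp/superset-backups"},
    )

    # Poll until the background execution settles, then dump the collected log entries.
    while manager.get_task(task.id).status in (TaskStatus.PENDING, TaskStatus.RUNNING):
        await asyncio.sleep(0.5)
    for entry in manager.get_task_logs(task.id):
        print(entry.level, entry.message)

asyncio.run(main())
```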

55
backend/src/dependencies.py Normal file → Executable file
View File

@@ -1,24 +1,33 @@
# [DEF:Dependencies:Module]
# @SEMANTICS: dependency, injection, singleton, factory
# @PURPOSE: Manages the creation and provision of shared application dependencies, such as the PluginLoader and TaskManager, to avoid circular imports.
# @LAYER: Core
# @RELATION: Used by the main app and API routers to get access to shared instances.
from pathlib import Path
from .core.plugin_loader import PluginLoader
from .core.task_manager import TaskManager
# Initialize singletons
# Use absolute path relative to this file to ensure plugins are found regardless of CWD
plugin_dir = Path(__file__).parent / "plugins"
plugin_loader = PluginLoader(plugin_dir=str(plugin_dir))
task_manager = TaskManager(plugin_loader)
def get_plugin_loader() -> PluginLoader:
"""Dependency injector for the PluginLoader."""
return plugin_loader
def get_task_manager() -> TaskManager:
"""Dependency injector for the TaskManager."""
return task_manager
# [DEF:Dependencies:Module]
# @SEMANTICS: dependency, injection, singleton, factory
# @PURPOSE: Manages the creation and provision of shared application dependencies, such as the PluginLoader and TaskManager, to avoid circular imports.
# @LAYER: Core
# @RELATION: Used by the main app and API routers to get access to shared instances.
from pathlib import Path
from .core.plugin_loader import PluginLoader
from .core.task_manager import TaskManager
from .core.config_manager import ConfigManager
# Initialize singletons
# Use absolute path relative to this file to ensure plugins are found regardless of CWD
project_root = Path(__file__).parent.parent.parent
config_path = project_root / "config.json"
config_manager = ConfigManager(config_path=str(config_path))
def get_config_manager() -> ConfigManager:
"""Dependency injector for the ConfigManager."""
return config_manager
plugin_dir = Path(__file__).parent / "plugins"
plugin_loader = PluginLoader(plugin_dir=str(plugin_dir))
task_manager = TaskManager(plugin_loader)
def get_plugin_loader() -> PluginLoader:
"""Dependency injector for the PluginLoader."""
return plugin_loader
def get_task_manager() -> TaskManager:
"""Dependency injector for the TaskManager."""
return task_manager
# [/DEF]
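A hedged sketch of how an API router could consume these injectors. FastAPI, the import path, and the route paths are assumptions drawn from the commit description and the `/settings` endpoints documented further down, not code shown in this diff.
```python
from fastapi import APIRouter, Depends
from backend.src.dependencies import get_config_manager, get_task_manager
from backend.src.core.config_manager import ConfigManager
from backend.src.core.task_manager import TaskManager

router = APIRouter()

@router.get("/tasks")
async def list_tasks(manager: TaskManager = Depends(get_task_manager)):
    # Every request reuses the module-level singleton rather than rebuilding state.
    return manager.get_all_tasks()

@router.get("/settings/environments")
async def list_environments(config: ConfigManager = Depends(get_config_manager)):
    # The real settings router masks credentials; this sketch skips that step.
    return config.get_environments()
```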

249
backend/src/plugins/backup.py Normal file → Executable file
View File

@@ -1,121 +1,130 @@
# [DEF:BackupPlugin:Module]
# @SEMANTICS: backup, superset, automation, dashboard, plugin
# @PURPOSE: A plugin that provides functionality to back up Superset dashboards.
# @LAYER: App
# @RELATION: IMPLEMENTS -> PluginBase
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
from typing import Dict, Any
from pathlib import Path
from requests.exceptions import RequestException
from ..core.plugin_base import PluginBase
from superset_tool.client import SupersetClient
from superset_tool.exceptions import SupersetAPIError
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.fileio import (
save_and_unpack_dashboard,
archive_exports,
sanitize_filename,
consolidate_archive_folders,
remove_empty_directories,
RetentionPolicy
)
from superset_tool.utils.init_clients import setup_clients
class BackupPlugin(PluginBase):
"""
A plugin to back up Superset dashboards.
"""
@property
def id(self) -> str:
return "superset-backup"
@property
def name(self) -> str:
return "Superset Dashboard Backup"
@property
def description(self) -> str:
return "Backs up all dashboards from a Superset instance."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"env": {
"type": "string",
"title": "Environment",
"description": "The Superset environment to back up (e.g., 'dev', 'prod').",
"enum": ["dev", "sbx", "prod", "preprod"],
},
"backup_path": {
"type": "string",
"title": "Backup Path",
"description": "The root directory to save backups to.",
"default": "P:\\Superset\\010 Бекапы"
}
},
"required": ["env", "backup_path"],
}
async def execute(self, params: Dict[str, Any]):
env = params["env"]
backup_path = Path(params["backup_path"])
logger = SupersetLogger(log_dir=backup_path / "Logs", console=True)
logger.info(f"[BackupPlugin][Entry] Starting backup for {env}.")
try:
clients = setup_clients(logger)
client = clients[env]
dashboard_count, dashboard_meta = client.get_dashboards()
logger.info(f"[BackupPlugin][Progress] Found {dashboard_count} dashboards to export in {env}.")
if dashboard_count == 0:
logger.info("[BackupPlugin][Exit] No dashboards to back up.")
return
for db in dashboard_meta:
dashboard_id = db.get('id')
dashboard_title = db.get('dashboard_title', 'Unknown Dashboard')
if not dashboard_id:
continue
try:
dashboard_base_dir_name = sanitize_filename(f"{dashboard_title}")
dashboard_dir = backup_path / env.upper() / dashboard_base_dir_name
dashboard_dir.mkdir(parents=True, exist_ok=True)
zip_content, filename = client.export_dashboard(dashboard_id)
save_and_unpack_dashboard(
zip_content=zip_content,
original_filename=filename,
output_dir=dashboard_dir,
unpack=False,
logger=logger
)
archive_exports(str(dashboard_dir), policy=RetentionPolicy(), logger=logger)
except (SupersetAPIError, RequestException, IOError, OSError) as db_error:
logger.error(f"[BackupPlugin][Failure] Failed to export dashboard {dashboard_title} (ID: {dashboard_id}): {db_error}", exc_info=True)
continue
consolidate_archive_folders(backup_path / env.upper(), logger=logger)
remove_empty_directories(str(backup_path / env.upper()), logger=logger)
logger.info(f"[BackupPlugin][CoherenceCheck:Passed] Backup logic completed for {env}.")
except (RequestException, IOError, KeyError) as e:
logger.critical(f"[BackupPlugin][Failure] Fatal error during backup for {env}: {e}", exc_info=True)
raise e
# [DEF:BackupPlugin:Module]
# @SEMANTICS: backup, superset, automation, dashboard, plugin
# @PURPOSE: A plugin that provides functionality to back up Superset dashboards.
# @LAYER: App
# @RELATION: IMPLEMENTS -> PluginBase
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
from typing import Dict, Any
from pathlib import Path
from requests.exceptions import RequestException
from ..core.plugin_base import PluginBase
from superset_tool.client import SupersetClient
from superset_tool.exceptions import SupersetAPIError
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.fileio import (
save_and_unpack_dashboard,
archive_exports,
sanitize_filename,
consolidate_archive_folders,
remove_empty_directories,
RetentionPolicy
)
from superset_tool.utils.init_clients import setup_clients
from ..dependencies import get_config_manager
class BackupPlugin(PluginBase):
"""
A plugin to back up Superset dashboards.
"""
@property
def id(self) -> str:
return "superset-backup"
@property
def name(self) -> str:
return "Superset Dashboard Backup"
@property
def description(self) -> str:
return "Backs up all dashboards from a Superset instance."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
config_manager = get_config_manager()
envs = [e.name for e in config_manager.get_environments()]
default_path = config_manager.get_config().settings.backup_path
return {
"type": "object",
"properties": {
"env": {
"type": "string",
"title": "Environment",
"description": "The Superset environment to back up.",
"enum": envs if envs else ["dev", "prod"],
},
"backup_path": {
"type": "string",
"title": "Backup Path",
"description": "The root directory to save backups to.",
"default": default_path
}
},
"required": ["env", "backup_path"],
}
async def execute(self, params: Dict[str, Any]):
env = params["env"]
backup_path = Path(params["backup_path"])
logger = SupersetLogger(log_dir=backup_path / "Logs", console=True)
logger.info(f"[BackupPlugin][Entry] Starting backup for {env}.")
try:
config_manager = get_config_manager()
clients = setup_clients(logger, custom_envs=config_manager.get_environments())
client = clients.get(env)
if not client:
raise ValueError(f"Environment '{env}' not found in configuration.")
dashboard_count, dashboard_meta = client.get_dashboards()
logger.info(f"[BackupPlugin][Progress] Found {dashboard_count} dashboards to export in {env}.")
if dashboard_count == 0:
logger.info("[BackupPlugin][Exit] No dashboards to back up.")
return
for db in dashboard_meta:
dashboard_id = db.get('id')
dashboard_title = db.get('dashboard_title', 'Unknown Dashboard')
if not dashboard_id:
continue
try:
dashboard_base_dir_name = sanitize_filename(f"{dashboard_title}")
dashboard_dir = backup_path / env.upper() / dashboard_base_dir_name
dashboard_dir.mkdir(parents=True, exist_ok=True)
zip_content, filename = client.export_dashboard(dashboard_id)
save_and_unpack_dashboard(
zip_content=zip_content,
original_filename=filename,
output_dir=dashboard_dir,
unpack=False,
logger=logger
)
archive_exports(str(dashboard_dir), policy=RetentionPolicy(), logger=logger)
except (SupersetAPIError, RequestException, IOError, OSError) as db_error:
logger.error(f"[BackupPlugin][Failure] Failed to export dashboard {dashboard_title} (ID: {dashboard_id}): {db_error}", exc_info=True)
continue
consolidate_archive_folders(backup_path / env.upper(), logger=logger)
remove_empty_directories(str(backup_path / env.upper()), logger=logger)
logger.info(f"[BackupPlugin][CoherenceCheck:Passed] Backup logic completed for {env}.")
except (RequestException, IOError, KeyError) as e:
logger.critical(f"[BackupPlugin][Failure] Fatal error during backup for {env}: {e}", exc_info=True)
raise e
# [/DEF:BackupPlugin]
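For illustration, with two environments named "dev" and "prod" in `config.json` and a configured `backup_path` of `/srv/superset/backups` (hypothetical values), the `get_schema()` above would resolve to roughly:
```python
# Hypothetical result: the enum and default come from ConfigManager at call time.
backup_schema = {
    "type": "object",
    "properties": {
        "env": {
            "type": "string",
            "title": "Environment",
            "description": "The Superset environment to back up.",
            "enum": ["dev", "prod"],
        },
        "backup_path": {
            "type": "string",
            "title": "Backup Path",
            "description": "The root directory to save backups to.",
            "default": "/srv/superset/backups",
        },
    },
    "required": ["env", "backup_path"],
}
```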

306
backend/src/plugins/migration.py Normal file → Executable file
View File

@@ -1,150 +1,158 @@
# [DEF:MigrationPlugin:Module]
# @SEMANTICS: migration, superset, automation, dashboard, plugin
# @PURPOSE: A plugin that provides functionality to migrate Superset dashboards between environments.
# @LAYER: App
# @RELATION: IMPLEMENTS -> PluginBase
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
from typing import Dict, Any, List
from pathlib import Path
import zipfile
import re
from ..core.plugin_base import PluginBase
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.fileio import create_temp_file, update_yamls, create_dashboard_export
from superset_tool.utils.logger import SupersetLogger
class MigrationPlugin(PluginBase):
"""
A plugin to migrate Superset dashboards between environments.
"""
@property
def id(self) -> str:
return "superset-migration"
@property
def name(self) -> str:
return "Superset Dashboard Migration"
@property
def description(self) -> str:
return "Migrates dashboards between Superset environments."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"from_env": {
"type": "string",
"title": "Source Environment",
"description": "The environment to migrate from.",
"enum": ["dev", "sbx", "prod", "preprod"],
},
"to_env": {
"type": "string",
"title": "Target Environment",
"description": "The environment to migrate to.",
"enum": ["dev", "sbx", "prod", "preprod"],
},
"dashboard_regex": {
"type": "string",
"title": "Dashboard Regex",
"description": "A regular expression to filter dashboards to migrate.",
},
"replace_db_config": {
"type": "boolean",
"title": "Replace DB Config",
"description": "Whether to replace the database configuration.",
"default": False,
},
"from_db_id": {
"type": "integer",
"title": "Source DB ID",
"description": "The ID of the source database to replace (if replacing).",
},
"to_db_id": {
"type": "integer",
"title": "Target DB ID",
"description": "The ID of the target database to replace with (if replacing).",
},
},
"required": ["from_env", "to_env", "dashboard_regex"],
}
async def execute(self, params: Dict[str, Any]):
from_env = params["from_env"]
to_env = params["to_env"]
dashboard_regex = params["dashboard_regex"]
replace_db_config = params.get("replace_db_config", False)
from_db_id = params.get("from_db_id")
to_db_id = params.get("to_db_id")
logger = SupersetLogger(log_dir=Path.cwd() / "logs", console=True)
logger.info(f"[MigrationPlugin][Entry] Starting migration from {from_env} to {to_env}.")
try:
all_clients = setup_clients(logger)
from_c = all_clients[from_env]
to_c = all_clients[to_env]
_, all_dashboards = from_c.get_dashboards()
regex_str = str(dashboard_regex)
dashboards_to_migrate = [
d for d in all_dashboards if re.search(regex_str, d["dashboard_title"], re.IGNORECASE)
]
if not dashboards_to_migrate:
logger.warning("[MigrationPlugin][State] No dashboards found matching the regex.")
return
db_config_replacement = None
if replace_db_config:
if from_db_id is None or to_db_id is None:
raise ValueError("Source and target database IDs are required when replacing database configuration.")
from_db = from_c.get_database(int(from_db_id))
to_db = to_c.get_database(int(to_db_id))
old_result = from_db.get("result", {})
new_result = to_db.get("result", {})
db_config_replacement = {
"old": {"database_name": old_result.get("database_name"), "uuid": old_result.get("uuid"), "id": str(from_db.get("id"))},
"new": {"database_name": new_result.get("database_name"), "uuid": new_result.get("uuid"), "id": str(to_db.get("id"))}
}
for dash in dashboards_to_migrate:
dash_id, dash_slug, title = dash["id"], dash.get("slug"), dash["dashboard_title"]
try:
exported_content, _ = from_c.export_dashboard(dash_id)
with create_temp_file(content=exported_content, dry_run=True, suffix=".zip", logger=logger) as tmp_zip_path:
if not db_config_replacement:
to_c.import_dashboard(file_name=tmp_zip_path, dash_id=dash_id, dash_slug=dash_slug)
else:
with create_temp_file(suffix=".dir", logger=logger) as tmp_unpack_dir:
with zipfile.ZipFile(tmp_zip_path, "r") as zip_ref:
zip_ref.extractall(tmp_unpack_dir)
update_yamls(db_configs=[db_config_replacement], path=str(tmp_unpack_dir))
with create_temp_file(suffix=".zip", dry_run=True, logger=logger) as tmp_new_zip:
create_dashboard_export(zip_path=tmp_new_zip, source_paths=[str(p) for p in Path(tmp_unpack_dir).glob("**/*")])
to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug)
logger.info(f"[MigrationPlugin][Success] Dashboard {title} imported.")
except Exception as exc:
logger.error(f"[MigrationPlugin][Failure] Failed to migrate dashboard {title}: {exc}", exc_info=True)
logger.info("[MigrationPlugin][Exit] Migration finished.")
except Exception as e:
logger.critical(f"[MigrationPlugin][Failure] Fatal error during migration: {e}", exc_info=True)
raise e
# [DEF:MigrationPlugin:Module]
# @SEMANTICS: migration, superset, automation, dashboard, plugin
# @PURPOSE: A plugin that provides functionality to migrate Superset dashboards between environments.
# @LAYER: App
# @RELATION: IMPLEMENTS -> PluginBase
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
from typing import Dict, Any, List
from pathlib import Path
import zipfile
import re
from ..core.plugin_base import PluginBase
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.fileio import create_temp_file, update_yamls, create_dashboard_export
from ..dependencies import get_config_manager
from superset_tool.utils.logger import SupersetLogger
class MigrationPlugin(PluginBase):
"""
A plugin to migrate Superset dashboards between environments.
"""
@property
def id(self) -> str:
return "superset-migration"
@property
def name(self) -> str:
return "Superset Dashboard Migration"
@property
def description(self) -> str:
return "Migrates dashboards between Superset environments."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
config_manager = get_config_manager()
envs = [e.name for e in config_manager.get_environments()]
return {
"type": "object",
"properties": {
"from_env": {
"type": "string",
"title": "Source Environment",
"description": "The environment to migrate from.",
"enum": envs if envs else ["dev", "prod"],
},
"to_env": {
"type": "string",
"title": "Target Environment",
"description": "The environment to migrate to.",
"enum": envs if envs else ["dev", "prod"],
},
"dashboard_regex": {
"type": "string",
"title": "Dashboard Regex",
"description": "A regular expression to filter dashboards to migrate.",
},
"replace_db_config": {
"type": "boolean",
"title": "Replace DB Config",
"description": "Whether to replace the database configuration.",
"default": False,
},
"from_db_id": {
"type": "integer",
"title": "Source DB ID",
"description": "The ID of the source database to replace (if replacing).",
},
"to_db_id": {
"type": "integer",
"title": "Target DB ID",
"description": "The ID of the target database to replace with (if replacing).",
},
},
"required": ["from_env", "to_env", "dashboard_regex"],
}
async def execute(self, params: Dict[str, Any]):
from_env = params["from_env"]
to_env = params["to_env"]
dashboard_regex = params["dashboard_regex"]
replace_db_config = params.get("replace_db_config", False)
from_db_id = params.get("from_db_id")
to_db_id = params.get("to_db_id")
logger = SupersetLogger(log_dir=Path.cwd() / "logs", console=True)
logger.info(f"[MigrationPlugin][Entry] Starting migration from {from_env} to {to_env}.")
try:
config_manager = get_config_manager()
all_clients = setup_clients(logger, custom_envs=config_manager.get_environments())
from_c = all_clients.get(from_env)
to_c = all_clients.get(to_env)
if not from_c or not to_c:
raise ValueError(f"One or both environments ('{from_env}', '{to_env}') not found in configuration.")
_, all_dashboards = from_c.get_dashboards()
regex_str = str(dashboard_regex)
dashboards_to_migrate = [
d for d in all_dashboards if re.search(regex_str, d["dashboard_title"], re.IGNORECASE)
]
if not dashboards_to_migrate:
logger.warning("[MigrationPlugin][State] No dashboards found matching the regex.")
return
db_config_replacement = None
if replace_db_config:
if from_db_id is None or to_db_id is None:
raise ValueError("Source and target database IDs are required when replacing database configuration.")
from_db = from_c.get_database(int(from_db_id))
to_db = to_c.get_database(int(to_db_id))
old_result = from_db.get("result", {})
new_result = to_db.get("result", {})
db_config_replacement = {
"old": {"database_name": old_result.get("database_name"), "uuid": old_result.get("uuid"), "id": str(from_db.get("id"))},
"new": {"database_name": new_result.get("database_name"), "uuid": new_result.get("uuid"), "id": str(to_db.get("id"))}
}
for dash in dashboards_to_migrate:
dash_id, dash_slug, title = dash["id"], dash.get("slug"), dash["dashboard_title"]
try:
exported_content, _ = from_c.export_dashboard(dash_id)
with create_temp_file(content=exported_content, dry_run=True, suffix=".zip", logger=logger) as tmp_zip_path:
if not db_config_replacement:
to_c.import_dashboard(file_name=tmp_zip_path, dash_id=dash_id, dash_slug=dash_slug)
else:
with create_temp_file(suffix=".dir", logger=logger) as tmp_unpack_dir:
with zipfile.ZipFile(tmp_zip_path, "r") as zip_ref:
zip_ref.extractall(tmp_unpack_dir)
update_yamls(db_configs=[db_config_replacement], path=str(tmp_unpack_dir))
with create_temp_file(suffix=".zip", dry_run=True, logger=logger) as tmp_new_zip:
create_dashboard_export(zip_path=tmp_new_zip, source_paths=[str(p) for p in Path(tmp_unpack_dir).glob("**/*")])
to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug)
logger.info(f"[MigrationPlugin][Success] Dashboard {title} imported.")
except Exception as exc:
logger.error(f"[MigrationPlugin][Failure] Failed to migrate dashboard {title}: {exc}", exc_info=True)
logger.info("[MigrationPlugin][Exit] Migration finished.")
except Exception as e:
logger.critical(f"[MigrationPlugin][Failure] Fatal error during migration: {e}", exc_info=True)
raise e
# [/DEF:MigrationPlugin]
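A hypothetical parameter payload matching the schema above (environment names and database IDs are placeholders); the API layer validates it against `get_schema()` and the TaskManager then passes it to `execute()`:
```python
migration_params = {
    "from_env": "dev",
    "to_env": "prod",
    "dashboard_regex": r"^Sales.*",   # migrate every dashboard whose title starts with "Sales"
    "replace_db_config": True,
    "from_db_id": 3,                  # database whose references should be rewritten
    "to_db_id": 7,                    # database the references should point to after import
}
```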

326
backup_script.py Normal file → Executable file
View File

@@ -1,163 +1,163 @@
# [DEF:backup_script:Module]
#
# @SEMANTICS: backup, superset, automation, dashboard
# @PURPOSE: This module handles automated backup of Superset dashboards.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: BackupConfig, backup_dashboards, main
# [SECTION: IMPORTS]
import logging
import sys
from pathlib import Path
from dataclasses import dataclass, field
from requests.exceptions import RequestException
from superset_tool.client import SupersetClient
from superset_tool.exceptions import SupersetAPIError
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.fileio import (
save_and_unpack_dashboard,
archive_exports,
sanitize_filename,
consolidate_archive_folders,
remove_empty_directories,
RetentionPolicy
)
from superset_tool.utils.init_clients import setup_clients
# [/SECTION]
# [DEF:BackupConfig:DataClass]
# @PURPOSE: Stores the configuration for the backup process.
@dataclass
class BackupConfig:
"""Конфигурация для процесса бэкапа."""
consolidate: bool = True
rotate_archive: bool = True
clean_folders: bool = True
retention_policy: RetentionPolicy = field(default_factory=RetentionPolicy)
# [/DEF:BackupConfig]
# [DEF:backup_dashboards:Function]
# @PURPOSE: Backs up all available dashboards for the given client and environment, skipping export errors.
# @PRE: `client` must be an initialized `SupersetClient` instance.
# @PRE: `env_name` must be a string identifying the environment.
# @PRE: `backup_root` must be a valid path to the backup root directory.
# @POST: Dashboards are exported and saved. Export errors are logged and do not stop the script.
# @RELATION: CALLS -> client.get_dashboards
# @RELATION: CALLS -> client.export_dashboard
# @RELATION: CALLS -> save_and_unpack_dashboard
# @RELATION: CALLS -> archive_exports
# @RELATION: CALLS -> consolidate_archive_folders
# @RELATION: CALLS -> remove_empty_directories
# @PARAM: client (SupersetClient) - Client for accessing the Superset API.
# @PARAM: env_name (str) - Environment name (e.g., 'PROD').
# @PARAM: backup_root (Path) - Root directory for storing backups.
# @PARAM: logger (SupersetLogger) - Logger instance.
# @PARAM: config (BackupConfig) - Backup process configuration.
# @RETURN: bool - `True` if all dashboards were exported without critical errors, `False` otherwise.
def backup_dashboards(
client: SupersetClient,
env_name: str,
backup_root: Path,
logger: SupersetLogger,
config: BackupConfig
) -> bool:
logger.info(f"[backup_dashboards][Entry] Starting backup for {env_name}.")
try:
dashboard_count, dashboard_meta = client.get_dashboards()
logger.info(f"[backup_dashboards][Progress] Found {dashboard_count} dashboards to export in {env_name}.")
if dashboard_count == 0:
return True
success_count = 0
for db in dashboard_meta:
dashboard_id = db.get('id')
dashboard_title = db.get('dashboard_title', 'Unknown Dashboard')
if not dashboard_id:
continue
try:
dashboard_base_dir_name = sanitize_filename(f"{dashboard_title}")
dashboard_dir = backup_root / env_name / dashboard_base_dir_name
dashboard_dir.mkdir(parents=True, exist_ok=True)
zip_content, filename = client.export_dashboard(dashboard_id)
save_and_unpack_dashboard(
zip_content=zip_content,
original_filename=filename,
output_dir=dashboard_dir,
unpack=False,
logger=logger
)
if config.rotate_archive:
archive_exports(str(dashboard_dir), policy=config.retention_policy, logger=logger)
success_count += 1
except (SupersetAPIError, RequestException, IOError, OSError) as db_error:
logger.error(f"[backup_dashboards][Failure] Failed to export dashboard {dashboard_title} (ID: {dashboard_id}): {db_error}", exc_info=True)
continue
if config.consolidate:
consolidate_archive_folders(backup_root / env_name, logger=logger)
if config.clean_folders:
remove_empty_directories(str(backup_root / env_name), logger=logger)
logger.info(f"[backup_dashboards][CoherenceCheck:Passed] Backup logic completed.")
return success_count == dashboard_count
except (RequestException, IOError) as e:
logger.critical(f"[backup_dashboards][Failure] Fatal error during backup for {env_name}: {e}", exc_info=True)
return False
# [/DEF:backup_dashboards]
# [DEF:main:Function]
# @PURPOSE: Main entry point for running the backup process.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> backup_dashboards
# @RETURN: int - Exit code (0 - success, 1 - error).
def main() -> int:
log_dir = Path("P:\\Superset\\010 Бекапы\\Logs")
logger = SupersetLogger(log_dir=log_dir, level=logging.INFO, console=True)
logger.info("[main][Entry] Starting Superset backup process.")
exit_code = 0
try:
clients = setup_clients(logger)
superset_backup_repo = Path("P:\\Superset\\010 Бекапы")
superset_backup_repo.mkdir(parents=True, exist_ok=True)
results = {}
environments = ['dev', 'sbx', 'prod', 'preprod']
backup_config = BackupConfig(rotate_archive=True)
for env in environments:
try:
results[env] = backup_dashboards(
clients[env],
env.upper(),
superset_backup_repo,
logger=logger,
config=backup_config
)
except Exception as env_error:
logger.critical(f"[main][Failure] Critical error for environment {env}: {env_error}", exc_info=True)
results[env] = False
if not all(results.values()):
exit_code = 1
except (RequestException, IOError) as e:
logger.critical(f"[main][Failure] Fatal error in main execution: {e}", exc_info=True)
exit_code = 1
logger.info("[main][Exit] Superset backup process finished.")
return exit_code
# [/DEF:main]
if __name__ == "__main__":
sys.exit(main())
# [/DEF:backup_script]
# [DEF:backup_script:Module]
#
# @SEMANTICS: backup, superset, automation, dashboard
# @PURPOSE: This module handles automated backup of Superset dashboards.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: BackupConfig, backup_dashboards, main
# [SECTION: IMPORTS]
import logging
import sys
from pathlib import Path
from dataclasses import dataclass, field
from requests.exceptions import RequestException
from superset_tool.client import SupersetClient
from superset_tool.exceptions import SupersetAPIError
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.fileio import (
save_and_unpack_dashboard,
archive_exports,
sanitize_filename,
consolidate_archive_folders,
remove_empty_directories,
RetentionPolicy
)
from superset_tool.utils.init_clients import setup_clients
# [/SECTION]
# [DEF:BackupConfig:DataClass]
# @PURPOSE: Stores the configuration for the backup process.
@dataclass
class BackupConfig:
"""Конфигурация для процесса бэкапа."""
consolidate: bool = True
rotate_archive: bool = True
clean_folders: bool = True
retention_policy: RetentionPolicy = field(default_factory=RetentionPolicy)
# [/DEF:BackupConfig]
# [DEF:backup_dashboards:Function]
# @PURPOSE: Backs up all available dashboards for the given client and environment, skipping export errors.
# @PRE: `client` must be an initialized `SupersetClient` instance.
# @PRE: `env_name` must be a string identifying the environment.
# @PRE: `backup_root` must be a valid path to the backup root directory.
# @POST: Dashboards are exported and saved. Export errors are logged and do not stop the script.
# @RELATION: CALLS -> client.get_dashboards
# @RELATION: CALLS -> client.export_dashboard
# @RELATION: CALLS -> save_and_unpack_dashboard
# @RELATION: CALLS -> archive_exports
# @RELATION: CALLS -> consolidate_archive_folders
# @RELATION: CALLS -> remove_empty_directories
# @PARAM: client (SupersetClient) - Client for accessing the Superset API.
# @PARAM: env_name (str) - Environment name (e.g., 'PROD').
# @PARAM: backup_root (Path) - Root directory for storing backups.
# @PARAM: logger (SupersetLogger) - Logger instance.
# @PARAM: config (BackupConfig) - Backup process configuration.
# @RETURN: bool - `True` if all dashboards were exported without critical errors, `False` otherwise.
def backup_dashboards(
client: SupersetClient,
env_name: str,
backup_root: Path,
logger: SupersetLogger,
config: BackupConfig
) -> bool:
logger.info(f"[backup_dashboards][Entry] Starting backup for {env_name}.")
try:
dashboard_count, dashboard_meta = client.get_dashboards()
logger.info(f"[backup_dashboards][Progress] Found {dashboard_count} dashboards to export in {env_name}.")
if dashboard_count == 0:
return True
success_count = 0
for db in dashboard_meta:
dashboard_id = db.get('id')
dashboard_title = db.get('dashboard_title', 'Unknown Dashboard')
if not dashboard_id:
continue
try:
dashboard_base_dir_name = sanitize_filename(f"{dashboard_title}")
dashboard_dir = backup_root / env_name / dashboard_base_dir_name
dashboard_dir.mkdir(parents=True, exist_ok=True)
zip_content, filename = client.export_dashboard(dashboard_id)
save_and_unpack_dashboard(
zip_content=zip_content,
original_filename=filename,
output_dir=dashboard_dir,
unpack=False,
logger=logger
)
if config.rotate_archive:
archive_exports(str(dashboard_dir), policy=config.retention_policy, logger=logger)
success_count += 1
except (SupersetAPIError, RequestException, IOError, OSError) as db_error:
logger.error(f"[backup_dashboards][Failure] Failed to export dashboard {dashboard_title} (ID: {dashboard_id}): {db_error}", exc_info=True)
continue
if config.consolidate:
consolidate_archive_folders(backup_root / env_name, logger=logger)
if config.clean_folders:
remove_empty_directories(str(backup_root / env_name), logger=logger)
logger.info(f"[backup_dashboards][CoherenceCheck:Passed] Backup logic completed.")
return success_count == dashboard_count
except (RequestException, IOError) as e:
logger.critical(f"[backup_dashboards][Failure] Fatal error during backup for {env_name}: {e}", exc_info=True)
return False
# [/DEF:backup_dashboards]
# [DEF:main:Function]
# @PURPOSE: Main entry point for running the backup process.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> backup_dashboards
# @RETURN: int - Exit code (0 - success, 1 - error).
def main() -> int:
log_dir = Path("P:\\Superset\\010 Бекапы\\Logs")
logger = SupersetLogger(log_dir=log_dir, level=logging.INFO, console=True)
logger.info("[main][Entry] Starting Superset backup process.")
exit_code = 0
try:
clients = setup_clients(logger)
superset_backup_repo = Path("P:\\Superset\\010 Бекапы")
superset_backup_repo.mkdir(parents=True, exist_ok=True)
results = {}
environments = ['dev', 'sbx', 'prod', 'preprod']
backup_config = BackupConfig(rotate_archive=True)
for env in environments:
try:
results[env] = backup_dashboards(
clients[env],
env.upper(),
superset_backup_repo,
logger=logger,
config=backup_config
)
except Exception as env_error:
logger.critical(f"[main][Failure] Critical error for environment {env}: {env_error}", exc_info=True)
results[env] = False
if not all(results.values()):
exit_code = 1
except (RequestException, IOError) as e:
logger.critical(f"[main][Failure] Fatal error in main execution: {e}", exc_info=True)
exit_code = 1
logger.info("[main][Exit] Superset backup process finished.")
return exit_code
# [/DEF:main]
if __name__ == "__main__":
sys.exit(main())
# [/DEF:backup_script]
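The script can also be reused as a library; a minimal sketch for a single environment (the paths and the environment key are placeholders):
```python
import logging
from pathlib import Path

from backup_script import BackupConfig, backup_dashboards
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger

logger = SupersetLogger(log_dir=Path("./logs"), level=logging.INFO, console=True)
clients = setup_clients(logger)

# Rotate archives but keep empty folders for inspection.
config = BackupConfig(rotate_archive=True, clean_folders=False)
ok = backup_dashboards(clients["dev"], "DEV", Path("./backups"), logger=logger, config=config)
print("backup complete" if ok else "backup finished with errors")
```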

158
debug_db_api.py Normal file → Executable file
View File

@@ -1,79 +1,79 @@
# [DEF:debug_db_api:Module]
#
# @SEMANTICS: debug, api, database, script
# @PURPOSE: Script for debugging the structure of the database API response.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: debug_database_api
# [SECTION: IMPORTS]
import json
import logging
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:debug_database_api:Function]
# @PURPOSE: Debugs the structure of the database API response.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> client.get_databases
def debug_database_api():
logger = SupersetLogger(name="debug_db_api", level=logging.DEBUG)
# Initialize the clients
clients = setup_clients(logger)
# Log JWT bearer tokens for each client
for env_name, client in clients.items():
try:
# Ensure authentication (access token fetched via headers property)
_ = client.headers
token = client.network._tokens.get("access_token")
logger.info(f"[debug_database_api][Token] Bearer token for {env_name}: {token}")
except Exception as exc:
logger.error(f"[debug_database_api][Token] Failed to retrieve token for {env_name}: {exc}", exc_info=True)
# Check the available environments
print("Доступные окружения:")
for env_name, client in clients.items():
print(f" {env_name}: {client.config.base_url}")
# Select two environments for testing
if len(clients) < 2:
print("Недостаточно окружений для тестирования")
return
env_names = list(clients.keys())[:2]
from_env, to_env = env_names[0], env_names[1]
from_client = clients[from_env]
to_client = clients[to_env]
print(f"\nТестируем API для окружений: {from_env} -> {to_env}")
try:
# Fetch the list of databases from the first environment
print(f"\nПолучаем список БД из {from_env}:")
count, dbs = from_client.get_databases()
print(f"Найдено {count} баз данных")
print("Полный ответ API:")
print(json.dumps({"count": count, "result": dbs}, indent=2, ensure_ascii=False))
# Fetch the list of databases from the second environment
print(f"\nПолучаем список БД из {to_env}:")
count, dbs = to_client.get_databases()
print(f"Найдено {count} баз данных")
print("Полный ответ API:")
print(json.dumps({"count": count, "result": dbs}, indent=2, ensure_ascii=False))
except Exception as e:
print(f"Ошибка при тестировании API: {e}")
import traceback
traceback.print_exc()
# [/DEF:debug_database_api]
if __name__ == "__main__":
debug_database_api()
# [/DEF:debug_db_api]
# [DEF:debug_db_api:Module]
#
# @SEMANTICS: debug, api, database, script
# @PURPOSE: Script for debugging the structure of the database API response.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: debug_database_api
# [SECTION: IMPORTS]
import json
import logging
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:debug_database_api:Function]
# @PURPOSE: Debugs the structure of the database API response.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> client.get_databases
def debug_database_api():
logger = SupersetLogger(name="debug_db_api", level=logging.DEBUG)
# Initialize the clients
clients = setup_clients(logger)
# Log JWT bearer tokens for each client
for env_name, client in clients.items():
try:
# Ensure authentication (access token fetched via headers property)
_ = client.headers
token = client.network._tokens.get("access_token")
logger.info(f"[debug_database_api][Token] Bearer token for {env_name}: {token}")
except Exception as exc:
logger.error(f"[debug_database_api][Token] Failed to retrieve token for {env_name}: {exc}", exc_info=True)
# Check the available environments
print("Доступные окружения:")
for env_name, client in clients.items():
print(f" {env_name}: {client.config.base_url}")
# Select two environments for testing
if len(clients) < 2:
print("Недостаточно окружений для тестирования")
return
env_names = list(clients.keys())[:2]
from_env, to_env = env_names[0], env_names[1]
from_client = clients[from_env]
to_client = clients[to_env]
print(f"\nТестируем API для окружений: {from_env} -> {to_env}")
try:
# Fetch the list of databases from the first environment
print(f"\nПолучаем список БД из {from_env}:")
count, dbs = from_client.get_databases()
print(f"Найдено {count} баз данных")
print("Полный ответ API:")
print(json.dumps({"count": count, "result": dbs}, indent=2, ensure_ascii=False))
# Fetch the list of databases from the second environment
print(f"\nПолучаем список БД из {to_env}:")
count, dbs = to_client.get_databases()
print(f"Найдено {count} баз данных")
print("Полный ответ API:")
print(json.dumps({"count": count, "result": dbs}, indent=2, ensure_ascii=False))
except Exception as e:
print(f"Ошибка при тестировании API: {e}")
import traceback
traceback.print_exc()
# [/DEF:debug_database_api]
if __name__ == "__main__":
debug_database_api()
# [/DEF:debug_db_api]

172
docs/plugin_dev.md Normal file → Executable file
View File

@@ -1,87 +1,87 @@
# Plugin Development Guide
This guide explains how to create new plugins for the Superset Tools application.
## 1. Plugin Structure
A plugin is a single Python file located in the `backend/src/plugins/` directory. Each plugin file must contain a class that inherits from `PluginBase`.
## 2. Implementing `PluginBase`
The `PluginBase` class is an abstract base class that defines the interface for all plugins. You must implement the following properties and methods:
- **`id`**: A unique string identifier for your plugin (e.g., `"my-cool-plugin"`).
- **`name`**: A human-readable name for your plugin (e.g., `"My Cool Plugin"`).
- **`description`**: A brief description of what your plugin does.
- **`version`**: The version of your plugin (e.g., `"1.0.0"`).
- **`get_schema()`**: A method that returns a JSON schema dictionary defining the input parameters for your plugin. This schema is used to automatically generate a form in the frontend.
- **`execute(params: Dict[str, Any])`**: An `async` method that contains the main logic of your plugin. The `params` argument is a dictionary containing the input data from the user, validated against the schema you defined.
## 3. Example Plugin
Here is an example of a simple "Hello World" plugin:
```python
# backend/src/plugins/hello.py
# [DEF:HelloWorldPlugin:Plugin]
# @SEMANTICS: hello, world, example, plugin
# @PURPOSE: A simple "Hello World" plugin example.
# @LAYER: Domain (Plugin)
# @RELATION: Inherits from PluginBase
# @PUBLIC_API: execute
from typing import Dict, Any
from ..core.plugin_base import PluginBase
class HelloWorldPlugin(PluginBase):
@property
def id(self) -> str:
return "hello-world"
@property
def name(self) -> str:
return "Hello World"
@property
def description(self) -> str:
return "A simple plugin that prints a greeting."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"name": {
"type": "string",
"title": "Name",
"description": "The name to greet.",
"default": "World",
}
},
"required": ["name"],
}
async def execute(self, params: Dict[str, Any]):
name = params["name"]
print(f"Hello, {name}!")
```
## 4. Logging
You can use the global logger instance to log messages from your plugin. The logger is available in the `superset_tool.utils.logger` module.
```python
from superset_tool.utils.logger import SupersetLogger
logger = SupersetLogger()
async def execute(self, params: Dict[str, Any]):
logger.info("My plugin is running!")
```
## 5. Testing
# Plugin Development Guide
This guide explains how to create new plugins for the Superset Tools application.
## 1. Plugin Structure
A plugin is a single Python file located in the `backend/src/plugins/` directory. Each plugin file must contain a class that inherits from `PluginBase`.
## 2. Implementing `PluginBase`
The `PluginBase` class is an abstract base class that defines the interface for all plugins. You must implement the following properties and methods:
- **`id`**: A unique string identifier for your plugin (e.g., `"my-cool-plugin"`).
- **`name`**: A human-readable name for your plugin (e.g., `"My Cool Plugin"`).
- **`description`**: A brief description of what your plugin does.
- **`version`**: The version of your plugin (e.g., `"1.0.0"`).
- **`get_schema()`**: A method that returns a JSON schema dictionary defining the input parameters for your plugin. This schema is used to automatically generate a form in the frontend.
- **`execute(params: Dict[str, Any])`**: An `async` method that contains the main logic of your plugin. The `params` argument is a dictionary containing the input data from the user, validated against the schema you defined.
## 3. Example Plugin
Here is an example of a simple "Hello World" plugin:
```python
# backend/src/plugins/hello.py
# [DEF:HelloWorldPlugin:Plugin]
# @SEMANTICS: hello, world, example, plugin
# @PURPOSE: A simple "Hello World" plugin example.
# @LAYER: Domain (Plugin)
# @RELATION: Inherits from PluginBase
# @PUBLIC_API: execute
from typing import Dict, Any
from ..core.plugin_base import PluginBase
class HelloWorldPlugin(PluginBase):
@property
def id(self) -> str:
return "hello-world"
@property
def name(self) -> str:
return "Hello World"
@property
def description(self) -> str:
return "A simple plugin that prints a greeting."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"name": {
"type": "string",
"title": "Name",
"description": "The name to greet.",
"default": "World",
}
},
"required": ["name"],
}
async def execute(self, params: Dict[str, Any]):
name = params["name"]
print(f"Hello, {name}!")
```
## 4. Logging
You can use the global logger instance to log messages from your plugin. The logger is available in the `superset_tool.utils.logger` module.
```python
from superset_tool.utils.logger import SupersetLogger
logger = SupersetLogger()
async def execute(self, params: Dict[str, Any]):
logger.info("My plugin is running!")
```
## 5. Testing
To test your plugin, simply run the application and navigate to the web UI. Your plugin should appear in the list of available tools.
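For automated checks, the plugin can also be exercised directly in a unit test. The sketch below assumes `pytest` with `pytest-asyncio` and an importable `backend.src` package; none of these are confirmed as part of the project's tooling.
```python
# tests/test_hello_plugin.py (illustrative)
import pytest
from backend.src.plugins.hello import HelloWorldPlugin

def test_schema_requires_name():
    assert HelloWorldPlugin().get_schema()["required"] == ["name"]

@pytest.mark.asyncio
async def test_execute_runs_with_valid_params():
    await HelloWorldPlugin().execute({"name": "Superset"})
```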

46
docs/settings.md Normal file
View File

@@ -0,0 +1,46 @@
# Web Application Settings Mechanism
This document describes the settings management system for the Superset Tools application.
## Overview
The settings mechanism lets users configure multiple Superset environments and global application settings (such as the backup storage path) via the web UI.
## Backend Architecture
### Data Models
Configuration is structured using Pydantic models in `backend/src/core/config_models.py`:
- `Environment`: Represents a Superset instance (URL, credentials).
- `GlobalSettings`: Global application parameters (e.g., `backup_path`).
- `AppConfig`: The root configuration object.
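A rough sketch of these models is shown below. The field names are inferred from the settings UI and are not guaranteed to match `config_models.py` exactly.
```python
# Illustrative only: field names are inferred from the Settings page;
# the authoritative definitions live in backend/src/core/config_models.py.
from typing import List, Optional
from pydantic import BaseModel


class Environment(BaseModel):
    id: str
    name: str
    url: str
    username: str
    password: str
    is_default: bool = False


class GlobalSettings(BaseModel):
    backup_path: str = ""
    default_environment_id: Optional[str] = None


class AppConfig(BaseModel):
    environments: List[Environment] = []
    settings: GlobalSettings = GlobalSettings()
```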
### Configuration Manager
The `ConfigManager` (`backend/src/core/config_manager.py`) handles:
- Persistence to `config.json`.
- CRUD operations for environments.
- Validation and logging.
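A hypothetical usage sketch follows; the method names here are assumptions, not the documented API, so check `config_manager.py` for the real interface.
```python
# Hypothetical sketch: method names are assumptions, not the documented API.
from backend.src.core.config_manager import ConfigManager

manager = ConfigManager()                       # loads or creates config.json
config = manager.get_config()                   # returns an AppConfig instance
config.settings.backup_path = "/srv/superset/backups"
manager.save()                                  # persists changes to config.json
```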
### API Endpoints
The settings API is available at `/settings`:
- `GET /settings`: Retrieve all settings (passwords are masked).
- `PATCH /settings/global`: Update global settings.
- `GET /settings/environments`: List environments.
- `POST /settings/environments`: Add environment.
- `PUT /settings/environments/{id}`: Update environment.
- `DELETE /settings/environments/{id}`: Remove environment.
- `POST /settings/environments/{id}/test`: Test connection.
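These endpoints can be exercised from any HTTP client. The snippet below uses Python's `requests` package (not part of this project) against a locally running backend on port 8000; all values are illustrative.
```python
# Assumes the backend is running on http://localhost:8000 and `requests`
# is installed; IDs, URLs, and credentials are illustrative.
import requests

BASE = "http://localhost:8000"

# Read all settings (passwords come back masked).
print(requests.get(f"{BASE}/settings").json())

# Register a new environment.
requests.post(f"{BASE}/settings/environments", json={
    "id": "dev",
    "name": "Development",
    "url": "https://superset-dev.example.com",
    "username": "admin",
    "password": "secret",
    "is_default": True,
})

# Update the global backup path and test the new environment.
requests.patch(f"{BASE}/settings/global", json={"backup_path": "/srv/backups"})
print(requests.post(f"{BASE}/settings/environments/dev/test").json())
```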
## Frontend Implementation
The settings page is located at `frontend/src/pages/Settings.svelte`. It provides forms for managing global settings and Superset environments.
## Integration
Existing plugins and utilities use the `ConfigManager` to fetch configuration:
- `superset_tool/utils/init_clients.py`: Dynamically initializes Superset clients from the configured environments.
- `BackupPlugin`: Uses the configured `backup_path` as the default storage location.

0
frontend/.vscode/extensions.json vendored Normal file → Executable file
View File

0
frontend/README.md Normal file → Executable file
View File

0
frontend/index.html Normal file → Executable file
View File

0
frontend/jsconfig.json Normal file → Executable file
View File

9
frontend/package-lock.json generated Normal file → Executable file
View File

@@ -883,6 +883,7 @@
"integrity": "sha512-YZs/OSKOQAQCnJvM/P+F1URotNnYNeU3P2s4oIpzm1uFaqUEqRxUB0g5ejMjEb5Gjb9/PiBI5Ktrq4rUUF8UVQ==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"@sveltejs/vite-plugin-svelte-inspector": "^5.0.0",
"debug": "^4.4.1",
@@ -929,6 +930,7 @@
"integrity": "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg==",
"dev": true,
"license": "MIT",
"peer": true,
"bin": {
"acorn": "bin/acorn"
},
@@ -1077,6 +1079,7 @@
}
],
"license": "MIT",
"peer": true,
"dependencies": {
"baseline-browser-mapping": "^2.9.0",
"caniuse-lite": "^1.0.30001759",
@@ -1514,6 +1517,7 @@
"integrity": "sha512-/imKNG4EbWNrVjoNC/1H5/9GFy+tqjGBHCaSsN+P2RnPqjsLmv6UD3Ej+Kj8nBWaRAwyk7kK5ZUc+OEatnTR3A==",
"dev": true,
"license": "MIT",
"peer": true,
"bin": {
"jiti": "bin/jiti.js"
}
@@ -1721,6 +1725,7 @@
}
],
"license": "MIT",
"peer": true,
"dependencies": {
"nanoid": "^3.3.11",
"picocolors": "^1.1.1",
@@ -2058,6 +2063,7 @@
"integrity": "sha512-ZhLtvroYxUxr+HQJfMZEDRsGsmU46x12RvAv/zi9584f5KOX7bUrEbhPJ7cKFmUvZTJXi/CFZUYwDC6M1FigPw==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"@jridgewell/remapping": "^2.3.4",
"@jridgewell/sourcemap-codec": "^1.5.0",
@@ -2181,6 +2187,7 @@
"integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
"dev": true,
"license": "MIT",
"peer": true,
"engines": {
"node": ">=12"
},
@@ -2252,6 +2259,7 @@
"integrity": "sha512-dZwN5L1VlUBewiP6H9s2+B3e3Jg96D0vzN+Ry73sOefebhYr9f94wwkMNN/9ouoU8pV1BqA1d1zGk8928cx0rg==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"esbuild": "^0.27.0",
"fdir": "^6.5.0",
@@ -2345,6 +2353,7 @@
"integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
"dev": true,
"license": "MIT",
"peer": true,
"engines": {
"node": ">=12"
},

0
frontend/package.json Normal file → Executable file
View File

10
frontend/postcss.config.js Normal file → Executable file
View File

@@ -1,6 +1,6 @@
export default {
plugins: {
tailwindcss: {},
autoprefixer: {},
},
export default {
plugins: {
tailwindcss: {},
autoprefixer: {},
},
};

0
frontend/public/vite.svg Normal file → Executable file
View File

(SVG image unchanged: 1.5 KiB before and after)

78
frontend/src/App.svelte Normal file → Executable file
View File

@@ -1,28 +1,91 @@
<!--
[DEF:App:Component]
@SEMANTICS: main, entrypoint, layout, navigation
@PURPOSE: The root component of the frontend application. Manages navigation and layout.
@LAYER: UI
@RELATION: DEPENDS_ON -> frontend/src/pages/Dashboard.svelte
@RELATION: DEPENDS_ON -> frontend/src/pages/Settings.svelte
@RELATION: DEPENDS_ON -> frontend/src/lib/stores.js
@PROPS: None
@EVENTS: None
@INVARIANT: Navigation state must be persisted in the currentPage store.
-->
<script>
import Dashboard from './pages/Dashboard.svelte';
import { selectedPlugin, selectedTask } from './lib/stores.js';
import Settings from './pages/Settings.svelte';
import { selectedPlugin, selectedTask, currentPage } from './lib/stores.js';
import TaskRunner from './components/TaskRunner.svelte';
import DynamicForm from './components/DynamicForm.svelte';
import { api } from './lib/api.js';
import Toast from './components/Toast.svelte';
// [DEF:handleFormSubmit:Function]
// @PURPOSE: Handles form submission for task creation.
// @PARAM: event (CustomEvent) - The submit event from DynamicForm.
async function handleFormSubmit(event) {
console.log("[App.handleFormSubmit][Action] Handling form submission for task creation.");
const params = event.detail;
const task = await api.createTask($selectedPlugin.id, params);
selectedTask.set(task);
selectedPlugin.set(null);
try {
const task = await api.createTask($selectedPlugin.id, params);
selectedTask.set(task);
selectedPlugin.set(null);
console.log(`[App.handleFormSubmit][Coherence:OK] Task created context={{'id': '${task.id}'}}`);
} catch (error) {
console.error(`[App.handleFormSubmit][Coherence:Failed] Task creation failed context={{'error': '${error}'}}`);
}
}
// [/DEF:handleFormSubmit]
// [DEF:navigate:Function]
// @PURPOSE: Changes the current page and resets state.
// @PARAM: page (string) - Target page name.
function navigate(page) {
console.log(`[App.navigate][Action] Navigating to ${page}.`);
// Reset selection first
if (page !== $currentPage) {
selectedPlugin.set(null);
selectedTask.set(null);
}
// Then set page
currentPage.set(page);
}
// [/DEF:navigate]
</script>
<Toast />
<main class="bg-gray-50 min-h-screen">
<header class="bg-white shadow-md p-4">
<h1 class="text-3xl font-bold text-gray-800">Superset Tools</h1>
<header class="bg-white shadow-md p-4 flex justify-between items-center">
<button
type="button"
class="text-3xl font-bold text-gray-800 focus:outline-none"
on:click={() => navigate('dashboard')}
>
Superset Tools
</button>
<nav class="space-x-4">
<button
type="button"
on:click={() => navigate('dashboard')}
class="text-gray-600 hover:text-blue-600 font-medium {$currentPage === 'dashboard' ? 'text-blue-600 border-b-2 border-blue-600' : ''}"
>
Dashboard
</button>
<button
type="button"
on:click={() => navigate('settings')}
class="text-gray-600 hover:text-blue-600 font-medium {$currentPage === 'settings' ? 'text-blue-600 border-b-2 border-blue-600' : ''}"
>
Settings
</button>
</nav>
</header>
<div class="p-4">
{#if $selectedTask}
{#if $currentPage === 'settings'}
<Settings />
{:else if $selectedTask}
<TaskRunner />
<button on:click={() => selectedTask.set(null)} class="mt-4 bg-blue-500 text-white p-2 rounded">
Back to Task List
@@ -38,3 +101,4 @@
{/if}
</div>
</main>
<!-- [/DEF:App] -->

0
frontend/src/app.css Normal file → Executable file
View File

0
frontend/src/assets/svelte.svg Normal file → Executable file
View File

(SVG image unchanged: 1.9 KiB before and after)

135
frontend/src/components/DynamicForm.svelte Normal file → Executable file
View File

@@ -1,56 +1,79 @@
<script>
import { createEventDispatcher } from 'svelte';
export let schema;
let formData = {};
const dispatch = createEventDispatcher();
function handleSubmit() {
dispatch('submit', formData);
}
// Initialize form data with default values from the schema
if (schema && schema.properties) {
for (const key in schema.properties) {
formData[key] = schema.properties[key].default || '';
}
}
</script>
<form on:submit|preventDefault={handleSubmit} class="space-y-4">
{#if schema && schema.properties}
{#each Object.entries(schema.properties) as [key, prop]}
<div class="flex flex-col">
<label for={key} class="mb-1 font-semibold text-gray-700">{prop.title || key}</label>
{#if prop.type === 'string'}
<input
type="text"
id={key}
bind:value={formData[key]}
placeholder={prop.description || ''}
class="p-2 border rounded-md"
/>
{:else if prop.type === 'number' || prop.type === 'integer'}
<input
type="number"
id={key}
bind:value={formData[key]}
placeholder={prop.description || ''}
class="p-2 border rounded-md"
/>
{:else if prop.type === 'boolean'}
<input
type="checkbox"
id={key}
bind:checked={formData[key]}
class="h-5 w-5"
/>
{/if}
</div>
{/each}
<button type="submit" class="w-full bg-green-500 text-white p-2 rounded-md hover:bg-green-600">
Run Task
</button>
{/if}
</form>
<!--
[DEF:DynamicForm:Component]
@SEMANTICS: form, schema, dynamic, json-schema
@PURPOSE: Generates a form dynamically based on a JSON schema.
@LAYER: UI
@RELATION: DEPENDS_ON -> svelte:createEventDispatcher
@PROPS:
- schema: Object - JSON schema for the form.
@EVENTS:
- submit: Object - Dispatched when the form is submitted, containing the form data.
-->
<script>
import { createEventDispatcher } from 'svelte';
export let schema;
let formData = {};
const dispatch = createEventDispatcher();
// [DEF:handleSubmit:Function]
// @PURPOSE: Dispatches the submit event with the form data.
function handleSubmit() {
console.log("[DynamicForm][Action] Submitting form data.", { formData });
dispatch('submit', formData);
}
// [/DEF:handleSubmit]
// [DEF:initializeForm:Function]
// @PURPOSE: Initialize form data with default values from the schema.
function initializeForm() {
if (schema && schema.properties) {
for (const key in schema.properties) {
formData[key] = schema.properties[key].default || '';
}
}
}
// [/DEF:initializeForm]
initializeForm();
</script>
<form on:submit|preventDefault={handleSubmit} class="space-y-4">
{#if schema && schema.properties}
{#each Object.entries(schema.properties) as [key, prop]}
<div class="flex flex-col">
<label for={key} class="mb-1 font-semibold text-gray-700">{prop.title || key}</label>
{#if prop.type === 'string'}
<input
type="text"
id={key}
bind:value={formData[key]}
placeholder={prop.description || ''}
class="p-2 border rounded-md"
/>
{:else if prop.type === 'number' || prop.type === 'integer'}
<input
type="number"
id={key}
bind:value={formData[key]}
placeholder={prop.description || ''}
class="p-2 border rounded-md"
/>
{:else if prop.type === 'boolean'}
<input
type="checkbox"
id={key}
bind:checked={formData[key]}
class="h-5 w-5"
/>
{/if}
</div>
{/each}
<button type="submit" class="w-full bg-green-500 text-white p-2 rounded-md hover:bg-green-600">
Run Task
</button>
{/if}
</form>
<!-- [/DEF:DynamicForm] -->

127
frontend/src/components/TaskRunner.svelte Normal file → Executable file
View File

@@ -1,54 +1,73 @@
<script>
import { onMount, onDestroy } from 'svelte';
import { selectedTask, taskLogs } from '../lib/stores.js';
let ws;
onMount(() => {
if ($selectedTask) {
taskLogs.set([]); // Clear previous logs
const wsUrl = `ws://localhost:8000/ws/logs/${$selectedTask.id}`;
ws = new WebSocket(wsUrl);
ws.onopen = () => {
console.log('WebSocket connection established');
};
ws.onmessage = (event) => {
const logEntry = JSON.parse(event.data);
taskLogs.update(logs => [...logs, logEntry]);
};
ws.onerror = (error) => {
console.error('WebSocket error:', error);
};
ws.onclose = () => {
console.log('WebSocket connection closed');
};
}
});
onDestroy(() => {
if (ws) {
ws.close();
}
});
</script>
<div class="p-4 border rounded-lg bg-white shadow-md">
{#if $selectedTask}
<h2 class="text-xl font-semibold mb-2">Task: {$selectedTask.plugin_id}</h2>
<div class="bg-gray-900 text-white font-mono text-sm p-4 rounded-md h-96 overflow-y-auto">
{#each $taskLogs as log}
<div>
<span class="text-gray-400">{new Date(log.timestamp).toLocaleTimeString()}</span>
<span class="{log.level === 'ERROR' ? 'text-red-500' : 'text-green-400'}">[{log.level}]</span>
<span>{log.message}</span>
</div>
{/each}
</div>
{:else}
<p>No task selected.</p>
{/if}
</div>
<!--
[DEF:TaskRunner:Component]
@SEMANTICS: task, runner, logs, websocket
@PURPOSE: Connects to a WebSocket to display real-time logs for a running task.
@LAYER: UI
@RELATION: DEPENDS_ON -> frontend/src/lib/stores.js
@PROPS: None
@EVENTS: None
-->
<script>
import { onMount, onDestroy } from 'svelte';
import { selectedTask, taskLogs } from '../lib/stores.js';
let ws;
// [DEF:onMount:Function]
// @PURPOSE: Initialize WebSocket connection for task logs.
onMount(() => {
if ($selectedTask) {
console.log(`[TaskRunner][Entry] Connecting to logs for task: ${$selectedTask.id}`);
taskLogs.set([]); // Clear previous logs
const wsUrl = `ws://localhost:8000/ws/logs/${$selectedTask.id}`;
ws = new WebSocket(wsUrl);
ws.onopen = () => {
console.log('[TaskRunner][Coherence:OK] WebSocket connection established');
};
ws.onmessage = (event) => {
const logEntry = JSON.parse(event.data);
taskLogs.update(logs => [...logs, logEntry]);
};
ws.onerror = (error) => {
console.error('[TaskRunner][Coherence:Failed] WebSocket error:', error);
};
ws.onclose = () => {
console.log('[TaskRunner][Exit] WebSocket connection closed');
};
}
});
// [/DEF:onMount]
// [DEF:onDestroy:Function]
// @PURPOSE: Close WebSocket connection when the component is destroyed.
onDestroy(() => {
if (ws) {
console.log("[TaskRunner][Action] Closing WebSocket connection.");
ws.close();
}
});
// [/DEF:onDestroy]
</script>
<div class="p-4 border rounded-lg bg-white shadow-md">
{#if $selectedTask}
<h2 class="text-xl font-semibold mb-2">Task: {$selectedTask.plugin_id}</h2>
<div class="bg-gray-900 text-white font-mono text-sm p-4 rounded-md h-96 overflow-y-auto">
{#each $taskLogs as log}
<div>
<span class="text-gray-400">{new Date(log.timestamp).toLocaleTimeString()}</span>
<span class="{log.level === 'ERROR' ? 'text-red-500' : 'text-green-400'}">[{log.level}]</span>
<span>{log.message}</span>
</div>
{/each}
</div>
{:else}
<p>No task selected.</p>
{/if}
</div>
<!-- [/DEF:TaskRunner] -->

41
frontend/src/components/Toast.svelte Normal file → Executable file
View File

@@ -1,15 +1,26 @@
<script>
import { toasts } from '../lib/toasts.js';
</script>
<div class="fixed bottom-0 right-0 p-4 space-y-2">
{#each $toasts as toast (toast.id)}
<div class="p-4 rounded-md shadow-lg text-white
{toast.type === 'info' && 'bg-blue-500'}
{toast.type === 'success' && 'bg-green-500'}
{toast.type === 'error' && 'bg-red-500'}
">
{toast.message}
</div>
{/each}
</div>
<!--
[DEF:Toast:Component]
@SEMANTICS: toast, notification, feedback, ui
@PURPOSE: Displays transient notifications (toasts) in the bottom-right corner.
@LAYER: UI
@RELATION: DEPENDS_ON -> frontend/src/lib/toasts.js
@PROPS: None
@EVENTS: None
-->
<script>
import { toasts } from '../lib/toasts.js';
</script>
<div class="fixed bottom-0 right-0 p-4 space-y-2">
{#each $toasts as toast (toast.id)}
<div class="p-4 rounded-md shadow-lg text-white
{toast.type === 'info' && 'bg-blue-500'}
{toast.type === 'success' && 'bg-green-500'}
{toast.type === 'error' && 'bg-red-500'}
">
{toast.message}
</div>
{/each}
</div>
<!-- [/DEF:Toast] -->

0
frontend/src/lib/Counter.svelte Normal file → Executable file
View File

158
frontend/src/lib/api.js Normal file → Executable file
View File

@@ -1,55 +1,103 @@
import { addToast } from './toasts.js';
const API_BASE_URL = 'http://localhost:8000';
/**
* Fetches data from the API.
* @param {string} endpoint The API endpoint to fetch data from.
* @returns {Promise<any>} The JSON response from the API.
*/
async function fetchApi(endpoint) {
try {
const response = await fetch(`${API_BASE_URL}${endpoint}`);
if (!response.ok) {
throw new Error(`API request failed with status ${response.status}`);
}
return await response.json();
} catch (error) {
console.error(`Error fetching from ${endpoint}:`, error);
addToast(error.message, 'error');
throw error;
}
}
/**
* Posts data to the API.
* @param {string} endpoint The API endpoint to post data to.
* @param {object} body The data to post.
* @returns {Promise<any>} The JSON response from the API.
*/
async function postApi(endpoint, body) {
try {
const response = await fetch(`${API_BASE_URL}${endpoint}`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify(body),
});
if (!response.ok) {
throw new Error(`API request failed with status ${response.status}`);
}
return await response.json();
} catch (error) {
console.error(`Error posting to ${endpoint}:`, error);
addToast(error.message, 'error');
throw error;
}
}
export const api = {
getPlugins: () => fetchApi('/plugins'),
getTasks: () => fetchApi('/tasks'),
getTask: (taskId) => fetchApi(`/tasks/${taskId}`),
createTask: (pluginId, params) => postApi('/tasks', { plugin_id: pluginId, params }),
};
// [DEF:api_module:Module]
// @SEMANTICS: api, client, fetch, rest
// @PURPOSE: Handles all communication with the backend API.
// @LAYER: Infra-API
import { addToast } from './toasts.js';
const API_BASE_URL = 'http://localhost:8000';
// [DEF:fetchApi:Function]
// @PURPOSE: Generic GET request wrapper.
// @PARAM: endpoint (string) - API endpoint.
// @RETURN: Promise<any> - JSON response.
async function fetchApi(endpoint) {
try {
console.log(`[api.fetchApi][Action] Fetching from context={{'endpoint': '${endpoint}'}}`);
const response = await fetch(`${API_BASE_URL}${endpoint}`);
if (!response.ok) {
throw new Error(`API request failed with status ${response.status}`);
}
return await response.json();
} catch (error) {
console.error(`[api.fetchApi][Coherence:Failed] Error fetching from ${endpoint}:`, error);
addToast(error.message, 'error');
throw error;
}
}
// [/DEF:fetchApi]
// [DEF:postApi:Function]
// @PURPOSE: Generic POST request wrapper.
// @PARAM: endpoint (string) - API endpoint.
// @PARAM: body (object) - Request payload.
// @RETURN: Promise<any> - JSON response.
async function postApi(endpoint, body) {
try {
console.log(`[api.postApi][Action] Posting to context={{'endpoint': '${endpoint}'}}`);
const response = await fetch(`${API_BASE_URL}${endpoint}`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify(body),
});
if (!response.ok) {
throw new Error(`API request failed with status ${response.status}`);
}
return await response.json();
} catch (error) {
console.error(`[api.postApi][Coherence:Failed] Error posting to ${endpoint}:`, error);
addToast(error.message, 'error');
throw error;
}
}
// [/DEF:postApi]
// [DEF:api:Data]
// @PURPOSE: API client object with specific methods.
export const api = {
getPlugins: () => fetchApi('/plugins/'),
getTasks: () => fetchApi('/tasks/'),
getTask: (taskId) => fetchApi(`/tasks/${taskId}`),
createTask: (pluginId, params) => postApi('/tasks', { plugin_id: pluginId, params }),
// Settings
getSettings: () => fetchApi('/settings'),
updateGlobalSettings: (settings) => {
return fetch(`${API_BASE_URL}/settings/global`, {
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(settings)
}).then(res => res.json());
},
getEnvironments: () => fetchApi('/settings/environments'),
addEnvironment: (env) => postApi('/settings/environments', env),
updateEnvironment: (id, env) => {
return fetch(`${API_BASE_URL}/settings/environments/${id}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(env)
}).then(res => res.json());
},
deleteEnvironment: (id) => {
return fetch(`${API_BASE_URL}/settings/environments/${id}`, {
method: 'DELETE'
}).then(res => res.json());
},
testEnvironmentConnection: (id) => postApi(`/settings/environments/${id}/test`, {}),
};
// [/DEF:api_module]
// Export individual functions for easier use in components
export const getPlugins = api.getPlugins;
export const getTasks = api.getTasks;
export const getTask = api.getTask;
export const createTask = api.createTask;
export const getSettings = api.getSettings;
export const updateGlobalSettings = api.updateGlobalSettings;
export const getEnvironments = api.getEnvironments;
export const addEnvironment = api.addEnvironment;
export const updateEnvironment = api.updateEnvironment;
export const deleteEnvironment = api.deleteEnvironment;
export const testEnvironmentConnection = api.testEnvironmentConnection;

100
frontend/src/lib/stores.js Normal file → Executable file
View File

@@ -1,40 +1,60 @@
import { writable } from 'svelte/store';
import { api } from './api.js';
// Store for the list of available plugins
export const plugins = writable([]);
// Store for the list of tasks
export const tasks = writable([]);
// Store for the currently selected plugin
export const selectedPlugin = writable(null);
// Store for the currently selected task
export const selectedTask = writable(null);
// Store for the logs of the currently selected task
export const taskLogs = writable([]);
// Function to fetch plugins from the API
export async function fetchPlugins() {
try {
const data = await api.getPlugins();
console.log('Fetched plugins:', data); // Add console log
plugins.set(data);
} catch (error) {
console.error('Error fetching plugins:', error);
// Handle error appropriately in the UI
}
}
// Function to fetch tasks from the API
export async function fetchTasks() {
try {
const data = await api.getTasks();
tasks.set(data);
} catch (error) {
console.error('Error fetching tasks:', error);
// Handle error appropriately in the UI
}
}
// [DEF:stores_module:Module]
// @SEMANTICS: state, stores, svelte, plugins, tasks
// @PURPOSE: Global state management using Svelte stores.
// @LAYER: UI-State
import { writable } from 'svelte/store';
import { api } from './api.js';
// [DEF:plugins:Data]
// @PURPOSE: Store for the list of available plugins.
export const plugins = writable([]);
// [DEF:tasks:Data]
// @PURPOSE: Store for the list of tasks.
export const tasks = writable([]);
// [DEF:selectedPlugin:Data]
// @PURPOSE: Store for the currently selected plugin.
export const selectedPlugin = writable(null);
// [DEF:selectedTask:Data]
// @PURPOSE: Store for the currently selected task.
export const selectedTask = writable(null);
// [DEF:currentPage:Data]
// @PURPOSE: Store for the current page.
export const currentPage = writable('dashboard');
// [DEF:taskLogs:Data]
// @PURPOSE: Store for the logs of the currently selected task.
export const taskLogs = writable([]);
// [DEF:fetchPlugins:Function]
// @PURPOSE: Fetches plugins from the API and updates the plugins store.
export async function fetchPlugins() {
try {
console.log("[stores.fetchPlugins][Action] Fetching plugins.");
const data = await api.getPlugins();
console.log("[stores.fetchPlugins][Coherence:OK] Plugins fetched context={{'count': " + data.length + "}}");
plugins.set(data);
} catch (error) {
console.error(`[stores.fetchPlugins][Coherence:Failed] Error fetching plugins context={{'error': '${error}'}}`);
}
}
// [/DEF:fetchPlugins]
// [DEF:fetchTasks:Function]
// @PURPOSE: Fetches tasks from the API and updates the tasks store.
export async function fetchTasks() {
try {
console.log("[stores.fetchTasks][Action] Fetching tasks.");
const data = await api.getTasks();
console.log("[stores.fetchTasks][Coherence:OK] Tasks fetched context={{'count': " + data.length + "}}");
tasks.set(data);
} catch (error) {
console.error(`[stores.fetchTasks][Coherence:Failed] Error fetching tasks context={{'error': '${error}'}}`);
}
}
// [/DEF:fetchTasks]
// [/DEF:stores_module]

46
frontend/src/lib/toasts.js Normal file → Executable file
View File

@@ -1,13 +1,33 @@
import { writable } from 'svelte/store';
export const toasts = writable([]);
export function addToast(message, type = 'info', duration = 3000) {
const id = Math.random().toString(36).substr(2, 9);
toasts.update(all => [...all, { id, message, type }]);
setTimeout(() => removeToast(id), duration);
}
function removeToast(id) {
toasts.update(all => all.filter(t => t.id !== id));
}
// [DEF:toasts_module:Module]
// @SEMANTICS: notification, toast, feedback, state
// @PURPOSE: Manages toast notifications using a Svelte writable store.
// @LAYER: UI-State
import { writable } from 'svelte/store';
// [DEF:toasts:Data]
// @PURPOSE: Writable store containing the list of active toasts.
export const toasts = writable([]);
// [DEF:addToast:Function]
// @PURPOSE: Adds a new toast message.
// @PARAM: message (string) - The message text.
// @PARAM: type (string) - The type of toast (info, success, error).
// @PARAM: duration (number) - Duration in ms before the toast is removed.
export function addToast(message, type = 'info', duration = 3000) {
const id = Math.random().toString(36).substr(2, 9);
console.log(`[toasts.addToast][Action] Adding toast context={{'id': '${id}', 'type': '${type}', 'message': '${message}'}}`);
toasts.update(all => [...all, { id, message, type }]);
setTimeout(() => removeToast(id), duration);
}
// [/DEF:addToast]
// [DEF:removeToast:Function]
// @PURPOSE: Removes a toast message by ID.
// @PARAM: id (string) - The ID of the toast to remove.
function removeToast(id) {
console.log(`[toasts.removeToast][Action] Removing toast context={{'id': '${id}'}}`);
toasts.update(all => all.filter(t => t.id !== id));
}
// [/DEF:removeToast]
// [/DEF:toasts_module]

8
frontend/src/main.js Normal file → Executable file
View File

@@ -1,9 +1,17 @@
// [DEF:main:Module]
// @SEMANTICS: entrypoint, svelte, init
// @PURPOSE: Entry point for the Svelte application.
// @LAYER: UI-Entry
import './app.css'
import App from './App.svelte'
// [DEF:app_instance:Data]
// @PURPOSE: Initialized Svelte app instance.
const app = new App({
target: document.getElementById('app'),
props: {}
})
export default app
// [/DEF:main]

76
frontend/src/pages/Dashboard.svelte Normal file → Executable file
View File

@@ -1,28 +1,48 @@
<script>
import { onMount } from 'svelte';
import { plugins, fetchPlugins, selectedPlugin } from '../lib/stores.js';
onMount(async () => {
await fetchPlugins();
});
function selectPlugin(plugin) {
selectedPlugin.set(plugin);
}
</script>
<div class="container mx-auto p-4">
<h1 class="text-2xl font-bold mb-4">Available Tools</h1>
<div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{#each $plugins as plugin}
<div
class="border rounded-lg p-4 cursor-pointer hover:bg-gray-100"
on:click={() => selectPlugin(plugin)}
>
<h2 class="text-xl font-semibold">{plugin.name}</h2>
<p class="text-gray-600">{plugin.description}</p>
<span class="text-sm text-gray-400">v{plugin.version}</span>
</div>
{/each}
</div>
</div>
<!--
[DEF:Dashboard:Component]
@SEMANTICS: dashboard, plugins, tools, list
@PURPOSE: Displays the list of available plugins and allows selecting one.
@LAYER: UI
@RELATION: DEPENDS_ON -> frontend/src/lib/stores.js
@PROPS: None
@EVENTS: None
-->
<script>
import { onMount } from 'svelte';
import { plugins, fetchPlugins, selectedPlugin } from '../lib/stores.js';
// [DEF:onMount:Function]
// @PURPOSE: Fetch plugins when the component mounts.
onMount(async () => {
console.log("[Dashboard][Entry] Component mounted, fetching plugins.");
await fetchPlugins();
});
// [/DEF:onMount]
// [DEF:selectPlugin:Function]
// @PURPOSE: Selects a plugin to display its form.
// @PARAM: plugin (Object) - The plugin object to select.
function selectPlugin(plugin) {
console.log(`[Dashboard][Action] Selecting plugin: ${plugin.id}`);
selectedPlugin.set(plugin);
}
// [/DEF:selectPlugin]
</script>
<div class="container mx-auto p-4">
<h1 class="text-2xl font-bold mb-4">Available Tools</h1>
<div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{#each $plugins as plugin}
<div
class="border rounded-lg p-4 cursor-pointer hover:bg-gray-100"
on:click={() => selectPlugin(plugin)}
>
<h2 class="text-xl font-semibold">{plugin.name}</h2>
<p class="text-gray-600">{plugin.description}</p>
<span class="text-sm text-gray-400">v{plugin.version}</span>
</div>
{/each}
</div>
</div>
<!-- [/DEF:Dashboard] -->

207
frontend/src/pages/Settings.svelte Normal file
View File

@@ -0,0 +1,207 @@
<!--
[DEF:Settings:Component]
@SEMANTICS: settings, ui, configuration
@PURPOSE: The main settings page for the application, allowing management of environments and global settings.
@LAYER: UI
@RELATION: CALLS -> api.js
@RELATION: USES -> stores.js
@PROPS:
None
@EVENTS:
None
@INVARIANT: Settings changes must be saved to the backend.
-->
<script>
import { onMount } from 'svelte';
import { getSettings, updateGlobalSettings, getEnvironments, addEnvironment, updateEnvironment, deleteEnvironment, testEnvironmentConnection } from '../lib/api';
import { addToast } from '../lib/toasts';
let settings = {
environments: [],
settings: {
backup_path: '',
default_environment_id: null
}
};
let newEnv = {
id: '',
name: '',
url: '',
username: '',
password: '',
is_default: false
};
let editingEnvId = null;
async function loadSettings() {
try {
const data = await getSettings();
settings = data;
} catch (error) {
addToast('Failed to load settings', 'error');
}
}
async function handleSaveGlobal() {
try {
await updateGlobalSettings(settings.settings);
addToast('Global settings saved', 'success');
} catch (error) {
addToast('Failed to save global settings', 'error');
}
}
async function handleAddOrUpdateEnv() {
try {
if (editingEnvId) {
await updateEnvironment(editingEnvId, newEnv);
addToast('Environment updated', 'success');
} else {
await addEnvironment(newEnv);
addToast('Environment added', 'success');
}
resetEnvForm();
await loadSettings();
} catch (error) {
addToast('Failed to save environment', 'error');
}
}
async function handleDeleteEnv(id) {
if (confirm('Are you sure you want to delete this environment?')) {
try {
await deleteEnvironment(id);
addToast('Environment deleted', 'success');
await loadSettings();
} catch (error) {
addToast('Failed to delete environment', 'error');
}
}
}
async function handleTestEnv(id) {
try {
const result = await testEnvironmentConnection(id);
if (result.status === 'success') {
addToast('Connection successful', 'success');
} else {
addToast(`Connection failed: ${result.message}`, 'error');
}
} catch (error) {
addToast('Failed to test connection', 'error');
}
}
function editEnv(env) {
newEnv = { ...env };
editingEnvId = env.id;
}
function resetEnvForm() {
newEnv = {
id: '',
name: '',
url: '',
username: '',
password: '',
is_default: false
};
editingEnvId = null;
}
onMount(loadSettings);
</script>
<div class="container mx-auto p-4">
<h1 class="text-2xl font-bold mb-6">Settings</h1>
<section class="mb-8 bg-white p-6 rounded shadow">
<h2 class="text-xl font-semibold mb-4">Global Settings</h2>
<div class="grid grid-cols-1 gap-4">
<div>
<label for="backup_path" class="block text-sm font-medium text-gray-700">Backup Storage Path</label>
<input type="text" id="backup_path" bind:value={settings.settings.backup_path} class="mt-1 block w-full border border-gray-300 rounded-md shadow-sm p-2" />
</div>
<button on:click={handleSaveGlobal} class="bg-blue-500 text-white px-4 py-2 rounded hover:bg-blue-600 w-max">
Save Global Settings
</button>
</div>
</section>
<section class="mb-8 bg-white p-6 rounded shadow">
<h2 class="text-xl font-semibold mb-4">Superset Environments</h2>
<div class="mb-6 overflow-x-auto">
<table class="min-w-full divide-y divide-gray-200">
<thead class="bg-gray-50">
<tr>
<th class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Name</th>
<th class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">URL</th>
<th class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Username</th>
<th class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Default</th>
<th class="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">Actions</th>
</tr>
</thead>
<tbody class="bg-white divide-y divide-gray-200">
{#each settings.environments as env}
<tr>
<td class="px-6 py-4 whitespace-nowrap">{env.name}</td>
<td class="px-6 py-4 whitespace-nowrap">{env.url}</td>
<td class="px-6 py-4 whitespace-nowrap">{env.username}</td>
<td class="px-6 py-4 whitespace-nowrap">{env.is_default ? 'Yes' : 'No'}</td>
<td class="px-6 py-4 whitespace-nowrap">
<button on:click={() => handleTestEnv(env.id)} class="text-green-600 hover:text-green-900 mr-4">Test</button>
<button on:click={() => editEnv(env)} class="text-indigo-600 hover:text-indigo-900 mr-4">Edit</button>
<button on:click={() => handleDeleteEnv(env.id)} class="text-red-600 hover:text-red-900">Delete</button>
</td>
</tr>
{/each}
</tbody>
</table>
</div>
<div class="bg-gray-50 p-4 rounded">
<h3 class="text-lg font-medium mb-4">{editingEnvId ? 'Edit' : 'Add'} Environment</h3>
<div class="grid grid-cols-1 md:grid-cols-2 gap-4">
<div>
<label for="env_id" class="block text-sm font-medium text-gray-700">ID</label>
<input type="text" id="env_id" bind:value={newEnv.id} disabled={!!editingEnvId} class="mt-1 block w-full border border-gray-300 rounded-md shadow-sm p-2" />
</div>
<div>
<label for="env_name" class="block text-sm font-medium text-gray-700">Name</label>
<input type="text" id="env_name" bind:value={newEnv.name} class="mt-1 block w-full border border-gray-300 rounded-md shadow-sm p-2" />
</div>
<div>
<label for="env_url" class="block text-sm font-medium text-gray-700">URL</label>
<input type="text" id="env_url" bind:value={newEnv.url} class="mt-1 block w-full border border-gray-300 rounded-md shadow-sm p-2" />
</div>
<div>
<label for="env_user" class="block text-sm font-medium text-gray-700">Username</label>
<input type="text" id="env_user" bind:value={newEnv.username} class="mt-1 block w-full border border-gray-300 rounded-md shadow-sm p-2" />
</div>
<div>
<label for="env_pass" class="block text-sm font-medium text-gray-700">Password</label>
<input type="password" id="env_pass" bind:value={newEnv.password} class="mt-1 block w-full border border-gray-300 rounded-md shadow-sm p-2" />
</div>
<div class="flex items-center">
<input type="checkbox" id="env_default" bind:checked={newEnv.is_default} class="h-4 w-4 text-blue-600 border-gray-300 rounded" />
<label for="env_default" class="ml-2 block text-sm text-gray-900">Default Environment</label>
</div>
</div>
<div class="mt-4 flex gap-2">
<button on:click={handleAddOrUpdateEnv} class="bg-green-500 text-white px-4 py-2 rounded hover:bg-green-600">
{editingEnvId ? 'Update' : 'Add'} Environment
</button>
{#if editingEnvId}
<button on:click={resetEnvForm} class="bg-gray-500 text-white px-4 py-2 rounded hover:bg-gray-600">
Cancel
</button>
{/if}
</div>
</div>
</section>
</div>
<!-- [/DEF:Settings] -->

0
frontend/svelte.config.js Normal file → Executable file
View File

20
frontend/tailwind.config.js Normal file → Executable file
View File

@@ -1,11 +1,11 @@
/** @type {import('tailwindcss').Config} */
export default {
content: [
"./index.html",
"./src/**/*.{svelte,js,ts,jsx,tsx}",
],
theme: {
extend: {},
},
plugins: [],
/** @type {import('tailwindcss').Config} */
export default {
content: [
"./index.html",
"./src/**/*.{svelte,js,ts,jsx,tsx}",
],
theme: {
extend: {},
},
plugins: [],
}

0
frontend/vite.config.js Normal file → Executable file
View File

128
get_dataset_structure.py Normal file → Executable file
View File

@@ -1,64 +1,64 @@
# [DEF:get_dataset_structure:Module]
#
# @SEMANTICS: superset, dataset, structure, debug, json
# @PURPOSE: Этот модуль предназначен для получения и сохранения структуры данных датасета из Superset. Он используется для отладки и анализа данных, возвращаемых API.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils.init_clients
# @RELATION: DEPENDS_ON -> superset_tool.utils.logger
# @PUBLIC_API: get_and_save_dataset
# [SECTION: IMPORTS]
import argparse
import json
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:get_and_save_dataset:Function]
# @PURPOSE: Получает структуру датасета из Superset и сохраняет ее в JSON-файл.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> superset_client.get_dataset
# @PARAM: env (str) - Среда (dev, prod, и т.д.) для подключения.
# @PARAM: dataset_id (int) - ID датасета для получения.
# @PARAM: output_path (str) - Путь для сохранения JSON-файла.
def get_and_save_dataset(env: str, dataset_id: int, output_path: str):
"""
Получает структуру датасета и сохраняет в файл.
"""
logger = SupersetLogger(name="DatasetStructureRetriever")
logger.info("[get_and_save_dataset][Enter] Starting to fetch dataset structure for ID %d from env '%s'.", dataset_id, env)
try:
clients = setup_clients(logger=logger)
superset_client = clients.get(env)
if not superset_client:
logger.error("[get_and_save_dataset][Failure] Environment '%s' not found.", env)
return
dataset_response = superset_client.get_dataset(dataset_id)
dataset_data = dataset_response.get('result')
if not dataset_data:
logger.error("[get_and_save_dataset][Failure] No result in dataset response.")
return
with open(output_path, 'w', encoding='utf-8') as f:
json.dump(dataset_data, f, ensure_ascii=False, indent=4)
logger.info("[get_and_save_dataset][Success] Dataset structure saved to %s.", output_path)
except Exception as e:
logger.error("[get_and_save_dataset][Failure] An error occurred: %s", e, exc_info=True)
# [/DEF:get_and_save_dataset]
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Получение структуры датасета из Superset.")
parser.add_argument("--dataset-id", required=True, type=int, help="ID датасета.")
parser.add_argument("--env", required=True, help="Среда для подключения (например, dev).")
parser.add_argument("--output-path", default="dataset_structure.json", help="Путь для сохранения JSON-файла.")
args = parser.parse_args()
get_and_save_dataset(args.env, args.dataset_id, args.output_path)
# [/DEF:get_dataset_structure]
# [DEF:get_dataset_structure:Module]
#
# @SEMANTICS: superset, dataset, structure, debug, json
# @PURPOSE: Этот модуль предназначен для получения и сохранения структуры данных датасета из Superset. Он используется для отладки и анализа данных, возвращаемых API.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils.init_clients
# @RELATION: DEPENDS_ON -> superset_tool.utils.logger
# @PUBLIC_API: get_and_save_dataset
# [SECTION: IMPORTS]
import argparse
import json
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:get_and_save_dataset:Function]
# @PURPOSE: Получает структуру датасета из Superset и сохраняет ее в JSON-файл.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> superset_client.get_dataset
# @PARAM: env (str) - Среда (dev, prod, и т.д.) для подключения.
# @PARAM: dataset_id (int) - ID датасета для получения.
# @PARAM: output_path (str) - Путь для сохранения JSON-файла.
def get_and_save_dataset(env: str, dataset_id: int, output_path: str):
"""
Получает структуру датасета и сохраняет в файл.
"""
logger = SupersetLogger(name="DatasetStructureRetriever")
logger.info("[get_and_save_dataset][Enter] Starting to fetch dataset structure for ID %d from env '%s'.", dataset_id, env)
try:
clients = setup_clients(logger=logger)
superset_client = clients.get(env)
if not superset_client:
logger.error("[get_and_save_dataset][Failure] Environment '%s' not found.", env)
return
dataset_response = superset_client.get_dataset(dataset_id)
dataset_data = dataset_response.get('result')
if not dataset_data:
logger.error("[get_and_save_dataset][Failure] No result in dataset response.")
return
with open(output_path, 'w', encoding='utf-8') as f:
json.dump(dataset_data, f, ensure_ascii=False, indent=4)
logger.info("[get_and_save_dataset][Success] Dataset structure saved to %s.", output_path)
except Exception as e:
logger.error("[get_and_save_dataset][Failure] An error occurred: %s", e, exc_info=True)
# [/DEF:get_and_save_dataset]
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Получение структуры датасета из Superset.")
parser.add_argument("--dataset-id", required=True, type=int, help="ID датасета.")
parser.add_argument("--env", required=True, help="Среда для подключения (например, dev).")
parser.add_argument("--output-path", default="dataset_structure.json", help="Путь для сохранения JSON-файла.")
args = parser.parse_args()
get_and_save_dataset(args.env, args.dataset_id, args.output_path)
# [/DEF:get_dataset_structure]

802
migration_script.py Normal file → Executable file
View File

@@ -1,401 +1,401 @@
# [DEF:migration_script:Module]
#
# @SEMANTICS: migration, cli, superset, ui, logging, error-recovery, batch-delete
# @PURPOSE: Предоставляет интерактивный CLI для миграции дашбордов Superset между окружениями с возможностью восстановления после ошибок.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: Migration
# [SECTION: IMPORTS]
import json
import logging
import sys
import zipfile
import re
from pathlib import Path
from typing import List, Optional, Tuple, Dict
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.fileio import create_temp_file, update_yamls, create_dashboard_export
from superset_tool.utils.whiptail_fallback import menu, checklist, yesno, msgbox, inputbox, gauge
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:Migration:Class]
# @PURPOSE: Инкапсулирует логику интерактивной миграции дашбордов с возможностью «удалить‑и‑перезаписать» при ошибке импорта.
# @RELATION: CREATES_INSTANCE_OF -> SupersetLogger
# @RELATION: USES -> SupersetClient
class Migration:
"""
Интерактивный процесс миграции дашбордов.
"""
# [DEF:Migration.__init__:Function]
# @PURPOSE: Инициализирует сервис миграции, настраивает логгер и начальные состояния.
# @POST: `self.logger` готов к использованию; `enable_delete_on_failure` = `False`.
def __init__(self) -> None:
default_log_dir = Path.cwd() / "logs"
self.logger = SupersetLogger(
name="migration_script",
log_dir=default_log_dir,
level=logging.INFO,
console=True,
)
self.enable_delete_on_failure = False
self.from_c: Optional[SupersetClient] = None
self.to_c: Optional[SupersetClient] = None
self.dashboards_to_migrate: List[dict] = []
self.db_config_replacement: Optional[dict] = None
self._failed_imports: List[dict] = []
# [/DEF:Migration.__init__]
# [DEF:Migration.run:Function]
# @PURPOSE: Точка входа: последовательный запуск всех шагов миграции.
# @PRE: Логгер готов.
# @POST: Скрипт завершён, пользователю выведено сообщение.
# @RELATION: CALLS -> self.ask_delete_on_failure
# @RELATION: CALLS -> self.select_environments
# @RELATION: CALLS -> self.select_dashboards
# @RELATION: CALLS -> self.confirm_db_config_replacement
# @RELATION: CALLS -> self.execute_migration
def run(self) -> None:
self.logger.info("[run][Entry] Запуск скрипта миграции.")
self.ask_delete_on_failure()
self.select_environments()
self.select_dashboards()
self.confirm_db_config_replacement()
self.execute_migration()
self.logger.info("[run][Exit] Скрипт миграции завершён.")
# [/DEF:Migration.run]
# [DEF:Migration.ask_delete_on_failure:Function]
# @PURPOSE: Запрашивает у пользователя, следует ли удалять дашборд при ошибке импорта.
# @POST: `self.enable_delete_on_failure` установлен.
# @RELATION: CALLS -> yesno
def ask_delete_on_failure(self) -> None:
self.enable_delete_on_failure = yesno(
"Поведение при ошибке импорта",
"Если импорт завершится ошибкой, удалить существующий дашборд и попытаться импортировать заново?",
)
self.logger.info(
"[ask_delete_on_failure][State] Delete-on-failure = %s",
self.enable_delete_on_failure,
)
# [/DEF:Migration.ask_delete_on_failure]
# [DEF:Migration.select_environments:Function]
# @PURPOSE: Позволяет пользователю выбрать исходное и целевое окружения Superset.
# @PRE: `setup_clients` успешно инициализирует все клиенты.
# @POST: `self.from_c` и `self.to_c` установлены.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> menu
def select_environments(self) -> None:
self.logger.info("[select_environments][Entry] Шаг 1/5: Выбор окружений.")
try:
all_clients = setup_clients(self.logger)
available_envs = list(all_clients.keys())
except Exception as e:
self.logger.error("[select_environments][Failure] %s", e, exc_info=True)
msgbox("Ошибка", "Не удалось инициализировать клиенты.")
return
rc, from_env_name = menu(
title="Выбор окружения",
prompt="Исходное окружение:",
choices=available_envs,
)
if rc != 0 or from_env_name is None:
self.logger.info("[select_environments][State] Source environment selection cancelled.")
return
self.from_c = all_clients[from_env_name]
self.logger.info("[select_environments][State] from = %s", from_env_name)
available_envs.remove(from_env_name)
rc, to_env_name = menu(
title="Выбор окружения",
prompt="Целевое окружение:",
choices=available_envs,
)
if rc != 0 or to_env_name is None:
self.logger.info("[select_environments][State] Target environment selection cancelled.")
return
self.to_c = all_clients[to_env_name]
self.logger.info("[select_environments][State] to = %s", to_env_name)
self.logger.info("[select_environments][Exit] Шаг 1 завершён.")
# [/DEF:Migration.select_environments]
# [DEF:Migration.select_dashboards:Function]
# @PURPOSE: Позволяет пользователю выбрать набор дашбордов для миграции.
# @PRE: `self.from_c` инициализирован.
# @POST: `self.dashboards_to_migrate` заполнен.
# @RELATION: CALLS -> self.from_c.get_dashboards
# @RELATION: CALLS -> checklist
def select_dashboards(self) -> None:
self.logger.info("[select_dashboards][Entry] Шаг 2/5: Выбор дашбордов.")
if self.from_c is None:
self.logger.error("[select_dashboards][Failure] Source client not initialized.")
msgbox("Ошибка", "Исходное окружение не выбрано.")
return
try:
_, all_dashboards = self.from_c.get_dashboards()
if not all_dashboards:
self.logger.warning("[select_dashboards][State] No dashboards.")
msgbox("Информация", "В исходном окружении нет дашбордов.")
return
rc, regex = inputbox("Поиск", "Введите регулярное выражение для поиска дашбордов:")
if rc != 0:
return
# Ensure regex is a string and perform a case-insensitive search
regex_str = str(regex)
filtered_dashboards = [
d for d in all_dashboards if re.search(regex_str, d["dashboard_title"], re.IGNORECASE)
]
options = [("ALL", "Все дашборды")] + [
(str(d["id"]), d["dashboard_title"]) for d in filtered_dashboards
]
rc, selected = checklist(
title="Выбор дашбордов",
prompt="Отметьте нужные дашборды (введите номера):",
options=options,
)
if rc != 0:
return
if "ALL" in selected:
self.dashboards_to_migrate = filtered_dashboards
else:
self.dashboards_to_migrate = [
d for d in filtered_dashboards if str(d["id"]) in selected
]
self.logger.info(
"[select_dashboards][State] Выбрано %d дашбордов.",
len(self.dashboards_to_migrate),
)
except Exception as e:
self.logger.error("[select_dashboards][Failure] %s", e, exc_info=True)
msgbox("Ошибка", "Не удалось получить список дашбордов.")
self.logger.info("[select_dashboards][Exit] Шаг 2 завершён.")
# [/DEF:Migration.select_dashboards]
# [DEF:Migration.confirm_db_config_replacement:Function]
# @PURPOSE: Запрашивает у пользователя, требуется ли заменить имена БД в YAML-файлах.
# @POST: `self.db_config_replacement` либо `None`, либо заполнен.
# @RELATION: CALLS -> yesno
# @RELATION: CALLS -> self._select_databases
def confirm_db_config_replacement(self) -> None:
if yesno("Замена БД", "Заменить конфигурацию БД в YAML-файлах?"):
old_db, new_db = self._select_databases()
if not old_db or not new_db:
self.logger.info("[confirm_db_config_replacement][State] Selection cancelled.")
return
print(f"old_db: {old_db}")
old_result = old_db.get("result", {})
new_result = new_db.get("result", {})
self.db_config_replacement = {
"old": {
"database_name": old_result.get("database_name"),
"uuid": old_result.get("uuid"),
"database_uuid": old_result.get("uuid"),
"id": str(old_db.get("id"))
},
"new": {
"database_name": new_result.get("database_name"),
"uuid": new_result.get("uuid"),
"database_uuid": new_result.get("uuid"),
"id": str(new_db.get("id"))
}
}
self.logger.info("[confirm_db_config_replacement][State] Replacement set: %s", self.db_config_replacement)
else:
self.logger.info("[confirm_db_config_replacement][State] Skipped.")
# [/DEF:Migration.confirm_db_config_replacement]
# [DEF:Migration._select_databases:Function]
# @PURPOSE: Позволяет пользователю выбрать исходную и целевую БД через API.
# @POST: Возвращает кортеж (старая БД, новая БД) или (None, None) при отмене.
# @RELATION: CALLS -> self.from_c.get_databases
# @RELATION: CALLS -> self.to_c.get_databases
# @RELATION: CALLS -> self.from_c.get_database
# @RELATION: CALLS -> self.to_c.get_database
# @RELATION: CALLS -> menu
def _select_databases(self) -> Tuple[Optional[Dict], Optional[Dict]]:
self.logger.info("[_select_databases][Entry] Selecting databases from both environments.")
if self.from_c is None or self.to_c is None:
self.logger.error("[_select_databases][Failure] Source or target client not initialized.")
msgbox("Ошибка", "Исходное или целевое окружение не выбрано.")
return None, None
# Получаем список БД из обоих окружений
try:
_, from_dbs = self.from_c.get_databases()
_, to_dbs = self.to_c.get_databases()
except Exception as e:
self.logger.error("[_select_databases][Failure] Failed to fetch databases: %s", e)
msgbox("Ошибка", "Не удалось получить список баз данных.")
return None, None
# Формируем список для выбора
# По Swagger документации, в ответе API поле называется "database_name"
from_choices = []
for db in from_dbs:
db_name = db.get("database_name", "Без имени")
from_choices.append((str(db["id"]), f"{db_name} (ID: {db['id']})"))
to_choices = []
for db in to_dbs:
db_name = db.get("database_name", "Без имени")
to_choices.append((str(db["id"]), f"{db_name} (ID: {db['id']})"))
# Показываем список БД для исходного окружения
rc, from_sel = menu(
title="Выбор исходной БД",
prompt="Выберите исходную БД:",
choices=[f"{name}" for id, name in from_choices]
)
if rc != 0:
return None, None
# Определяем выбранную БД
from_db_id = from_choices[[choice[1] for choice in from_choices].index(from_sel)][0]
# Получаем полную информацию о выбранной БД из исходного окружения
try:
from_db = self.from_c.get_database(int(from_db_id))
except Exception as e:
self.logger.error("[_select_databases][Failure] Failed to fetch database details: %s", e)
msgbox("Ошибка", "Не удалось получить информацию о выбранной базе данных.")
return None, None
# Показываем список БД для целевого окружения
rc, to_sel = menu(
title="Выбор целевой БД",
prompt="Выберите целевую БД:",
choices=[f"{name}" for id, name in to_choices]
)
if rc != 0:
return None, None
# Определяем выбранную БД
to_db_id = to_choices[[choice[1] for choice in to_choices].index(to_sel)][0]
# Получаем полную информацию о выбранной БД из целевого окружения
try:
to_db = self.to_c.get_database(int(to_db_id))
except Exception as e:
self.logger.error("[_select_databases][Failure] Failed to fetch database details: %s", e)
msgbox("Ошибка", "Не удалось получить информацию о выбранной базе данных.")
return None, None
self.logger.info("[_select_databases][Exit] Selected databases: %s -> %s", from_db.get("database_name", "Без имени"), to_db.get("database_name", "Без имени"))
return from_db, to_db
# [/DEF:Migration._select_databases]
# [DEF:Migration._batch_delete_by_ids:Function]
# @PURPOSE: Удаляет набор дашбордов по их ID единым запросом.
# @PRE: `ids` непустой список целых чисел.
# @POST: Все указанные дашборды удалены (если они существовали).
# @RELATION: CALLS -> self.to_c.network.request
# @PARAM: ids (List[int]) - Список ID дашбордов для удаления.
def _batch_delete_by_ids(self, ids: List[int]) -> None:
if not ids:
self.logger.debug("[_batch_delete_by_ids][Skip] Empty ID list nothing to delete.")
return
if self.to_c is None:
self.logger.error("[_batch_delete_by_ids][Failure] Target client not initialized.")
msgbox("Ошибка", "Целевое окружение не выбрано.")
return
self.logger.info("[_batch_delete_by_ids][Entry] Deleting dashboards IDs: %s", ids)
q_param = json.dumps(ids)
response = self.to_c.network.request(method="DELETE", endpoint="/dashboard/", params={"q": q_param})
if isinstance(response, dict) and response.get("result", True) is False:
self.logger.warning("[_batch_delete_by_ids][Warning] Unexpected delete response: %s", response)
else:
self.logger.info("[_batch_delete_by_ids][Success] Delete request completed.")
# [/DEF:Migration._batch_delete_by_ids]
# [DEF:Migration.execute_migration:Function]
# @PURPOSE: Выполняет экспорт-импорт дашбордов, обрабатывает ошибки и, при необходимости, выполняет процедуру восстановления.
# @PRE: `self.dashboards_to_migrate` не пуст; `self.from_c` и `self.to_c` инициализированы.
# @POST: Успешные дашборды импортированы; неудачные - восстановлены или залогированы.
# @RELATION: CALLS -> self.from_c.export_dashboard
# @RELATION: CALLS -> create_temp_file
# @RELATION: CALLS -> update_yamls
# @RELATION: CALLS -> create_dashboard_export
# @RELATION: CALLS -> self.to_c.import_dashboard
# @RELATION: CALLS -> self._batch_delete_by_ids
def execute_migration(self) -> None:
if not self.dashboards_to_migrate:
self.logger.warning("[execute_migration][Skip] No dashboards to migrate.")
msgbox("Информация", "Нет дашбордов для миграции.")
return
if self.from_c is None or self.to_c is None:
self.logger.error("[execute_migration][Failure] Source or target client not initialized.")
msgbox("Ошибка", "Исходное или целевое окружение не выбрано.")
return
total = len(self.dashboards_to_migrate)
self.logger.info("[execute_migration][Entry] Starting migration of %d dashboards.", total)
self.to_c.delete_before_reimport = self.enable_delete_on_failure
with gauge("Миграция...", width=60, height=10) as g:
for i, dash in enumerate(self.dashboards_to_migrate):
dash_id, dash_slug, title = dash["id"], dash.get("slug"), dash["dashboard_title"]
g.set_text(f"Миграция: {title} ({i + 1}/{total})")
g.set_percent(int((i / total) * 100))
exported_content = None # Initialize exported_content
try:
exported_content, _ = self.from_c.export_dashboard(dash_id)
with create_temp_file(content=exported_content, dry_run=True, suffix=".zip", logger=self.logger) as tmp_zip_path, \
create_temp_file(suffix=".dir", logger=self.logger) as tmp_unpack_dir:
if not self.db_config_replacement:
self.to_c.import_dashboard(file_name=tmp_zip_path, dash_id=dash_id, dash_slug=dash_slug)
else:
with zipfile.ZipFile(tmp_zip_path, "r") as zip_ref:
zip_ref.extractall(tmp_unpack_dir)
if self.db_config_replacement:
update_yamls(db_configs=[self.db_config_replacement], path=str(tmp_unpack_dir))
with create_temp_file(suffix=".zip", dry_run=True, logger=self.logger) as tmp_new_zip:
create_dashboard_export(zip_path=tmp_new_zip, source_paths=[str(p) for p in Path(tmp_unpack_dir).glob("**/*")])
self.to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug)
self.logger.info("[execute_migration][Success] Dashboard %s imported.", title)
except Exception as exc:
self.logger.error("[execute_migration][Failure] %s", exc, exc_info=True)
self._failed_imports.append({"slug": dash_slug, "dash_id": dash_id, "zip_content": exported_content})
msgbox("Ошибка", f"Не удалось мигрировать дашборд {title}.\n\n{exc}")
g.set_percent(100)
if self.enable_delete_on_failure and self._failed_imports:
self.logger.info("[execute_migration][Recovery] %d dashboards failed. Starting recovery.", len(self._failed_imports))
_, target_dashboards = self.to_c.get_dashboards()
slug_to_id = {d["slug"]: d["id"] for d in target_dashboards if "slug" in d and "id" in d}
ids_to_delete = [slug_to_id[f["slug"]] for f in self._failed_imports if f["slug"] in slug_to_id]
self._batch_delete_by_ids(ids_to_delete)
for fail in self._failed_imports:
with create_temp_file(content=fail["zip_content"], suffix=".zip", logger=self.logger) as retry_zip:
self.to_c.import_dashboard(file_name=retry_zip, dash_id=fail["dash_id"], dash_slug=fail["slug"])
self.logger.info("[execute_migration][Recovered] Dashboard slug '%s' re-imported.", fail["slug"])
self.logger.info("[execute_migration][Exit] Migration finished.")
msgbox("Информация", "Миграция завершена!")
# [/DEF:Migration.execute_migration]
# [/DEF:Migration]
if __name__ == "__main__":
Migration().run()
# [/DEF:migration_script]
# [DEF:migration_script:Module]
#
# @SEMANTICS: migration, cli, superset, ui, logging, error-recovery, batch-delete
# @PURPOSE: Предоставляет интерактивный CLI для миграции дашбордов Superset между окружениями с возможностью восстановления после ошибок.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: Migration
# [SECTION: IMPORTS]
import json
import logging
import sys
import zipfile
import re
from pathlib import Path
from typing import List, Optional, Tuple, Dict
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.fileio import create_temp_file, update_yamls, create_dashboard_export
from superset_tool.utils.whiptail_fallback import menu, checklist, yesno, msgbox, inputbox, gauge
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:Migration:Class]
# @PURPOSE: Инкапсулирует логику интерактивной миграции дашбордов с возможностью «удалить‑и‑перезаписать» при ошибке импорта.
# @RELATION: CREATES_INSTANCE_OF -> SupersetLogger
# @RELATION: USES -> SupersetClient
class Migration:
"""
Интерактивный процесс миграции дашбордов.
"""
# [DEF:Migration.__init__:Function]
# @PURPOSE: Инициализирует сервис миграции, настраивает логгер и начальные состояния.
# @POST: `self.logger` готов к использованию; `enable_delete_on_failure` = `False`.
def __init__(self) -> None:
default_log_dir = Path.cwd() / "logs"
self.logger = SupersetLogger(
name="migration_script",
log_dir=default_log_dir,
level=logging.INFO,
console=True,
)
self.enable_delete_on_failure = False
self.from_c: Optional[SupersetClient] = None
self.to_c: Optional[SupersetClient] = None
self.dashboards_to_migrate: List[dict] = []
self.db_config_replacement: Optional[dict] = None
self._failed_imports: List[dict] = []
# [/DEF:Migration.__init__]
# [DEF:Migration.run:Function]
# @PURPOSE: Точка входа: последовательный запуск всех шагов миграции.
# @PRE: Логгер готов.
# @POST: Скрипт завершён, пользователю выведено сообщение.
# @RELATION: CALLS -> self.ask_delete_on_failure
# @RELATION: CALLS -> self.select_environments
# @RELATION: CALLS -> self.select_dashboards
# @RELATION: CALLS -> self.confirm_db_config_replacement
# @RELATION: CALLS -> self.execute_migration
def run(self) -> None:
self.logger.info("[run][Entry] Запуск скрипта миграции.")
self.ask_delete_on_failure()
self.select_environments()
self.select_dashboards()
self.confirm_db_config_replacement()
self.execute_migration()
self.logger.info("[run][Exit] Скрипт миграции завершён.")
# [/DEF:Migration.run]
# [DEF:Migration.ask_delete_on_failure:Function]
# @PURPOSE: Запрашивает у пользователя, следует ли удалять дашборд при ошибке импорта.
# @POST: `self.enable_delete_on_failure` установлен.
# @RELATION: CALLS -> yesno
def ask_delete_on_failure(self) -> None:
self.enable_delete_on_failure = yesno(
"Поведение при ошибке импорта",
"Если импорт завершится ошибкой, удалить существующий дашборд и попытаться импортировать заново?",
)
self.logger.info(
"[ask_delete_on_failure][State] Delete-on-failure = %s",
self.enable_delete_on_failure,
)
# [/DEF:Migration.ask_delete_on_failure]
# [DEF:Migration.select_environments:Function]
# @PURPOSE: Позволяет пользователю выбрать исходное и целевое окружения Superset.
# @PRE: `setup_clients` успешно инициализирует все клиенты.
# @POST: `self.from_c` и `self.to_c` установлены.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> menu
def select_environments(self) -> None:
self.logger.info("[select_environments][Entry] Шаг 1/5: Выбор окружений.")
try:
all_clients = setup_clients(self.logger)
available_envs = list(all_clients.keys())
except Exception as e:
self.logger.error("[select_environments][Failure] %s", e, exc_info=True)
msgbox("Ошибка", "Не удалось инициализировать клиенты.")
return
rc, from_env_name = menu(
title="Выбор окружения",
prompt="Исходное окружение:",
choices=available_envs,
)
if rc != 0 or from_env_name is None:
self.logger.info("[select_environments][State] Source environment selection cancelled.")
return
self.from_c = all_clients[from_env_name]
self.logger.info("[select_environments][State] from = %s", from_env_name)
available_envs.remove(from_env_name)
rc, to_env_name = menu(
title="Выбор окружения",
prompt="Целевое окружение:",
choices=available_envs,
)
if rc != 0 or to_env_name is None:
self.logger.info("[select_environments][State] Target environment selection cancelled.")
return
self.to_c = all_clients[to_env_name]
self.logger.info("[select_environments][State] to = %s", to_env_name)
self.logger.info("[select_environments][Exit] Шаг 1 завершён.")
# [/DEF:Migration.select_environments]
# [DEF:Migration.select_dashboards:Function]
# @PURPOSE: Позволяет пользователю выбрать набор дашбордов для миграции.
# @PRE: `self.from_c` инициализирован.
# @POST: `self.dashboards_to_migrate` заполнен.
# @RELATION: CALLS -> self.from_c.get_dashboards
# @RELATION: CALLS -> checklist
def select_dashboards(self) -> None:
self.logger.info("[select_dashboards][Entry] Шаг 2/5: Выбор дашбордов.")
if self.from_c is None:
self.logger.error("[select_dashboards][Failure] Source client not initialized.")
msgbox("Ошибка", "Исходное окружение не выбрано.")
return
try:
_, all_dashboards = self.from_c.get_dashboards()
if not all_dashboards:
self.logger.warning("[select_dashboards][State] No dashboards.")
msgbox("Информация", "В исходном окружении нет дашбордов.")
return
rc, regex = inputbox("Поиск", "Введите регулярное выражение для поиска дашбордов:")
if rc != 0:
return
# Ensure regex is a string and perform a case-insensitive search
regex_str = str(regex)
filtered_dashboards = [
d for d in all_dashboards if re.search(regex_str, d["dashboard_title"], re.IGNORECASE)
]
options = [("ALL", "Все дашборды")] + [
(str(d["id"]), d["dashboard_title"]) for d in filtered_dashboards
]
rc, selected = checklist(
title="Выбор дашбордов",
prompt="Отметьте нужные дашборды (введите номера):",
options=options,
)
if rc != 0:
return
if "ALL" in selected:
self.dashboards_to_migrate = filtered_dashboards
else:
self.dashboards_to_migrate = [
d for d in filtered_dashboards if str(d["id"]) in selected
]
self.logger.info(
"[select_dashboards][State] Выбрано %d дашбордов.",
len(self.dashboards_to_migrate),
)
except Exception as e:
self.logger.error("[select_dashboards][Failure] %s", e, exc_info=True)
msgbox("Ошибка", "Не удалось получить список дашбордов.")
self.logger.info("[select_dashboards][Exit] Шаг 2 завершён.")
# [/DEF:Migration.select_dashboards]
# [DEF:Migration.confirm_db_config_replacement:Function]
# @PURPOSE: Запрашивает у пользователя, требуется ли заменить имена БД в YAML-файлах.
# @POST: `self.db_config_replacement` либо `None`, либо заполнен.
# @RELATION: CALLS -> yesno
# @RELATION: CALLS -> self._select_databases
def confirm_db_config_replacement(self) -> None:
if yesno("Замена БД", "Заменить конфигурацию БД в YAMLфайлах?"):
old_db, new_db = self._select_databases()
if not old_db or not new_db:
self.logger.info("[confirm_db_config_replacement][State] Selection cancelled.")
return
print(f"old_db: {old_db}")
old_result = old_db.get("result", {})
new_result = new_db.get("result", {})
self.db_config_replacement = {
"old": {
"database_name": old_result.get("database_name"),
"uuid": old_result.get("uuid"),
"database_uuid": old_result.get("uuid"),
"id": str(old_db.get("id"))
},
"new": {
"database_name": new_result.get("database_name"),
"uuid": new_result.get("uuid"),
"database_uuid": new_result.get("uuid"),
"id": str(new_db.get("id"))
}
}
self.logger.info("[confirm_db_config_replacement][State] Replacement set: %s", self.db_config_replacement)
else:
self.logger.info("[confirm_db_config_replacement][State] Skipped.")
# [/DEF:Migration.confirm_db_config_replacement]
# [DEF:Migration._select_databases:Function]
# @PURPOSE: Позволяет пользователю выбрать исходную и целевую БД через API.
# @POST: Возвращает кортеж (старая БД, новая БД) или (None, None) при отмене.
# @RELATION: CALLS -> self.from_c.get_databases
# @RELATION: CALLS -> self.to_c.get_databases
# @RELATION: CALLS -> self.from_c.get_database
# @RELATION: CALLS -> self.to_c.get_database
# @RELATION: CALLS -> menu
def _select_databases(self) -> Tuple[Optional[Dict], Optional[Dict]]:
self.logger.info("[_select_databases][Entry] Selecting databases from both environments.")
if self.from_c is None or self.to_c is None:
self.logger.error("[_select_databases][Failure] Source or target client not initialized.")
msgbox("Ошибка", "Исходное или целевое окружение не выбрано.")
return None, None
# Получаем список БД из обоих окружений
try:
_, from_dbs = self.from_c.get_databases()
_, to_dbs = self.to_c.get_databases()
except Exception as e:
self.logger.error("[_select_databases][Failure] Failed to fetch databases: %s", e)
msgbox("Ошибка", "Не удалось получить список баз данных.")
return None, None
# Формируем список для выбора
# По Swagger документации, в ответе API поле называется "database_name"
from_choices = []
for db in from_dbs:
db_name = db.get("database_name", "Без имени")
from_choices.append((str(db["id"]), f"{db_name} (ID: {db['id']})"))
to_choices = []
for db in to_dbs:
db_name = db.get("database_name", "Без имени")
to_choices.append((str(db["id"]), f"{db_name} (ID: {db['id']})"))
# Показываем список БД для исходного окружения
rc, from_sel = menu(
title="Выбор исходной БД",
prompt="Выберите исходную БД:",
choices=[f"{name}" for id, name in from_choices]
)
if rc != 0:
return None, None
# Определяем выбранную БД
from_db_id = from_choices[[choice[1] for choice in from_choices].index(from_sel)][0]
# Получаем полную информацию о выбранной БД из исходного окружения
try:
from_db = self.from_c.get_database(int(from_db_id))
except Exception as e:
self.logger.error("[_select_databases][Failure] Failed to fetch database details: %s", e)
msgbox("Ошибка", "Не удалось получить информацию о выбранной базе данных.")
return None, None
# Показываем список БД для целевого окружения
rc, to_sel = menu(
title="Выбор целевой БД",
prompt="Выберите целевую БД:",
choices=[f"{name}" for id, name in to_choices]
)
if rc != 0:
return None, None
# Определяем выбранную БД
to_db_id = to_choices[[choice[1] for choice in to_choices].index(to_sel)][0]
# Получаем полную информацию о выбранной БД из целевого окружения
try:
to_db = self.to_c.get_database(int(to_db_id))
except Exception as e:
self.logger.error("[_select_databases][Failure] Failed to fetch database details: %s", e)
msgbox("Ошибка", "Не удалось получить информацию о выбранной базе данных.")
return None, None
self.logger.info("[_select_databases][Exit] Selected databases: %s -> %s", from_db.get("database_name", "Без имени"), to_db.get("database_name", "Без имени"))
return from_db, to_db
# [/DEF:Migration._select_databases]
# [DEF:Migration._batch_delete_by_ids:Function]
# @PURPOSE: Удаляет набор дашбордов по их ID единым запросом.
# @PRE: `ids` - непустой список целых чисел.
# @POST: Все указанные дашборды удалены (если они существовали).
# @RELATION: CALLS -> self.to_c.network.request
# @PARAM: ids (List[int]) - Список ID дашбордов для удаления.
def _batch_delete_by_ids(self, ids: List[int]) -> None:
if not ids:
self.logger.debug("[_batch_delete_by_ids][Skip] Empty ID list nothing to delete.")
return
if self.to_c is None:
self.logger.error("[_batch_delete_by_ids][Failure] Target client not initialized.")
msgbox("Ошибка", "Целевое окружение не выбрано.")
return
self.logger.info("[_batch_delete_by_ids][Entry] Deleting dashboards IDs: %s", ids)
q_param = json.dumps(ids)
response = self.to_c.network.request(method="DELETE", endpoint="/dashboard/", params={"q": q_param})
if isinstance(response, dict) and response.get("result", True) is False:
self.logger.warning("[_batch_delete_by_ids][Warning] Unexpected delete response: %s", response)
else:
self.logger.info("[_batch_delete_by_ids][Success] Delete request completed.")
# [/DEF:Migration._batch_delete_by_ids]
# [DEF:Migration.execute_migration:Function]
# @PURPOSE: Выполняет экспорт-импорт дашбордов, обрабатывает ошибки и, при необходимости, выполняет процедуру восстановления.
# @PRE: `self.dashboards_to_migrate` не пуст; `self.from_c` и `self.to_c` инициализированы.
# @POST: Успешные дашборды импортированы; неудачные - восстановлены или залогированы.
# @RELATION: CALLS -> self.from_c.export_dashboard
# @RELATION: CALLS -> create_temp_file
# @RELATION: CALLS -> update_yamls
# @RELATION: CALLS -> create_dashboard_export
# @RELATION: CALLS -> self.to_c.import_dashboard
# @RELATION: CALLS -> self._batch_delete_by_ids
def execute_migration(self) -> None:
if not self.dashboards_to_migrate:
self.logger.warning("[execute_migration][Skip] No dashboards to migrate.")
msgbox("Информация", "Нет дашбордов для миграции.")
return
if self.from_c is None or self.to_c is None:
self.logger.error("[execute_migration][Failure] Source or target client not initialized.")
msgbox("Ошибка", "Исходное или целевое окружение не выбрано.")
return
total = len(self.dashboards_to_migrate)
self.logger.info("[execute_migration][Entry] Starting migration of %d dashboards.", total)
self.to_c.delete_before_reimport = self.enable_delete_on_failure
with gauge("Миграция...", width=60, height=10) as g:
for i, dash in enumerate(self.dashboards_to_migrate):
dash_id, dash_slug, title = dash["id"], dash.get("slug"), dash["dashboard_title"]
g.set_text(f"Миграция: {title} ({i + 1}/{total})")
g.set_percent(int((i / total) * 100))
exported_content = None # Initialize exported_content
try:
exported_content, _ = self.from_c.export_dashboard(dash_id)
with create_temp_file(content=exported_content, dry_run=True, suffix=".zip", logger=self.logger) as tmp_zip_path, \
create_temp_file(suffix=".dir", logger=self.logger) as tmp_unpack_dir:
if not self.db_config_replacement:
self.to_c.import_dashboard(file_name=tmp_zip_path, dash_id=dash_id, dash_slug=dash_slug)
else:
with zipfile.ZipFile(tmp_zip_path, "r") as zip_ref:
zip_ref.extractall(tmp_unpack_dir)
if self.db_config_replacement:
update_yamls(db_configs=[self.db_config_replacement], path=str(tmp_unpack_dir))
with create_temp_file(suffix=".zip", dry_run=True, logger=self.logger) as tmp_new_zip:
create_dashboard_export(zip_path=tmp_new_zip, source_paths=[str(p) for p in Path(tmp_unpack_dir).glob("**/*")])
self.to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug)
self.logger.info("[execute_migration][Success] Dashboard %s imported.", title)
except Exception as exc:
self.logger.error("[execute_migration][Failure] %s", exc, exc_info=True)
self._failed_imports.append({"slug": dash_slug, "dash_id": dash_id, "zip_content": exported_content})
msgbox("Ошибка", f"Не удалось мигрировать дашборд {title}.\n\n{exc}")
g.set_percent(100)
if self.enable_delete_on_failure and self._failed_imports:
self.logger.info("[execute_migration][Recovery] %d dashboards failed. Starting recovery.", len(self._failed_imports))
_, target_dashboards = self.to_c.get_dashboards()
slug_to_id = {d["slug"]: d["id"] for d in target_dashboards if "slug" in d and "id" in d}
ids_to_delete = [slug_to_id[f["slug"]] for f in self._failed_imports if f["slug"] in slug_to_id]
self._batch_delete_by_ids(ids_to_delete)
for fail in self._failed_imports:
with create_temp_file(content=fail["zip_content"], suffix=".zip", logger=self.logger) as retry_zip:
self.to_c.import_dashboard(file_name=retry_zip, dash_id=fail["dash_id"], dash_slug=fail["slug"])
self.logger.info("[execute_migration][Recovered] Dashboard slug '%s' re-imported.", fail["slug"])
self.logger.info("[execute_migration][Exit] Migration finished.")
msgbox("Информация", "Миграция завершена!")
# [/DEF:Migration.execute_migration]
# [/DEF:Migration]
if __name__ == "__main__":
Migration().run()
# [/DEF:migration_script]

21
reproduce_issue.py Normal file
View File

@@ -0,0 +1,21 @@
import sys
import os
from pathlib import Path
# Add root to sys.path
sys.path.append(os.getcwd())
try:
from backend.src.core.plugin_loader import PluginLoader
except ImportError as e:
print(f"Failed to import PluginLoader: {e}")
sys.exit(1)
plugin_dir = Path("backend/src/plugins").absolute()
print(f"Plugin dir: {plugin_dir}")
loader = PluginLoader(str(plugin_dir))
configs = loader.get_all_plugin_configs()
print(f"Loaded plugins: {len(configs)}")
for config in configs:
print(f" - {config.id}")

0
requirements.txt Normal file → Executable file
View File

144
run_mapper.py Normal file → Executable file
View File

@@ -1,72 +1,72 @@
# [DEF:run_mapper:Module]
#
# @SEMANTICS: runner, configuration, cli, main
# @PURPOSE: Этот модуль является CLI-точкой входа для запуска процесса меппинга метаданных датасетов.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.utils.dataset_mapper
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: main
# [SECTION: IMPORTS]
import argparse
import keyring
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.dataset_mapper import DatasetMapper
# [/SECTION]
# [DEF:main:Function]
# @PURPOSE: Парсит аргументы командной строки и запускает процесс меппинга.
# @RELATION: CREATES_INSTANCE_OF -> DatasetMapper
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> DatasetMapper.run_mapping
def main():
parser = argparse.ArgumentParser(description="Map dataset verbose names in Superset.")
parser.add_argument('--source', type=str, required=True, choices=['postgres', 'excel', 'both'], help='The source for the mapping.')
parser.add_argument('--dataset-id', type=int, required=True, help='The ID of the dataset to update.')
parser.add_argument('--table-name', type=str, help='The table name for PostgreSQL source.')
parser.add_argument('--table-schema', type=str, help='The table schema for PostgreSQL source.')
parser.add_argument('--excel-path', type=str, help='The path to the Excel file.')
parser.add_argument('--env', type=str, default='dev', help='The Superset environment to use.')
args = parser.parse_args()
logger = SupersetLogger(name="dataset_mapper_main")
# [AI_NOTE]: Конфигурация БД должна быть вынесена во внешний файл или переменные окружения.
POSTGRES_CONFIG = {
'dbname': 'dwh',
'user': keyring.get_password("system", f"dwh gp user"),
'password': keyring.get_password("system", f"dwh gp password"),
'host': '10.66.229.201',
'port': '5432'
}
logger.info("[main][Enter] Starting dataset mapper CLI.")
try:
clients = setup_clients(logger)
superset_client = clients.get(args.env)
if not superset_client:
logger.error(f"[main][Failure] Superset client for '{args.env}' environment not found.")
return
mapper = DatasetMapper(logger)
mapper.run_mapping(
superset_client=superset_client,
dataset_id=args.dataset_id,
source=args.source,
postgres_config=POSTGRES_CONFIG if args.source in ['postgres', 'both'] else None,
excel_path=args.excel_path if args.source in ['excel', 'both'] else None,
table_name=args.table_name if args.source in ['postgres', 'both'] else None,
table_schema=args.table_schema if args.source in ['postgres', 'both'] else None
)
logger.info("[main][Exit] Dataset mapper process finished.")
except Exception as main_exc:
logger.error("[main][Failure] An unexpected error occurred: %s", main_exc, exc_info=True)
# [/DEF:main]
if __name__ == '__main__':
main()
# [/DEF:run_mapper]
# [DEF:run_mapper:Module]
#
# @SEMANTICS: runner, configuration, cli, main
# @PURPOSE: Этот модуль является CLI-точкой входа для запуска процесса меппинга метаданных датасетов.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.utils.dataset_mapper
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: main
# [SECTION: IMPORTS]
import argparse
import keyring
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.dataset_mapper import DatasetMapper
# [/SECTION]
# [DEF:main:Function]
# @PURPOSE: Парсит аргументы командной строки и запускает процесс меппинга.
# @RELATION: CREATES_INSTANCE_OF -> DatasetMapper
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> DatasetMapper.run_mapping
def main():
parser = argparse.ArgumentParser(description="Map dataset verbose names in Superset.")
parser.add_argument('--source', type=str, required=True, choices=['postgres', 'excel', 'both'], help='The source for the mapping.')
parser.add_argument('--dataset-id', type=int, required=True, help='The ID of the dataset to update.')
parser.add_argument('--table-name', type=str, help='The table name for PostgreSQL source.')
parser.add_argument('--table-schema', type=str, help='The table schema for PostgreSQL source.')
parser.add_argument('--excel-path', type=str, help='The path to the Excel file.')
parser.add_argument('--env', type=str, default='dev', help='The Superset environment to use.')
args = parser.parse_args()
logger = SupersetLogger(name="dataset_mapper_main")
# [AI_NOTE]: Конфигурация БД должна быть вынесена во внешний файл или переменные окружения.
POSTGRES_CONFIG = {
'dbname': 'dwh',
'user': keyring.get_password("system", f"dwh gp user"),
'password': keyring.get_password("system", f"dwh gp password"),
'host': '10.66.229.201',
'port': '5432'
}
logger.info("[main][Enter] Starting dataset mapper CLI.")
try:
clients = setup_clients(logger)
superset_client = clients.get(args.env)
if not superset_client:
logger.error(f"[main][Failure] Superset client for '{args.env}' environment not found.")
return
mapper = DatasetMapper(logger)
mapper.run_mapping(
superset_client=superset_client,
dataset_id=args.dataset_id,
source=args.source,
postgres_config=POSTGRES_CONFIG if args.source in ['postgres', 'both'] else None,
excel_path=args.excel_path if args.source in ['excel', 'both'] else None,
table_name=args.table_name if args.source in ['postgres', 'both'] else None,
table_schema=args.table_schema if args.source in ['postgres', 'both'] else None
)
logger.info("[main][Exit] Dataset mapper process finished.")
except Exception as main_exc:
logger.error("[main][Failure] An unexpected error occurred: %s", main_exc, exc_info=True)
# [/DEF:main]
if __name__ == '__main__':
main()
# [/DEF:run_mapper]

408
search_script.py Normal file → Executable file
View File

@@ -1,204 +1,204 @@
# [DEF:search_script:Module]
#
# @SEMANTICS: search, superset, dataset, regex, file_output
# @PURPOSE: Предоставляет утилиты для поиска по текстовым паттернам в метаданных датасетов Superset.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: search_datasets, save_results_to_file, print_search_results, main
# [SECTION: IMPORTS]
import logging
import re
import os
from typing import Dict, Optional
from requests.exceptions import RequestException
from superset_tool.client import SupersetClient
from superset_tool.exceptions import SupersetAPIError
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.init_clients import setup_clients
# [/SECTION]
# [DEF:search_datasets:Function]
# @PURPOSE: Выполняет поиск по строковому паттерну в метаданных всех датасетов.
# @PRE: `client` должен быть инициализированным экземпляром `SupersetClient`.
# @PRE: `search_pattern` должен быть валидной строкой регулярного выражения.
# @POST: Возвращает словарь с результатами поиска, где ключ - ID датасета, значение - список совпадений.
# @RELATION: CALLS -> client.get_datasets
# @THROW: re.error - Если паттерн регулярного выражения невалиден.
# @THROW: SupersetAPIError, RequestException - При критических ошибках API.
# @PARAM: client (SupersetClient) - Клиент для доступа к API Superset.
# @PARAM: search_pattern (str) - Регулярное выражение для поиска.
# @PARAM: logger (Optional[SupersetLogger]) - Инстанс логгера.
# @RETURN: Optional[Dict] - Словарь с результатами или None, если ничего не найдено.
def search_datasets(
client: SupersetClient,
search_pattern: str,
logger: Optional[SupersetLogger] = None
) -> Optional[Dict]:
logger = logger or SupersetLogger(name="dataset_search")
logger.info(f"[search_datasets][Enter] Searching for pattern: '{search_pattern}'")
try:
_, datasets = client.get_datasets(query={"columns": ["id", "table_name", "sql", "database", "columns"]})
if not datasets:
logger.warning("[search_datasets][State] No datasets found.")
return None
pattern = re.compile(search_pattern, re.IGNORECASE)
results = {}
for dataset in datasets:
dataset_id = dataset.get('id')
if not dataset_id:
continue
matches = []
for field, value in dataset.items():
value_str = str(value)
if pattern.search(value_str):
match_obj = pattern.search(value_str)
matches.append({
"field": field,
"match": match_obj.group() if match_obj else "",
"value": value_str
})
if matches:
results[dataset_id] = matches
logger.info(f"[search_datasets][Success] Found matches in {len(results)} datasets.")
return results
except re.error as e:
logger.error(f"[search_datasets][Failure] Invalid regex pattern: {e}", exc_info=True)
raise
except (SupersetAPIError, RequestException) as e:
logger.critical(f"[search_datasets][Failure] Critical error during search: {e}", exc_info=True)
raise
# [/DEF:search_datasets]
# [DEF:save_results_to_file:Function]
# @PURPOSE: Сохраняет результаты поиска в текстовый файл.
# @PRE: `results` является словарем, возвращенным `search_datasets`, или `None`.
# @PRE: `filename` должен быть допустимым путем к файлу.
# @POST: Записывает отформатированные результаты в указанный файл.
# @PARAM: results (Optional[Dict]) - Словарь с результатами поиска.
# @PARAM: filename (str) - Имя файла для сохранения результатов.
# @PARAM: logger (Optional[SupersetLogger]) - Инстанс логгера.
# @RETURN: bool - Успешно ли выполнено сохранение.
def save_results_to_file(results: Optional[Dict], filename: str, logger: Optional[SupersetLogger] = None) -> bool:
logger = logger or SupersetLogger(name="file_writer")
logger.info(f"[save_results_to_file][Enter] Saving results to file: {filename}")
try:
formatted_report = print_search_results(results)
with open(filename, 'w', encoding='utf-8') as f:
f.write(formatted_report)
logger.info(f"[save_results_to_file][Success] Results saved to {filename}")
return True
except Exception as e:
logger.error(f"[save_results_to_file][Failure] Failed to save results to file: {e}", exc_info=True)
return False
# [/DEF:save_results_to_file]
# [DEF:print_search_results:Function]
# @PURPOSE: Форматирует результаты поиска для читаемого вывода в консоль.
# @PRE: `results` является словарем, возвращенным `search_datasets`, или `None`.
# @POST: Возвращает отформатированную строку с результатами.
# @PARAM: results (Optional[Dict]) - Словарь с результатами поиска.
# @PARAM: context_lines (int) - Количество строк контекста для вывода до и после совпадения.
# @RETURN: str - Отформатированный отчет.
def print_search_results(results: Optional[Dict], context_lines: int = 3) -> str:
if not results:
return "Ничего не найдено"
output = []
for dataset_id, matches in results.items():
# Получаем информацию о базе данных для текущего датасета
database_info = ""
# Ищем поле database среди совпадений, чтобы вывести его
for match_info in matches:
if match_info['field'] == 'database':
database_info = match_info['value']
break
# Если database не найден в совпадениях, пробуем получить из других полей
if not database_info:
# Предполагаем, что база данных может быть в одном из полей, например sql или table_name
# Но для точности лучше использовать специальное поле, которое мы уже получили
pass # Пока не выводим, если не нашли явно
output.append(f"\n--- Dataset ID: {dataset_id} ---")
if database_info:
output.append(f" Database: {database_info}")
output.append("") # Пустая строка для читабельности
for match_info in matches:
field, match_text, full_value = match_info['field'], match_info['match'], match_info['value']
output.append(f" - Поле: {field}")
output.append(f" Совпадение: '{match_text}'")
lines = full_value.splitlines()
if not lines: continue
match_line_index = -1
for i, line in enumerate(lines):
if match_text in line:
match_line_index = i
break
if match_line_index != -1:
start = max(0, match_line_index - context_lines)
end = min(len(lines), match_line_index + context_lines + 1)
output.append(" Контекст:")
for i in range(start, end):
prefix = f"{i + 1:5d}: "
line_content = lines[i]
if i == match_line_index:
highlighted = line_content.replace(match_text, f">>>{match_text}<<<")
output.append(f" {prefix}{highlighted}")
else:
output.append(f" {prefix}{line_content}")
output.append("-" * 25)
return "\n".join(output)
# [/DEF:print_search_results]
# [DEF:main:Function]
# @PURPOSE: Основная точка входа для запуска скрипта поиска.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> search_datasets
# @RELATION: CALLS -> print_search_results
# @RELATION: CALLS -> save_results_to_file
def main():
logger = SupersetLogger(level=logging.INFO, console=True)
clients = setup_clients(logger)
target_client = clients['dev5']
search_query = r"from dm(_view)*.account_debt"
# Генерируем имя файла на основе времени
import datetime
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
output_filename = f"search_results_{timestamp}.txt"
results = search_datasets(
client=target_client,
search_pattern=search_query,
logger=logger
)
report = print_search_results(results)
logger.info(f"[main][Success] Search finished. Report:\n{report}")
# Сохраняем результаты в файл
success = save_results_to_file(results, output_filename, logger)
if success:
logger.info(f"[main][Success] Results also saved to file: {output_filename}")
else:
logger.error(f"[main][Failure] Failed to save results to file: {output_filename}")
# [/DEF:main]
if __name__ == "__main__":
main()
# [/DEF:search_script]
# [DEF:search_script:Module]
#
# @SEMANTICS: search, superset, dataset, regex, file_output
# @PURPOSE: Предоставляет утилиты для поиска по текстовым паттернам в метаданных датасетов Superset.
# @LAYER: App
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> superset_tool.utils
# @PUBLIC_API: search_datasets, save_results_to_file, print_search_results, main
# [SECTION: IMPORTS]
import logging
import re
import os
from typing import Dict, Optional
from requests.exceptions import RequestException
from superset_tool.client import SupersetClient
from superset_tool.exceptions import SupersetAPIError
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.init_clients import setup_clients
# [/SECTION]
# [DEF:search_datasets:Function]
# @PURPOSE: Выполняет поиск по строковому паттерну в метаданных всех датасетов.
# @PRE: `client` должен быть инициализированным экземпляром `SupersetClient`.
# @PRE: `search_pattern` должен быть валидной строкой регулярного выражения.
# @POST: Возвращает словарь с результатами поиска, где ключ - ID датасета, значение - список совпадений.
# @RELATION: CALLS -> client.get_datasets
# @THROW: re.error - Если паттерн регулярного выражения невалиден.
# @THROW: SupersetAPIError, RequestException - При критических ошибках API.
# @PARAM: client (SupersetClient) - Клиент для доступа к API Superset.
# @PARAM: search_pattern (str) - Регулярное выражение для поиска.
# @PARAM: logger (Optional[SupersetLogger]) - Инстанс логгера.
# @RETURN: Optional[Dict] - Словарь с результатами или None, если ничего не найдено.
def search_datasets(
client: SupersetClient,
search_pattern: str,
logger: Optional[SupersetLogger] = None
) -> Optional[Dict]:
logger = logger or SupersetLogger(name="dataset_search")
logger.info(f"[search_datasets][Enter] Searching for pattern: '{search_pattern}'")
try:
_, datasets = client.get_datasets(query={"columns": ["id", "table_name", "sql", "database", "columns"]})
if not datasets:
logger.warning("[search_datasets][State] No datasets found.")
return None
pattern = re.compile(search_pattern, re.IGNORECASE)
results = {}
for dataset in datasets:
dataset_id = dataset.get('id')
if not dataset_id:
continue
matches = []
for field, value in dataset.items():
value_str = str(value)
if pattern.search(value_str):
match_obj = pattern.search(value_str)
matches.append({
"field": field,
"match": match_obj.group() if match_obj else "",
"value": value_str
})
if matches:
results[dataset_id] = matches
logger.info(f"[search_datasets][Success] Found matches in {len(results)} datasets.")
return results
except re.error as e:
logger.error(f"[search_datasets][Failure] Invalid regex pattern: {e}", exc_info=True)
raise
except (SupersetAPIError, RequestException) as e:
logger.critical(f"[search_datasets][Failure] Critical error during search: {e}", exc_info=True)
raise
# [/DEF:search_datasets]
# [DEF:save_results_to_file:Function]
# @PURPOSE: Сохраняет результаты поиска в текстовый файл.
# @PRE: `results` является словарем, возвращенным `search_datasets`, или `None`.
# @PRE: `filename` должен быть допустимым путем к файлу.
# @POST: Записывает отформатированные результаты в указанный файл.
# @PARAM: results (Optional[Dict]) - Словарь с результатами поиска.
# @PARAM: filename (str) - Имя файла для сохранения результатов.
# @PARAM: logger (Optional[SupersetLogger]) - Инстанс логгера.
# @RETURN: bool - Успешно ли выполнено сохранение.
def save_results_to_file(results: Optional[Dict], filename: str, logger: Optional[SupersetLogger] = None) -> bool:
logger = logger or SupersetLogger(name="file_writer")
logger.info(f"[save_results_to_file][Enter] Saving results to file: {filename}")
try:
formatted_report = print_search_results(results)
with open(filename, 'w', encoding='utf-8') as f:
f.write(formatted_report)
logger.info(f"[save_results_to_file][Success] Results saved to {filename}")
return True
except Exception as e:
logger.error(f"[save_results_to_file][Failure] Failed to save results to file: {e}", exc_info=True)
return False
# [/DEF:save_results_to_file]
# [DEF:print_search_results:Function]
# @PURPOSE: Форматирует результаты поиска для читаемого вывода в консоль.
# @PRE: `results` является словарем, возвращенным `search_datasets`, или `None`.
# @POST: Возвращает отформатированную строку с результатами.
# @PARAM: results (Optional[Dict]) - Словарь с результатами поиска.
# @PARAM: context_lines (int) - Количество строк контекста для вывода до и после совпадения.
# @RETURN: str - Отформатированный отчет.
def print_search_results(results: Optional[Dict], context_lines: int = 3) -> str:
if not results:
return "Ничего не найдено"
output = []
for dataset_id, matches in results.items():
# Получаем информацию о базе данных для текущего датасета
database_info = ""
# Ищем поле database среди совпадений, чтобы вывести его
for match_info in matches:
if match_info['field'] == 'database':
database_info = match_info['value']
break
# Если database не найден в совпадениях, пробуем получить из других полей
if not database_info:
# Предполагаем, что база данных может быть в одном из полей, например sql или table_name
# Но для точности лучше использовать специальное поле, которое мы уже получили
pass # Пока не выводим, если не нашли явно
output.append(f"\n--- Dataset ID: {dataset_id} ---")
if database_info:
output.append(f" Database: {database_info}")
output.append("") # Пустая строка для читабельности
for match_info in matches:
field, match_text, full_value = match_info['field'], match_info['match'], match_info['value']
output.append(f" - Поле: {field}")
output.append(f" Совпадение: '{match_text}'")
lines = full_value.splitlines()
if not lines: continue
match_line_index = -1
for i, line in enumerate(lines):
if match_text in line:
match_line_index = i
break
if match_line_index != -1:
start = max(0, match_line_index - context_lines)
end = min(len(lines), match_line_index + context_lines + 1)
output.append(" Контекст:")
for i in range(start, end):
prefix = f"{i + 1:5d}: "
line_content = lines[i]
if i == match_line_index:
highlighted = line_content.replace(match_text, f">>>{match_text}<<<")
output.append(f" {prefix}{highlighted}")
else:
output.append(f" {prefix}{line_content}")
output.append("-" * 25)
return "\n".join(output)
# [/DEF:print_search_results]
# [DEF:main:Function]
# @PURPOSE: Основная точка входа для запуска скрипта поиска.
# @RELATION: CALLS -> setup_clients
# @RELATION: CALLS -> search_datasets
# @RELATION: CALLS -> print_search_results
# @RELATION: CALLS -> save_results_to_file
def main():
logger = SupersetLogger(level=logging.INFO, console=True)
clients = setup_clients(logger)
target_client = clients['dev5']
search_query = r"from dm(_view)*.account_debt"
# Генерируем имя файла на основе времени
import datetime
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
output_filename = f"search_results_{timestamp}.txt"
results = search_datasets(
client=target_client,
search_pattern=search_query,
logger=logger
)
report = print_search_results(results)
logger.info(f"[main][Success] Search finished. Report:\n{report}")
# Сохраняем результаты в файл
success = save_results_to_file(results, output_filename, logger)
if success:
logger.info(f"[main][Success] Results also saved to file: {output_filename}")
else:
logger.error(f"[main][Failure] Failed to save results to file: {output_filename}")
# [/DEF:main]
if __name__ == "__main__":
main()
# [/DEF:search_script]

298
semantic_protocol.md Normal file → Executable file
View File

@@ -1,124 +1,174 @@
# SYSTEM STANDARD: CODE GENERATION PROTOCOL
**OBJECTIVE:** Generate Python code that strictly adheres to the Semantic Coherence standards defined below. All output must be machine-readable, fractal-structured, and optimized for Sparse Attention navigation.
## I. CORE REQUIREMENTS
1. **Causal Validity:** Semantic definitions (Contracts) must ALWAYS precede implementation code.
2. **Immutability:** Once defined, architectural decisions in the Module Header are treated as immutable constraints.
3. **Format Compliance:** Output must strictly follow the `[DEF]` / `[/DEF]` anchor syntax.
---
## II. SYNTAX SPECIFICATION
Code must be wrapped in semantic anchors using square brackets to minimize token interference.
### 1. Entity Anchors (The "Container")
* **Start:** `# [DEF:identifier:Type]`
* **End:** `# [/DEF:identifier]` (MANDATORY for semantic accumulation)
* **Types:** `Module`, `Class`, `Function`, `DataClass`, `Enum`.
### 2. Metadata Tags (The "Content")
* **Syntax:** `# @KEY: Value`
* **Location:** Inside the `[DEF]` block, before any code.
### 3. Graph Relations (The "Map")
* **Syntax:** `# @RELATION: TYPE -> TARGET_ID`
* **Types:** `DEPENDS_ON`, `CALLS`, `INHERITS_FROM`, `IMPLEMENTS`, `WRITES_TO`, `READS_FROM`.
---
## III. FILE STRUCTURE STANDARD
### 1. Python Module Header
Every `.py` file starts with a Module definition.
```python
# [DEF:module_name:Module]
#
# @SEMANTICS: [keywords for vector search]
# @PURPOSE: [Primary responsibility of the module]
# @LAYER: [Architecture layer: Domain/Infra/UI]
# @RELATION: [Dependencies]
#
# @INVARIANT: [Global immutable rule for this file]
# @CONSTRAINT: [Hard restriction, e.g., "No SQL here"]
# @PUBLIC_API: [Exported symbols]
# [SECTION: IMPORTS]
...
# [/SECTION]
# ... IMPLEMENTATION ...
# [/DEF:module_name]
```
### 2. Svelte Component Header
Every `.svelte` file starts with a Component definition inside an HTML comment.
```html
<!--
[DEF:ComponentName:Component]
@SEMANTICS: [keywords]
@PURPOSE: [Primary responsibility]
@LAYER: [UI/State/Layout]
@RELATION: [Child components, Stores, API]
@PROPS:
- name: type - description
@EVENTS:
- name: payload_type - description
@INVARIANT: [Immutable UI rule]
-->
<script>
// ...
</script>
<!-- [/DEF:ComponentName] -->
```
---
## IV. FUNCTION & CLASS CONTRACTS (DbC)
Contracts are the **Source of Truth**.
**Required Template:**
```python
# [DEF:func_name:Function]
# @PURPOSE: [Description]
# @SPEC_LINK: [Requirement ID]
#
# @PRE: [Condition required before execution]
# @POST: [Condition guaranteed after execution]
# @PARAM: [name] ([type]) - [desc]
# @RETURN: [type] - [desc]
# @THROW: [Exception] - [Reason]
#
# @RELATION: [Graph connections]
def func_name(...):
# 1. Runtime check of @PRE
# 2. Logic implementation
# 3. Runtime check of @POST
pass
# [/DEF:func_name]
```
---
## V. LOGGING STANDARD (BELIEF STATE)
Logs define the agent's internal state for debugging and coherence checks.
**Format:** `logger.level(f"[{ANCHOR_ID}][{STATE}] {MESSAGE} context={...}")`
**States:** `Entry`, `Validation`, `Action`, `Coherence:OK`, `Coherence:Failed`, `Exit`.
---
## VI. GENERATION WORKFLOW
1. **Analyze Request:** Identify target module and graph position.
2. **Define Structure:** Generate `[DEF]` anchors and Contracts FIRST.
3. **Implement Logic:** Write code satisfying Contracts.
4. **Validate:** If logic conflicts with Contract -> Stop -> Report Error.
The revised standard below is adapted for a polyglot environment (Python backend + Svelte frontend) and drops the requirement for explicit assertion generation.
It standardizes the "Semantic Bridge" between the two languages with unified Anchor logic while respecting each language's native documentation style (comments for Python, JSDoc for JavaScript/Svelte).
***
# SYSTEM STANDARD: POLYGLOT CODE GENERATION PROTOCOL (GRACE-Poly)
**OBJECTIVE:** Generate Python and Svelte/TypeScript code that strictly adheres to Semantic Coherence standards. Output must be machine-readable, fractal-structured, and optimized for Sparse Attention navigation.
## I. CORE REQUIREMENTS
1. **Causal Validity:** Semantic definitions (Contracts) must ALWAYS precede implementation code.
2. **Immutability:** Architectural decisions defined in the Module/Component Header are treated as immutable constraints.
3. **Format Compliance:** Output must strictly follow the `[DEF]` / `[/DEF]` anchor syntax for structure.
4. **Logic over Assertion:** Contracts define the *logic flow*. Do not generate explicit `assert` statements unless requested. The code logic itself must inherently satisfy the Pre/Post conditions (e.g., via control flow, guards, or types).
---
## II. SYNTAX SPECIFICATION
Code structure is defined by **Anchors** (square brackets). Metadata is defined by **Tags** (native comment style).
### 1. Entity Anchors (The "Container")
Used to define the boundaries of Modules, Classes, Components, and Functions.
* **Python:**
* Start: `# [DEF:identifier:Type]`
* End: `# [/DEF:identifier]`
* **Svelte (Top-level):**
* Start: `<!-- [DEF:ComponentName:Component] -->`
* End: `<!-- [/DEF:ComponentName] -->`
* **Svelte (Script/JS/TS):**
* Start: `// [DEF:funcName:Function]`
* End: `// [/DEF:funcName]`
**Types:** `Module`, `Component`, `Class`, `Function`, `Store`, `Action`.
### 2. Graph Relations (The "Map")
Defines high-level dependencies.
* **Python Syntax:** `# @RELATION: TYPE -> TARGET_ID`
* **Svelte/JS Syntax:** `// @RELATION: TYPE -> TARGET_ID`
* **Types:** `DEPENDS_ON`, `CALLS`, `INHERITS_FROM`, `IMPLEMENTS`, `BINDS_TO`, `DISPATCHES`.
---
## III. FILE STRUCTURE STANDARD
### 1. Python Module Header (`.py`)
```python
# [DEF:module_name:Module]
#
# @SEMANTICS: [keywords for vector search]
# @PURPOSE: [Primary responsibility of the module]
# @LAYER: [Domain/Infra/API]
# @RELATION: [Dependencies]
#
# @INVARIANT: [Global immutable rule]
# @CONSTRAINT: [Hard restriction, e.g., "No ORM calls here"]
# [SECTION: IMPORTS]
...
# [/SECTION]
# ... IMPLEMENTATION ...
# [/DEF:module_name]
```
### 2. Svelte Component Header (`.svelte`)
```html
<!-- [DEF:ComponentName:Component] -->
<!--
@SEMANTICS: [keywords]
@PURPOSE: [Primary UI responsibility]
@LAYER: [Feature/Atom/Layout]
@RELATION: [Child components, Stores]
@INVARIANT: [UI rules, e.g., "Always responsive"]
-->
<script lang="ts">
// [SECTION: IMPORTS]
// ...
// [/SECTION]
// ... LOGIC IMPLEMENTATION ...
</script>
<!-- [SECTION: TEMPLATE] -->
...
<!-- [/SECTION] -->
<style>
/* ... */
</style>
<!-- [/DEF:ComponentName] -->
```
---
## IV. CONTRACTS (Design by Contract)
Contracts define *what* the code does before *how* it does it.
### 1. Python Contract Style
Uses comment blocks inside the anchor.
```python
# [DEF:calculate_total:Function]
# @PURPOSE: Calculates cart total including tax.
# @PRE: items list is not empty.
# @POST: returns non-negative Decimal.
# @PARAM: items (List[Item]) - Cart items.
# @RETURN: Decimal - Final total.
def calculate_total(items: List[Item]) -> Decimal:
# Logic implementation that respects @PRE
if not items:
return Decimal(0)
# ... calculation ...
# Logic ensuring @POST
return total
# [/DEF:calculate_total]
```
### 2. Svelte/JS Contract Style (JSDoc)
Uses JSDoc blocks inside the anchor. Standard JSDoc tags are used where possible; custom GRACE tags are added for strictness.
```javascript
// [DEF:updateUserProfile:Function]
/**
* @purpose Updates user data in the store and backend.
* @pre User must be authenticated (session token exists).
* @post UserStore is updated with new data.
* @param {Object} profileData - The new profile fields.
* @returns {Promise<void>}
* @throws {AuthError} If session is invalid.
*/
// @RELATION: CALLS -> api.user.update
async function updateUserProfile(profileData) {
// Logic implementation
if (!session.token) throw new AuthError();
// ...
}
// [/DEF:updateUserProfile]
```
---
## V. LOGGING STANDARD (BELIEF STATE)
Logs delineate the agent's internal state.
* **Python:** `logger.info(f"[{ANCHOR_ID}][{STATE}] Msg")`
* **Svelte/JS:** `console.log(\`[${ANCHOR_ID}][${STATE}] Msg\`)`
**Required States:**
1. `Entry` (Start of block)
2. `Action` (Key business logic)
3. `Coherence:OK` (Logic successfully completed)
4. `Coherence:Failed` (Error handling)
5. `Exit` (End of block)
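For illustration, a minimal Python sketch of this logging pattern (the anchor id `load_config`, the logger name, and the error types are illustrative, not part of the standard):
```python
# [DEF:load_config:Function]
# @PURPOSE: Demonstrates the required belief-state log sequence around one action.
import json
import logging

logger = logging.getLogger("grace_demo")

def load_config(path: str) -> dict:
    logger.info("[load_config][Entry] path=%s", path)
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)  # Action: the key business logic of this block
        logger.info("[load_config][Coherence:OK] keys=%d", len(data))
        return data
    except (OSError, json.JSONDecodeError) as exc:
        logger.error("[load_config][Coherence:Failed] %s", exc)
        raise
    finally:
        logger.info("[load_config][Exit]")
# [/DEF:load_config]
```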
---
## VI. GENERATION WORKFLOW
1. **Context Analysis:** Identify language (Python vs Svelte) and Architecture Layer.
2. **Scaffolding:** Generate the `[DEF]` Anchors and Header/Contract **before** writing any logic.
3. **Implementation:** Write the code. Ensure the code logic handles the `@PRE` conditions (e.g., via `if/return` or guards) and satisfies `@POST` conditions naturally. **Do not write explicit `assert` statements unless debugging mode is requested.**
4. **Closure:** Ensure every `[DEF]` is closed with `[/DEF]` to accumulate semantic context.

View File

@@ -1,34 +1,34 @@
# Specification Quality Checklist: Plugin Architecture & Svelte Web UI
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2025-12-19
**Feature**: [Link to spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
# Specification Quality Checklist: Plugin Architecture & Svelte Web UI
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2025-12-19
**Feature**: [Link to spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Clarification resolved: Deployment context is hosted multi-user service with ADFS login.

264
specs/001-plugin-arch-svelte-ui/contracts/api.yaml Normal file → Executable file
View File

@@ -1,132 +1,132 @@
openapi: 3.0.0
info:
title: Superset Tools API
version: 1.0.0
description: API for managing Superset automation tools and plugins.
paths:
/plugins:
get:
summary: List available plugins
operationId: list_plugins
responses:
'200':
description: List of plugins
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Plugin'
/tasks:
post:
summary: Start a new task
operationId: create_task
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- plugin_id
- params
properties:
plugin_id:
type: string
params:
type: object
responses:
'201':
description: Task created
content:
application/json:
schema:
$ref: '#/components/schemas/Task'
get:
summary: List recent tasks
operationId: list_tasks
responses:
'200':
description: List of tasks
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Task'
/tasks/{task_id}:
get:
summary: Get task details
operationId: get_task
parameters:
- name: task_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'200':
description: Task details
content:
application/json:
schema:
$ref: '#/components/schemas/Task'
/tasks/{task_id}/logs:
get:
summary: Stream task logs (WebSocket upgrade)
operationId: stream_logs
parameters:
- name: task_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'101':
description: Switching Protocols to WebSocket
components:
schemas:
Plugin:
type: object
properties:
id:
type: string
name:
type: string
description:
type: string
version:
type: string
schema:
type: object
description: JSON Schema for input parameters
enabled:
type: boolean
Task:
type: object
properties:
id:
type: string
format: uuid
plugin_id:
type: string
status:
type: string
enum: [PENDING, RUNNING, SUCCESS, FAILED]
started_at:
type: string
format: date-time
finished_at:
type: string
format: date-time
user_id:
type: string
openapi: 3.0.0
info:
title: Superset Tools API
version: 1.0.0
description: API for managing Superset automation tools and plugins.
paths:
/plugins:
get:
summary: List available plugins
operationId: list_plugins
responses:
'200':
description: List of plugins
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Plugin'
/tasks:
post:
summary: Start a new task
operationId: create_task
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- plugin_id
- params
properties:
plugin_id:
type: string
params:
type: object
responses:
'201':
description: Task created
content:
application/json:
schema:
$ref: '#/components/schemas/Task'
get:
summary: List recent tasks
operationId: list_tasks
responses:
'200':
description: List of tasks
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Task'
/tasks/{task_id}:
get:
summary: Get task details
operationId: get_task
parameters:
- name: task_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'200':
description: Task details
content:
application/json:
schema:
$ref: '#/components/schemas/Task'
/tasks/{task_id}/logs:
get:
summary: Stream task logs (WebSocket upgrade)
operationId: stream_logs
parameters:
- name: task_id
in: path
required: true
schema:
type: string
format: uuid
responses:
'101':
description: Switching Protocols to WebSocket
components:
schemas:
Plugin:
type: object
properties:
id:
type: string
name:
type: string
description:
type: string
version:
type: string
schema:
type: object
description: JSON Schema for input parameters
enabled:
type: boolean
Task:
type: object
properties:
id:
type: string
format: uuid
plugin_id:
type: string
status:
type: string
enum: [PENDING, RUNNING, SUCCESS, FAILED]
started_at:
type: string
format: date-time
finished_at:
type: string
format: date-time
user_id:
type: string
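For orientation only, a small client sketch against the `/tasks` contract above, using `requests`; the base URL, port, and any auth headers are assumptions and depend on the actual deployment:
```python
import requests

BASE_URL = "http://localhost:8000"  # assumed dev address; adjust to the real deployment

def create_task(plugin_id: str, params: dict) -> dict:
    # POST /tasks with the required body: {"plugin_id": ..., "params": {...}}
    resp = requests.post(f"{BASE_URL}/tasks", json={"plugin_id": plugin_id, "params": params})
    resp.raise_for_status()
    return resp.json()  # Task: id, plugin_id, status, started_at, ...

def get_task(task_id: str) -> dict:
    # GET /tasks/{task_id} returns the current status and timestamps
    resp = requests.get(f"{BASE_URL}/tasks/{task_id}")
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    task = create_task("backup-tool", {"env": "dev"})
    print(task["id"], task["status"])
```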

100
specs/001-plugin-arch-svelte-ui/data-model.md Normal file → Executable file
View File

@@ -1,51 +1,51 @@
# Data Model: Plugin Architecture & Svelte Web UI
## Entities
### Plugin
Represents a loadable extension module.
| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | Unique identifier (e.g., "backup-tool") |
| `name` | `str` | Display name (e.g., "Backup Dashboard") |
| `description` | `str` | Short description of functionality |
| `version` | `str` | Plugin version string |
| `schema` | `dict` | JSON Schema for input parameters (generated from Pydantic) |
| `enabled` | `bool` | Whether the plugin is active |
### Task
Represents an execution instance of a plugin.
| Field | Type | Description |
|-------|------|-------------|
| `id` | `UUID` | Unique execution ID |
| `plugin_id` | `str` | ID of the plugin being executed |
| `status` | `Enum` | `PENDING`, `RUNNING`, `SUCCESS`, `FAILED` |
| `started_at` | `DateTime` | Timestamp when task started |
| `finished_at` | `DateTime` | Timestamp when task completed (nullable) |
| `user_id` | `str` | ID of the user who triggered the task |
| `logs` | `List[LogEntry]` | Structured logs from the execution |
### LogEntry
Represents a single log line from a task.
| Field | Type | Description |
|-------|------|-------------|
| `timestamp` | `DateTime` | Time of log event |
| `level` | `Enum` | `INFO`, `WARNING`, `ERROR`, `DEBUG` |
| `message` | `str` | Log content |
| `context` | `dict` | Additional metadata (optional) |
## State Transitions
### Task Lifecycle
1. **Created**: Task initialized with input parameters. Status: `PENDING`.
2. **Started**: Worker picks up task. Status: `RUNNING`.
3. **Completed**: Execution finishes without exception. Status: `SUCCESS`.
4. **Failed**: Execution raises unhandled exception. Status: `FAILED`.
## Validation Rules
- **Plugin ID**: Must be alphanumeric, lowercase, hyphens allowed.
# Data Model: Plugin Architecture & Svelte Web UI
## Entities
### Plugin
Represents a loadable extension module.
| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | Unique identifier (e.g., "backup-tool") |
| `name` | `str` | Display name (e.g., "Backup Dashboard") |
| `description` | `str` | Short description of functionality |
| `version` | `str` | Plugin version string |
| `schema` | `dict` | JSON Schema for input parameters (generated from Pydantic) |
| `enabled` | `bool` | Whether the plugin is active |
### Task
Represents an execution instance of a plugin.
| Field | Type | Description |
|-------|------|-------------|
| `id` | `UUID` | Unique execution ID |
| `plugin_id` | `str` | ID of the plugin being executed |
| `status` | `Enum` | `PENDING`, `RUNNING`, `SUCCESS`, `FAILED` |
| `started_at` | `DateTime` | Timestamp when task started |
| `finished_at` | `DateTime` | Timestamp when task completed (nullable) |
| `user_id` | `str` | ID of the user who triggered the task |
| `logs` | `List[LogEntry]` | Structured logs from the execution |
### LogEntry
Represents a single log line from a task.
| Field | Type | Description |
|-------|------|-------------|
| `timestamp` | `DateTime` | Time of log event |
| `level` | `Enum` | `INFO`, `WARNING`, `ERROR`, `DEBUG` |
| `message` | `str` | Log content |
| `context` | `dict` | Additional metadata (optional) |
## State Transitions
### Task Lifecycle
1. **Created**: Task initialized with input parameters. Status: `PENDING`.
2. **Started**: Worker picks up task. Status: `RUNNING`.
3. **Completed**: Execution finishes without exception. Status: `SUCCESS`.
4. **Failed**: Execution raises unhandled exception. Status: `FAILED`.
## Validation Rules
- **Plugin ID**: Must be alphanumeric, lowercase, hyphens allowed.
- **Input Parameters**: Must validate against the plugin's `schema`.
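A possible Pydantic sketch of the `Task` and `LogEntry` entities and the lifecycle statuses (a minimal illustration; the module layout is assumed, and `pattern=` requires Pydantic v2, use `regex=` on v1):
```python
from datetime import datetime
from enum import Enum
from typing import List, Optional
from uuid import UUID

from pydantic import BaseModel, Field

class TaskStatus(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"

class LogEntry(BaseModel):
    timestamp: datetime
    level: str  # INFO / WARNING / ERROR / DEBUG
    message: str
    context: dict = Field(default_factory=dict)  # optional metadata

class Task(BaseModel):
    id: UUID
    plugin_id: str = Field(pattern=r"^[a-z0-9-]+$")  # validation rule: lowercase alphanumeric plus hyphens
    status: TaskStatus = TaskStatus.PENDING
    started_at: Optional[datetime] = None
    finished_at: Optional[datetime] = None
    user_id: str
    logs: List[LogEntry] = Field(default_factory=list)
```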

0
specs/001-plugin-arch-svelte-ui/plan.md Normal file → Executable file
View File

92
specs/001-plugin-arch-svelte-ui/quickstart.md Normal file → Executable file
View File

@@ -1,47 +1,47 @@
# Quickstart: Plugin Architecture & Svelte Web UI
## Prerequisites
- Python 3.9+
- Node.js 18+
- npm or pnpm
## Setup
1. **Install Backend Dependencies**:
```bash
cd backend
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
```
2. **Install Frontend Dependencies**:
```bash
cd frontend
npm install
```
## Running the Application
1. **Start Backend Server**:
```bash
# From backend/ directory
uvicorn src.app:app --reload --port 8000
```
2. **Start Frontend Dev Server**:
```bash
# From frontend/ directory
npm run dev
```
3. **Access the UI**:
Open `http://localhost:5173` in your browser.
## Adding a Plugin
1. Create a new Python file in `backend/src/plugins/` (e.g., `my_plugin.py`).
2. Define your plugin class inheriting from `PluginBase`.
3. Implement `execute` and `get_schema` methods.
4. Restart the backend (or rely on auto-reload).
# Quickstart: Plugin Architecture & Svelte Web UI
## Prerequisites
- Python 3.9+
- Node.js 18+
- npm or pnpm
## Setup
1. **Install Backend Dependencies**:
```bash
cd backend
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
```
2. **Install Frontend Dependencies**:
```bash
cd frontend
npm install
```
## Running the Application
1. **Start Backend Server**:
```bash
# From backend/ directory
uvicorn src.app:app --reload --port 8000
```
2. **Start Frontend Dev Server**:
```bash
# From frontend/ directory
npm run dev
```
3. **Access the UI**:
Open `http://localhost:5173` in your browser.
## Adding a Plugin
1. Create a new Python file in `backend/src/plugins/` (e.g., `my_plugin.py`).
2. Define your plugin class inheriting from `PluginBase`.
3. Implement `execute` and `get_schema` methods (a minimal example is sketched below).
4. Restart the backend (or rely on auto-reload).
5. Your plugin should appear in the Web UI.
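A minimal plugin might look like the following. The `PluginBase` import path, the class attributes, and the method signatures are assumptions inferred from the steps above, not the project's actual interface.

```python
# backend/src/plugins/hello_world.py: a minimal example plugin (hypothetical names).
from pydantic import BaseModel

from src.core.plugin_base import PluginBase  # assumed location of the base class


class HelloParams(BaseModel):
    name: str = "world"


class HelloWorldPlugin(PluginBase):
    id = "hello-world"
    name = "Hello World"
    description = "Prints a greeting"
    version = "0.1.0"

    def get_schema(self) -> dict:
        # JSON Schema the UI uses to render the input form
        return HelloParams.model_json_schema()

    def execute(self, params: dict) -> str:
        args = HelloParams(**params)
        return f"Hello, {args.name}!"
```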

90
specs/001-plugin-arch-svelte-ui/research.md Normal file → Executable file
View File

@@ -1,46 +1,46 @@
# Research: Plugin Architecture & Svelte Web UI
## Decisions
### 1. Web Framework: FastAPI
- **Decision**: Use FastAPI for the Python backend.
- **Rationale**:
- Native support for Pydantic models (crucial for plugin schema validation).
- Async support (essential for handling long-running tasks and log streaming via WebSockets/SSE).
- Automatic OpenAPI documentation generation (simplifies frontend integration).
- High performance and modern ecosystem.
- **Alternatives Considered**:
- **Flask**: Mature but requires extensions for validation (Marshmallow) and async support is less native. Slower for high-concurrency API calls.
- **Django**: Too heavy for this use case; brings unnecessary ORM and template engine overhead.
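For illustration, the async/log-streaming rationale points at an endpoint shaped roughly like the sketch below. The route path, the queue registry, and the way a `TaskManager` would feed it are assumptions; disconnect handling is omitted.

```python
# Shape sketch of WebSocket log streaming in FastAPI; not the project's actual endpoint.
import asyncio

from fastapi import FastAPI, WebSocket

app = FastAPI()

# In the real service a TaskManager would own these queues; a bare dict stands in here.
log_queues: dict[str, asyncio.Queue] = {}


@app.websocket("/ws/tasks/{task_id}/logs")
async def stream_task_logs(websocket: WebSocket, task_id: str) -> None:
    await websocket.accept()
    queue = log_queues.setdefault(task_id, asyncio.Queue())
    while True:  # disconnect handling omitted for brevity
        entry = await queue.get()         # produced by the running task
        await websocket.send_json(entry)  # pushed to the browser as it arrives
```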
### 2. Plugin System: `importlib` + Abstract Base Classes (ABC)
- **Decision**: Use Python's built-in `importlib` for dynamic loading and `abc` for defining the plugin interface.
- **Rationale**:
- `importlib` provides a standard, secure way to load modules from a path.
- ABCs ensure plugins implement required methods (`execute`, `get_schema`) at load time.
- Lightweight, no external dependencies required.
- **Alternatives Considered**:
- **Pluggy**: Used by pytest, powerful but adds complexity and dependency overhead.
- **Stevedore**: OpenStack's plugin loader, too complex for this scope.
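A minimal sketch of the loader this decision implies, assuming a `PluginBase` ABC living in `src/core/plugin_base.py`; error reporting is reduced to a `print` where the real code would log and continue.

```python
# Minimal discovery sketch; PluginBase location and error handling are placeholders.
import importlib.util
import inspect
from pathlib import Path

from src.core.plugin_base import PluginBase  # assumed location of the ABC


def load_plugins(plugins_dir: Path) -> list:
    plugins = []
    for py_file in sorted(plugins_dir.glob("*.py")):
        spec = importlib.util.spec_from_file_location(py_file.stem, py_file)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception as exc:  # a broken plugin must not take the whole app down
            print(f"Skipping {py_file.name}: {exc}")
            continue
        for _, obj in inspect.getmembers(module, inspect.isclass):
            if issubclass(obj, PluginBase) and obj is not PluginBase:
                plugins.append(obj())  # the ABC guarantees execute()/get_schema() exist
    return plugins
```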
### 3. Authentication: `authlib` + ADFS (OIDC/SAML)
- **Decision**: Use `authlib` to handle ADFS authentication via OpenID Connect (OIDC) or SAML.
- **Rationale**:
- `authlib` is the modern standard for OAuth/OIDC in Python.
- Supports integration with FastAPI via middleware.
- ADFS is the required identity provider (IdP).
- **Alternatives Considered**:
- **python-social-auth**: Older, harder to integrate with FastAPI.
- **Manual JWT implementation**: Risky and reinvents the wheel; ADFS handles the token issuance.
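Wiring `authlib` to ADFS via OIDC discovery would look roughly like the following. The metadata URL, client id, and client secret are placeholders, and a Starlette `SessionMiddleware` is required for the OAuth state (not shown).

```python
# Every URL and credential below is a placeholder, not a value from this repository.
from authlib.integrations.starlette_client import OAuth

oauth = OAuth()
oauth.register(
    name="adfs",
    server_metadata_url="https://adfs.example.com/adfs/.well-known/openid-configuration",
    client_id="YOUR-ADFS-CLIENT-ID",
    client_secret="YOUR-ADFS-CLIENT-SECRET",
    client_kwargs={"scope": "openid profile email"},
)

# Inside a FastAPI route handler:
#   return await oauth.adfs.authorize_redirect(request, redirect_uri)
# and in the callback route:
#   token = await oauth.adfs.authorize_access_token(request)
```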
### 4. Frontend: Svelte + Vite
- **Decision**: Use Svelte for the UI framework and Vite as the build tool.
- **Rationale**:
- Svelte's compiler-based approach results in small bundles and high performance.
- Reactive model maps well to real-time log updates.
- Vite provides a fast development experience and easy integration with backend proxies.
## Unknowns Resolved
- **Deployment Context**: Hosted multi-user service with ADFS.
- **Plugin Interface**: Will use Pydantic models to define input schemas, allowing the frontend to generate forms dynamically.

142
specs/001-plugin-arch-svelte-ui/spec.md Normal file → Executable file
View File

@@ -1,72 +1,72 @@
# Feature Specification: Plugin Architecture & Svelte Web UI
**Feature Branch**: `001-plugin-arch-svelte-ui`
**Created**: 2025-12-19
**Status**: Draft
**Input**: User description: "I want to move the project to a plugin architecture and add a Svelte-based web UI"
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Web Interface for Superset Tools (Priority: P1)
As a user, I want to interact with the Superset tools (Backup, Migration, Search) through a graphical web interface so that I don't have to memorize CLI commands and arguments.
**Why this priority**: Drastically improves usability and accessibility of the tools for non-technical users and for quick operations.
**Independent Test**: Can be tested by launching the web server and successfully running a "Backup" task from the browser without touching the command line.
**Acceptance Scenarios**:
1. **Given** the web server is running, **When** I navigate to the home page, **Then** I see a dashboard with available tools (Backup, Migration, etc.).
2. **Given** I am on the Backup tool page, **When** I click "Run Backup", **Then** I see the progress logs in real-time and a success message upon completion.
3. **Given** I am on the Search tool page, **When** I enter a search term and submit, **Then** I see a list of matching datasets/dashboards displayed in a table.
---
### User Story 2 - Dynamic Plugin System (Priority: P2)
As a developer, I want to add new functionality (e.g., a new migration type or report generator) by simply dropping a file into a `plugins` directory, so that I can extend the tool without modifying the core codebase.
**Why this priority**: Enables scalable development and separation of concerns; allows custom extensions without merge conflicts in core files.
**Independent Test**: Create a simple "Hello World" plugin file, place it in the plugins folder, and verify it appears in the list of available tasks in the CLI/Web UI.
**Acceptance Scenarios**:
1. **Given** a valid plugin file in the `plugins/` directory, **When** the application starts, **Then** the plugin is automatically registered and listed as an available capability.
2. **Given** a plugin with specific configuration requirements, **When** I select it in the UI, **Then** the UI dynamically generates a form for those parameters.
3. **Given** an invalid or broken plugin file, **When** the application starts, **Then** the system logs an error but continues to function for other plugins.
---
## Requirements *(mandatory)*
### Functional Requirements
*All functional requirements are covered by the Acceptance Scenarios in the User Stories section.*
- **FR-001**: System MUST provide a Python-based web server (backend) to expose existing tool functionality via API.
- **FR-002**: System MUST provide a Single Page Application (SPA) frontend built with Svelte.
- **FR-003**: System MUST implement a plugin loader that scans a designated directory for Python modules matching a specific interface.
- **FR-004**: The Web UI MUST communicate with the backend via REST or WebSocket API.
- **FR-005**: The Web UI MUST display real-time logs/output from running tasks (streaming response).
- **FR-006**: System MUST support multi-user hosted deployment with authentication via ADFS (Active Directory Federation Services).
- **FR-007**: The Plugin interface MUST allow defining input parameters (schema) so the UI can auto-generate forms.
### System Invariants (Constitution Check)
- **INV-001**: Core logic (backup/migrate functions) must remain decoupled from the UI layer (can still be imported/used by CLI).
- **INV-002**: Plugins must not block the main application thread (long-running tasks must be async or threaded).
### Key Entities
- **Plugin**: Represents an extension module with metadata (name, version), input schema, and an execution entry point.
- **Task**: A specific execution instance of a Plugin or Core tool, having a status (Running, Success, Failed) and logs.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: A new plugin can be added and recognized by the system without restarting (or with a simple restart) and without code changes to core files.
- **SC-002**: Users can successfully trigger a Backup and Migration via the Web UI with 100% functional parity to the CLI.
- **SC-003**: The Web UI loads and becomes interactive in under 1 second on local networks.
- **SC-004**: Real-time logs in the UI appear with less than 200ms latency from the backend execution.

134
specs/001-plugin-arch-svelte-ui/tasks.md Normal file → Executable file
View File

@@ -1,68 +1,68 @@
# Tasks: Plugin Architecture & Svelte Web UI
**Feature**: `001-plugin-arch-svelte-ui`
**Status**: Planned
## Dependencies
1. **Phase 1 (Setup)**: Must be completed first to establish the environment.
2. **Phase 2 (Foundational)**: Implements the core Plugin system and Backend infrastructure required by all User Stories.
3. **Phase 3 (US1)**: Web Interface depends on the Backend API and Plugin system.
4. **Phase 4 (US2)**: Dynamic Plugin System extends the core infrastructure.
## Parallel Execution Opportunities
- **US1 (Frontend)**: Frontend components (T013-T016) can be developed in parallel with Backend API endpoints (T011-T012) once the API contract is finalized.
- **US2 (Plugins)**: Plugin development (T019-T020) can proceed independently once the Plugin Interface (T005) is stable.
---
## Phase 1: Setup
**Goal**: Initialize the project structure and development environment for Backend (Python/FastAPI) and Frontend (Svelte/Vite).
- [x] T001 Create backend directory structure (src/api, src/core, src/plugins) in `backend/`
- [x] T002 Create frontend directory structure using Vite (Svelte template) in `frontend/`
- [x] T003 Configure Python environment (requirements.txt with FastAPI, Uvicorn, Pydantic) in `backend/requirements.txt`
- [x] T004 Configure Frontend environment (package.json with TailwindCSS) in `frontend/package.json`
## Phase 2: Foundational (Core Infrastructure)
**Goal**: Implement the core Plugin interface, Task management system, and basic Backend server.
- [x] T005 [P] Define `PluginBase` abstract class and Pydantic models in `backend/src/core/plugin_base.py`
- [x] T006 [P] Implement `PluginLoader` to scan and load plugins from directory in `backend/src/core/plugin_loader.py`
- [x] T007 Implement `TaskManager` to handle async task execution and state in `backend/src/core/task_manager.py`
- [x] T008 [P] Implement `Logger` with WebSocket streaming support in `backend/src/core/logger.py`
- [x] T009 Create basic FastAPI application entry point with CORS in `backend/src/app.py`
- [x] T010 [P] Implement ADFS Authentication middleware in `backend/src/api/auth.py`
## Phase 3: User Story 1 - Web Interface (Priority: P1)
**Goal**: Enable users to interact with tools via a web dashboard.
**Independent Test**: Launch web server, navigate to dashboard, run a dummy task, view logs.
- [x] T011 [US1] Implement REST API endpoints for Plugin listing (`GET /plugins`) in `backend/src/api/routes/plugins.py`
- [x] T012 [US1] Implement REST API endpoints for Task management (`POST /tasks`, `GET /tasks/{id}`) in `backend/src/api/routes/tasks.py`
- [x] T013 [P] [US1] Create Svelte store for Plugin and Task state in `frontend/src/lib/stores.js`
- [x] T014 [P] [US1] Create `Dashboard` page component listing available tools in `frontend/src/pages/Dashboard.svelte`
- [x] T015 [P] [US1] Create `TaskRunner` component with real-time log viewer (WebSocket) in `frontend/src/components/TaskRunner.svelte`
- [x] T016 [US1] Integrate Frontend with Backend API using `fetch` client in `frontend/src/lib/api.js`
## Phase 4: User Story 2 - Dynamic Plugin System (Priority: P2)
**Goal**: Allow developers to add new functionality by dropping files.
**Independent Test**: Add `hello_world.py` to plugins dir, verify it appears in UI.
- [x] T017 [US2] Implement dynamic form generation component based on JSON Schema in `frontend/src/components/DynamicForm.svelte`
- [x] T018 [US2] Update `PluginLoader` to validate plugin schema on load in `backend/src/core/plugin_loader.py`
- [x] T019 [P] [US2] Refactor existing `backup_script.py` into a Plugin (`BackupPlugin`) in `backend/src/plugins/backup.py`
- [x] T020 [P] [US2] Refactor existing `migration_script.py` into a Plugin (`MigrationPlugin`) in `backend/src/plugins/migration.py`
## Final Phase: Polish
**Goal**: Ensure production readiness.
- [x] T021 Add error handling and user notifications (Toasts) in Frontend
- [x] T022 Write documentation for Plugin Development in `docs/plugin_dev.md`
- [ ] T023 Final integration test: Run full Backup and Migration flow via UI

View File

@@ -0,0 +1,34 @@
# Specification Quality Checklist: Add web application settings mechanism
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2025-12-20
**Feature**: [specs/002-app-settings/spec.md](specs/002-app-settings/spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Initial specification covers all requested points with reasonable defaults for authentication and storage validation.

102
specs/002-app-settings/plan.md Executable file
View File

@@ -0,0 +1,102 @@
# Technical Plan: Web Application Settings Mechanism
This plan outlines the implementation of a settings management system for the Superset Tools application, allowing users to configure multiple Superset environments and global application settings (like backup storage) via the web UI.
## 1. Backend Architecture
### 1.1 Data Models (Pydantic)
We will define models in `backend/src/core/config_models.py`:
```python
from pydantic import BaseModel, Field
from typing import List, Optional
class Environment(BaseModel):
id: str
name: str
url: str
username: str
password: str # Will be masked in UI
is_default: bool = False
class GlobalSettings(BaseModel):
backup_path: str
default_environment_id: Optional[str] = None
class AppConfig(BaseModel):
environments: List[Environment] = []
settings: GlobalSettings
```
### 1.2 Configuration Manager
A new class `ConfigManager` in `backend/src/core/config_manager.py` will handle:
- Loading/saving `AppConfig` to `config.json`.
- CRUD operations for environments.
- Updating global settings.
- Validating backup paths and Superset URLs.
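A minimal sketch of this `ConfigManager`, assuming the `AppConfig`/`Environment`/`GlobalSettings` models from section 1.1; the relative import path and default file name are assumptions, and locking/atomic writes are omitted.

```python
# Minimal sketch only; not the final implementation.
import json
from pathlib import Path

from .config_models import AppConfig, Environment, GlobalSettings


class ConfigManager:
    def __init__(self, path: Path = Path("config.json")):
        self._path = path
        self.config = self._load()

    def _load(self) -> AppConfig:
        if self._path.exists():
            return AppConfig(**json.loads(self._path.read_text()))
        return AppConfig(settings=GlobalSettings(backup_path=""))

    def _save(self) -> None:
        self._path.write_text(self.config.model_dump_json(indent=2))

    def add_environment(self, env: Environment) -> None:
        if any(e.id == env.id for e in self.config.environments):
            raise ValueError(f"environment '{env.id}' already exists")
        self.config.environments.append(env)
        self._save()

    def update_global(self, settings: GlobalSettings) -> None:
        self.config.settings = settings
        self._save()
```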
### 1.3 API Endpoints
New router `backend/src/api/routes/settings.py`:
- `GET /settings`: Retrieve all settings (masking passwords).
- `PATCH /settings/global`: Update global settings (backup path, etc.).
- `GET /settings/environments`: List all environments.
- `POST /settings/environments`: Add a new environment.
- `PUT /settings/environments/{id}`: Update an environment.
- `DELETE /settings/environments/{id}`: Remove an environment.
- `POST /settings/environments/{id}/test`: Test connection to a specific environment.
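The router could take roughly the following shape; `get_config_manager` stands in for the singleton dependency that section 1.4 assigns to `dependencies.py`, and the import paths are illustrative.

```python
# Shape sketch of the settings router; names and paths are assumptions.
from fastapi import APIRouter, Depends, HTTPException

from ...core.config_manager import ConfigManager
from ...core.config_models import Environment
from ...dependencies import get_config_manager

router = APIRouter(prefix="/settings", tags=["settings"])


@router.get("")
def get_settings(cm: ConfigManager = Depends(get_config_manager)) -> dict:
    data = cm.config.model_dump()
    for env in data["environments"]:
        env["password"] = "********"  # never return stored credentials in plain text
    return data


@router.post("/environments", status_code=201)
def add_environment(env: Environment, cm: ConfigManager = Depends(get_config_manager)) -> dict:
    try:
        cm.add_environment(env)
    except ValueError as exc:
        raise HTTPException(status_code=409, detail=str(exc))
    return {"id": env.id}
```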
### 1.4 Integration
- Update `backend/src/dependencies.py` to provide a singleton `ConfigManager`.
- Refactor `superset_tool/utils/init_clients.py` to fetch environment details from `ConfigManager` instead of hardcoded values.
## 2. Frontend Implementation
### 2.1 Settings Page
- Create `frontend/src/pages/Settings.svelte`.
- Add a "Settings" link to the main navigation (likely in `App.svelte`).
### 2.2 Components
- **EnvironmentList**: Displays a table/list of configured environments with Edit/Delete buttons.
- **EnvironmentForm**: A modal or inline form for adding/editing environments.
- **GlobalSettingsForm**: Form for editing the backup storage path.
### 2.3 API Integration
- Add functions to `frontend/src/lib/api.js` for interacting with the new settings endpoints.
## 3. Workflow Diagram
```mermaid
graph TD
UI[Web UI - Settings Page] --> API[FastAPI Settings Router]
API --> CM[Config Manager]
CM --> JSON[(config.json)]
CM -->|Test Connection| SS[Superset Instance]
Plugins[Plugins - Backup/Migration] -->|Get Env/Path| CM
```
## 4. Implementation Steps
1. **Backend Core**:
- Create `config_models.py` and `config_manager.py`.
- Implement file-based persistence.
2. **Backend API**:
- Implement `settings.py` router.
- Register router in `app.py`.
3. **Frontend UI**:
- Create `Settings.svelte` and necessary components.
- Implement API calls and state management.
4. **Refactoring**:
- Update `init_clients.py` to use the new configuration system.
- Ensure existing plugins (Backup, Migration) use the configured settings.
5. **Validation**:
- Add path existence/write checks for backup storage (see the sketch below).
- Add URL/Connection checks for Superset environments.
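One possible implementation of the write check from step 5, shown only as a sketch; the function name and error type are not taken from the codebase. Actually creating and deleting a temporary file proves writability more reliably than `os.access()`, especially on network shares.

```python
# Illustrative helper for the backup-path validation step.
import tempfile
from pathlib import Path


def validate_backup_path(raw: str) -> Path:
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise ValueError(f"{path} does not exist or is not a directory")
    try:
        # Creating (and removing) a real temp file proves the directory is writable.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        raise ValueError(f"{path} is not writable: {exc}")
    return path
```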

77
specs/002-app-settings/spec.md Executable file
View File

@@ -0,0 +1,77 @@
# Feature Specification: Add web application settings mechanism
**Feature Branch**: `002-app-settings`
**Created**: 2025-12-20
**Status**: Draft
**Input**: User description: "Let's add a proper configuration mechanism for the web application. What is definitely needed: 1. An interface for adding environments (different Superset servers) 2. An interface for configuring the backup file storage"
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Manage Superset Environments (Priority: P1)
As an administrator, I want to add, edit, and remove Superset environment configurations (URL, credentials, name) so that the application can interact with multiple Superset instances.
**Why this priority**: This is the core functionality required for the tool to be useful across different stages (dev/prod) or different Superset clusters.
**Independent Test**: Can be fully tested by adding a new environment, verifying it appears in the list, and then deleting it.
**Acceptance Scenarios**:
1. **Given** the settings page is open, **When** I enter valid Superset connection details and save, **Then** the new environment is added to the list of available targets.
2. **Given** an existing environment, **When** I update its URL and save, **Then** the system uses the new URL for subsequent operations.
3. **Given** an existing environment, **When** I delete it, **Then** it is no longer available for selection in other parts of the application.
---
### User Story 2 - Configure Backup Storage (Priority: P1)
As an administrator, I want to configure the file path or storage location for backups so that I can control where system backups are stored.
**Why this priority**: Essential for the backup plugin to function correctly and for users to manage disk space/storage locations.
**Independent Test**: Can be tested by setting a backup path and verifying that the system validates the path's existence or accessibility.
**Acceptance Scenarios**:
1. **Given** the storage settings section, **When** I provide a valid local or network path, **Then** the system saves this as the default backup location.
2. **Given** an invalid or inaccessible path, **When** I try to save, **Then** the system displays an error message and does not update the setting.
---
### Edge Cases
- **Duplicate Environments**: What happens when a user tries to add an environment with a name that already exists? (System should prevent duplicates).
- **Invalid Credentials**: How does the system handle saving environments with incorrect credentials? (System should ideally validate connection on save).
- **Path Permissions**: How does the system handle a backup path that is valid but the application lacks write permissions for? (System should check write permissions).
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: System MUST provide a dedicated settings interface in the web UI.
- **FR-002**: System MUST allow users to create multiple named "Environments" for Superset.
- **FR-003**: Each Environment MUST include: Name, Base URL, and Authentication details (e.g., Username/Password or API Key).
- **FR-004**: System MUST allow setting a global "Backup Storage Path".
- **FR-005**: System MUST persist these settings across application restarts.
- **FR-006**: System MUST validate the Superset URL format before saving.
- **FR-007**: System MUST verify that the Backup Storage Path is writable by the application.
- **FR-008**: System MUST allow selecting a "Default" environment for operations.
### System Invariants (Constitution Check)
- **INV-001**: Sensitive credentials (passwords/keys) MUST NOT be displayed in plain text after being saved.
- **INV-002**: At least one environment MUST be configured for the application to perform Superset-related tasks.
### Key Entities *(include if feature involves data)*
- **Environment**: Represents a Superset instance. Attributes: Unique ID, Name, URL, Credentials, IsDefault flag.
- **AppConfiguration**: Singleton entity representing global settings. Attributes: BackupPath, DefaultEnvironmentID.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: Users can add a new Superset environment in under 30 seconds.
- **SC-002**: 100% of saved environments are immediately available for use in backup/migration tasks.
- **SC-003**: System prevents saving invalid backup paths 100% of the time.
- **SC-004**: Configuration changes take effect without requiring a manual restart of the backend services.

View File

@@ -0,0 +1,141 @@
---
description: "Task list for implementing the web application settings mechanism"
---
# Tasks: Web Application Settings Mechanism
**Input**: Design documents from `specs/002-app-settings/`
**Prerequisites**: plan.md (required), spec.md (required for user stories)
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Project initialization and basic structure
- [x] T001 Create project structure for settings management in `backend/src/core/` and `backend/src/api/routes/`
- [x] T002 [P] Initialize `frontend/src/pages/Settings.svelte` placeholder
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
- [x] T003 Implement configuration models in `backend/src/core/config_models.py`
- [x] T004 Implement `ConfigManager` for JSON persistence in `backend/src/core/config_manager.py`
- [x] T005 [P] Update `backend/src/dependencies.py` to provide `ConfigManager` singleton
- [x] T006 [P] Setup API routing for settings in `backend/src/api/routes/settings.py` and register in `backend/src/app.py`
**Checkpoint**: Foundation ready - user story implementation can now begin in parallel
---
## Phase 3: User Story 1 - Manage Superset Environments (Priority: P1) 🎯 MVP
**Goal**: Add, edit, and remove Superset environment configurations (URL, credentials, name) so that the application can interact with multiple Superset instances.
**Independent Test**: Add a new environment, verify it appears in the list, and then delete it.
### Implementation for User Story 1
- [x] T007 [P] [US1] Implement environment CRUD logic in `backend/src/core/config_manager.py`
- [x] T008 [US1] Implement environment API endpoints in `backend/src/api/routes/settings.py`
- [x] T009 [P] [US1] Add environment API methods to `frontend/src/lib/api.js`
- [x] T010 [US1] Implement environment list and form UI in `frontend/src/pages/Settings.svelte`
- [x] T011 [US1] Implement connection test logic in `backend/src/api/routes/settings.py`
**Checkpoint**: At this point, User Story 1 should be fully functional and testable independently
---
## Phase 4: User Story 2 - Configure Backup Storage (Priority: P1)
**Goal**: Configure the file path or storage location for backups so that I can control where system backups are stored.
**Independent Test**: Set a backup path and verify that the system validates the path's existence or accessibility.
### Implementation for User Story 2
- [x] T012 [P] [US2] Implement global settings update logic in `backend/src/core/config_manager.py`
- [x] T013 [US2] Implement global settings API endpoints in `backend/src/api/routes/settings.py`
- [x] T014 [P] [US2] Add global settings API methods to `frontend/src/lib/api.js`
- [x] T015 [US2] Implement backup storage configuration UI in `frontend/src/pages/Settings.svelte`
- [x] T016 [US2] Add path validation and write permission checks in `backend/src/api/routes/settings.py`
**Checkpoint**: At this point, User Stories 1 AND 2 should both work independently
---
## Phase 5: Polish & Cross-Cutting Concerns
**Purpose**: Improvements that affect multiple user stories
- [x] T017 Refactor `superset_tool/utils/init_clients.py` to use `ConfigManager` for environment details
- [x] T018 Update existing plugins (Backup, Migration) to fetch settings from `ConfigManager`
- [x] T019 [P] Add password masking in `backend/src/api/routes/settings.py` and UI
- [x] T020 [P] Add "Settings" link to navigation in `frontend/src/App.svelte`
- [x] T021 [P] Documentation updates for settings mechanism in `docs/`
---
## Dependencies & Execution Order
### Phase Dependencies
- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
- **User Stories (Phase 3+)**: All depend on Foundational phase completion
- User stories can then proceed in parallel (if staffed)
- Or sequentially in priority order (P1 → P2 → P3)
- **Polish (Final Phase)**: Depends on all desired user stories being complete
### User Story Dependencies
- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
- **User Story 2 (P1)**: Can start after Foundational (Phase 2) - Independent of US1
### Parallel Opportunities
- All Setup tasks marked [P] can run in parallel
- All Foundational tasks marked [P] can run in parallel (within Phase 2)
- Once Foundational phase completes, all user stories can start in parallel
- Models and API methods within a story marked [P] can run in parallel
---
## Parallel Example: User Story 1
```bash
# Launch backend and frontend tasks for User Story 1 together:
Task: "Implement environment CRUD logic in backend/src/core/config_manager.py"
Task: "Add environment API methods to frontend/src/lib/api.js"
```
---
## Implementation Strategy
### MVP First (User Story 1 Only)
1. Complete Phase 1: Setup
2. Complete Phase 2: Foundational (CRITICAL - blocks all stories)
3. Complete Phase 3: User Story 1
4. **STOP and VALIDATE**: Test User Story 1 independently
5. Deploy/demo if ready
### Incremental Delivery
1. Complete Setup + Foundational → Foundation ready
2. Add User Story 1 → Test independently → Deploy/Demo (MVP!)
3. Add User Story 2 → Test independently → Deploy/Demo
4. Each story adds value without breaking previous stories

28
superset_tool/__init__.py Normal file → Executable file
View File

@@ -1,14 +1,14 @@
# [DEF:superset_tool:Module]
# @SEMANTICS: package, root
# @PURPOSE: Root package for superset_tool.
# @LAYER: Domain
# @PUBLIC_API: SupersetClient, SupersetConfig
# [SECTION: IMPORTS]
from .client import SupersetClient
from .models import SupersetConfig
# [/SECTION]
__all__ = ["SupersetClient", "SupersetConfig"]
# [/DEF:superset_tool]

936
superset_tool/client.py Normal file → Executable file
View File

@@ -1,468 +1,468 @@
# [DEF:superset_tool.client:Module]
#
# @SEMANTICS: superset, api, client, rest, http, dashboard, dataset, import, export
# @PURPOSE: Предоставляет высокоуровневый клиент для взаимодействия с Superset REST API, инкапсулируя логику запросов, обработку ошибок и пагинацию.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> superset_tool.models
# @RELATION: DEPENDS_ON -> superset_tool.exceptions
# @RELATION: DEPENDS_ON -> superset_tool.utils
#
# @INVARIANT: All network operations must use the internal APIClient instance.
# @CONSTRAINT: No direct use of 'requests' library outside of APIClient.
# @PUBLIC_API: SupersetClient
# [SECTION: IMPORTS]
import json
import zipfile
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple, Union, cast
from requests import Response
from superset_tool.models import SupersetConfig
from superset_tool.exceptions import ExportError, InvalidZipFormatError
from superset_tool.utils.fileio import get_filename_from_headers
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.network import APIClient
# [/SECTION]
# [DEF:SupersetClient:Class]
# @PURPOSE: Класс-обёртка над Superset REST API, предоставляющий методы для работы с дашбордами и датасетами.
# @RELATION: CREATES_INSTANCE_OF -> APIClient
# @RELATION: USES -> SupersetConfig
class SupersetClient:
# [DEF:SupersetClient.__init__:Function]
# @PURPOSE: Инициализирует клиент, проверяет конфигурацию и создает сетевой клиент.
# @PRE: `config` должен быть валидным объектом SupersetConfig.
# @POST: Атрибуты `logger`, `config`, и `network` созданы и готовы к работе.
# @PARAM: config (SupersetConfig) - Конфигурация подключения.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
def __init__(self, config: SupersetConfig, logger: Optional[SupersetLogger] = None):
self.logger = logger or SupersetLogger(name="SupersetClient")
self.logger.info("[SupersetClient.__init__][Enter] Initializing SupersetClient.")
self._validate_config(config)
self.config = config
self.network = APIClient(
config=config.dict(),
verify_ssl=config.verify_ssl,
timeout=config.timeout,
logger=self.logger,
)
self.delete_before_reimport: bool = False
self.logger.info("[SupersetClient.__init__][Exit] SupersetClient initialized.")
# [/DEF:SupersetClient.__init__]
# [DEF:SupersetClient._validate_config:Function]
# @PURPOSE: Проверяет, что переданный объект конфигурации имеет корректный тип.
# @PRE: `config` должен быть передан.
# @POST: Если проверка пройдена, выполнение продолжается.
# @THROW: TypeError - Если `config` не является экземпляром `SupersetConfig`.
# @PARAM: config (SupersetConfig) - Объект для проверки.
def _validate_config(self, config: SupersetConfig) -> None:
self.logger.debug("[_validate_config][Enter] Validating SupersetConfig.")
assert isinstance(config, SupersetConfig), "Конфигурация должна быть экземпляром SupersetConfig"
self.logger.debug("[_validate_config][Exit] Config is valid.")
# [/DEF:SupersetClient._validate_config]
@property
def headers(self) -> dict:
# [DEF:SupersetClient.headers:Function]
# @PURPOSE: Возвращает базовые HTTP-заголовки, используемые сетевым клиентом.
# @PRE: self.network должен быть инициализирован.
# @POST: Возвращаемый словарь содержит актуальные заголовки, включая токен авторизации.
return self.network.headers
# [/DEF:SupersetClient.headers]
# [DEF:SupersetClient.get_dashboards:Function]
# @PURPOSE: Получает полный список дашбордов, автоматически обрабатывая пагинацию.
# @RELATION: CALLS -> self._fetch_total_object_count
# @RELATION: CALLS -> self._fetch_all_pages
# @PRE: self.network должен быть инициализирован.
# @POST: Возвращаемый список содержит все дашборды, доступные по API.
# @THROW: APIError - В случае ошибки сетевого запроса.
# @PARAM: query (Optional[Dict]) - Дополнительные параметры запроса для API.
# @RETURN: Tuple[int, List[Dict]] - Кортеж (общее количество, список дашбордов).
def get_dashboards(self, query: Optional[Dict] = None) -> Tuple[int, List[Dict]]:
assert self.network, "[get_dashboards][PRE] Network client must be initialized."
self.logger.info("[get_dashboards][Enter] Fetching dashboards.")
validated_query = self._validate_query_params(query or {})
if 'columns' not in validated_query:
validated_query['columns'] = ["slug", "id", "changed_on_utc", "dashboard_title", "published"]
total_count = self._fetch_total_object_count(endpoint="/dashboard/")
paginated_data = self._fetch_all_pages(
endpoint="/dashboard/",
pagination_options={"base_query": validated_query, "total_count": total_count, "results_field": "result"},
)
self.logger.info("[get_dashboards][Exit] Found %d dashboards.", total_count)
return total_count, paginated_data
# [/DEF:SupersetClient.get_dashboards]
# [DEF:SupersetClient.export_dashboard:Function]
# @PURPOSE: Экспортирует дашборд в виде ZIP-архива.
# @RELATION: CALLS -> self.network.request
# @PRE: dashboard_id должен быть положительным целым числом.
# @POST: Возвращает бинарное содержимое ZIP-архива и имя файла.
# @THROW: ExportError - Если экспорт завершился неудачей.
# @PARAM: dashboard_id (int) - ID дашборда для экспорта.
# @RETURN: Tuple[bytes, str] - Бинарное содержимое ZIP-архива и имя файла.
def export_dashboard(self, dashboard_id: int) -> Tuple[bytes, str]:
assert isinstance(dashboard_id, int) and dashboard_id > 0, "[export_dashboard][PRE] dashboard_id must be a positive integer."
self.logger.info("[export_dashboard][Enter] Exporting dashboard %s.", dashboard_id)
response = self.network.request(
method="GET",
endpoint="/dashboard/export/",
params={"q": json.dumps([dashboard_id])},
stream=True,
raw_response=True,
)
response = cast(Response, response)
self._validate_export_response(response, dashboard_id)
filename = self._resolve_export_filename(response, dashboard_id)
self.logger.info("[export_dashboard][Exit] Exported dashboard %s to %s.", dashboard_id, filename)
return response.content, filename
# [/DEF:SupersetClient.export_dashboard]
# [DEF:SupersetClient.import_dashboard:Function]
# @PURPOSE: Импортирует дашборд из ZIP-файла с возможностью автоматического удаления и повторной попытки при ошибке.
# @RELATION: CALLS -> self._do_import
# @RELATION: CALLS -> self.delete_dashboard
# @RELATION: CALLS -> self.get_dashboards
# @PRE: Файл, указанный в `file_name`, должен существовать и быть валидным ZIP-архивом Superset.
# @POST: Дашборд успешно импортирован, возвращен ответ API.
# @THROW: FileNotFoundError - Если файл не найден.
# @THROW: InvalidZipFormatError - Если файл не является валидным ZIP-архивом Superset.
# @PARAM: file_name (Union[str, Path]) - Путь к ZIP-архиву.
# @PARAM: dash_id (Optional[int]) - ID дашборда для удаления при сбое.
# @PARAM: dash_slug (Optional[str]) - Slug дашборда для поиска ID, если ID не предоставлен.
# @RETURN: Dict - Ответ API в случае успеха.
def import_dashboard(self, file_name: Union[str, Path], dash_id: Optional[int] = None, dash_slug: Optional[str] = None) -> Dict:
assert file_name, "[import_dashboard][PRE] file_name must be provided."
file_path = str(file_name)
self._validate_import_file(file_path)
try:
return self._do_import(file_path)
except Exception as exc:
self.logger.error("[import_dashboard][Failure] First import attempt failed: %s", exc, exc_info=True)
if not self.delete_before_reimport:
raise
target_id = self._resolve_target_id_for_delete(dash_id, dash_slug)
if target_id is None:
self.logger.error("[import_dashboard][Failure] No ID available for delete-retry.")
raise
self.delete_dashboard(target_id)
self.logger.info("[import_dashboard][State] Deleted dashboard ID %s, retrying import.", target_id)
return self._do_import(file_path)
# [/DEF:SupersetClient.import_dashboard]
# [DEF:SupersetClient._resolve_target_id_for_delete:Function]
# @PURPOSE: Определяет ID дашборда для удаления, используя ID или slug.
# @PARAM: dash_id (Optional[int]) - ID дашборда.
# @PARAM: dash_slug (Optional[str]) - Slug дашборда.
# @PRE: По крайней мере один из параметров (dash_id или dash_slug) должен быть предоставлен.
# @POST: Возвращает ID дашборда, если найден, иначе None.
# @THROW: APIError - В случае ошибки сетевого запроса при поиске по slug.
# @RETURN: Optional[int] - Найденный ID или None.
def _resolve_target_id_for_delete(self, dash_id: Optional[int], dash_slug: Optional[str]) -> Optional[int]:
assert dash_id is not None or dash_slug is not None, "[_resolve_target_id_for_delete][PRE] At least one of ID or slug must be provided."
if dash_id is not None:
return dash_id
if dash_slug is not None:
self.logger.debug("[_resolve_target_id_for_delete][State] Resolving ID by slug '%s'.", dash_slug)
try:
_, candidates = self.get_dashboards(query={"filters": [{"col": "slug", "op": "eq", "value": dash_slug}]})
if candidates:
target_id = candidates[0]["id"]
self.logger.debug("[_resolve_target_id_for_delete][Success] Resolved slug to ID %s.", target_id)
return target_id
except Exception as e:
self.logger.warning("[_resolve_target_id_for_delete][Warning] Could not resolve slug '%s' to ID: %s", dash_slug, e)
return None
# [/DEF:SupersetClient._resolve_target_id_for_delete]
# [DEF:SupersetClient._do_import:Function]
# @PURPOSE: Выполняет один запрос на импорт без обработки исключений.
# @PRE: Файл должен существовать.
# @POST: Файл успешно загружен, возвращен ответ API.
# @THROW: FileNotFoundError - Если файл не существует.
# @PARAM: file_name (Union[str, Path]) - Путь к файлу.
# @RETURN: Dict - Ответ API.
def _do_import(self, file_name: Union[str, Path]) -> Dict:
self.logger.debug(f"[_do_import][State] Uploading file: {file_name}")
file_path = Path(file_name)
if file_path.exists():
self.logger.debug(f"[_do_import][State] File size: {file_path.stat().st_size} bytes")
else:
self.logger.error(f"[_do_import][Failure] File does not exist: {file_name}")
raise FileNotFoundError(f"File does not exist: {file_name}")
return self.network.upload_file(
endpoint="/dashboard/import/",
file_info={"file_obj": file_path, "file_name": file_path.name, "form_field": "formData"},
extra_data={"overwrite": "true"},
timeout=self.config.timeout * 2,
)
# [/DEF:SupersetClient._do_import]
# [DEF:SupersetClient.delete_dashboard:Function]
# @PURPOSE: Удаляет дашборд по его ID или slug.
# @RELATION: CALLS -> self.network.request
# @PRE: dashboard_id должен быть предоставлен.
# @POST: Дашборд удален или залогировано предупреждение.
# @THROW: APIError - В случае ошибки сетевого запроса.
# @PARAM: dashboard_id (Union[int, str]) - ID или slug дашборда.
def delete_dashboard(self, dashboard_id: Union[int, str]) -> None:
assert dashboard_id, "[delete_dashboard][PRE] dashboard_id must be provided."
self.logger.info("[delete_dashboard][Enter] Deleting dashboard %s.", dashboard_id)
response = self.network.request(method="DELETE", endpoint=f"/dashboard/{dashboard_id}")
response = cast(Dict, response)
if response.get("result", True) is not False:
self.logger.info("[delete_dashboard][Success] Dashboard %s deleted.", dashboard_id)
else:
self.logger.warning("[delete_dashboard][Warning] Unexpected response while deleting %s: %s", dashboard_id, response)
# [/DEF:SupersetClient.delete_dashboard]
# [DEF:SupersetClient._extract_dashboard_id_from_zip:Function]
# @PURPOSE: Извлекает ID дашборда из `metadata.yaml` внутри ZIP-архива.
# @PARAM: file_name (Union[str, Path]) - Путь к ZIP-файлу.
# @PRE: Файл, указанный в `file_name`, должен быть валидным ZIP-архивом.
# @POST: Возвращает ID дашборда, если найден в metadata.yaml, иначе None.
# @THROW: ImportError - Если не установлен `yaml`.
# @RETURN: Optional[int] - ID дашборда или None.
def _extract_dashboard_id_from_zip(self, file_name: Union[str, Path]) -> Optional[int]:
assert zipfile.is_zipfile(file_name), "[_extract_dashboard_id_from_zip][PRE] file_name must be a valid zip file."
try:
import yaml
with zipfile.ZipFile(file_name, "r") as zf:
for name in zf.namelist():
if name.endswith("metadata.yaml"):
with zf.open(name) as meta_file:
meta = yaml.safe_load(meta_file)
dash_id = meta.get("dashboard_uuid") or meta.get("dashboard_id")
if dash_id: return int(dash_id)
except Exception as exc:
self.logger.error("[_extract_dashboard_id_from_zip][Failure] %s", exc, exc_info=True)
return None
# [/DEF:SupersetClient._extract_dashboard_id_from_zip]
# [DEF:SupersetClient._extract_dashboard_slug_from_zip:Function]
# @PURPOSE: Извлекает slug дашборда из `metadata.yaml` внутри ZIP-архива.
# @PARAM: file_name (Union[str, Path]) - Путь к ZIP-файлу.
# @PRE: Файл, указанный в `file_name`, должен быть валидным ZIP-архивом.
# @POST: Возвращает slug дашборда, если найден в metadata.yaml, иначе None.
# @THROW: ImportError - Если не установлен `yaml`.
# @RETURN: Optional[str] - Slug дашборда или None.
def _extract_dashboard_slug_from_zip(self, file_name: Union[str, Path]) -> Optional[str]:
assert zipfile.is_zipfile(file_name), "[_extract_dashboard_slug_from_zip][PRE] file_name must be a valid zip file."
try:
import yaml
with zipfile.ZipFile(file_name, "r") as zf:
for name in zf.namelist():
if name.endswith("metadata.yaml"):
with zf.open(name) as meta_file:
meta = yaml.safe_load(meta_file)
if slug := meta.get("slug"):
return str(slug)
except Exception as exc:
self.logger.error("[_extract_dashboard_slug_from_zip][Failure] %s", exc, exc_info=True)
return None
# [/DEF:SupersetClient._extract_dashboard_slug_from_zip]
# [DEF:SupersetClient._validate_export_response:Function]
# @PURPOSE: Проверяет, что HTTP-ответ на экспорт является валидным ZIP-архивом.
# @PRE: response должен быть объектом requests.Response.
# @POST: Проверка пройдена, если ответ является непустым ZIP-архивом.
# @THROW: ExportError - Если ответ не является ZIP-архивом или пуст.
# @PARAM: response (Response) - HTTP ответ.
# @PARAM: dashboard_id (int) - ID дашборда.
def _validate_export_response(self, response: Response, dashboard_id: int) -> None:
assert isinstance(response, Response), "[_validate_export_response][PRE] response must be a requests.Response object."
content_type = response.headers.get("Content-Type", "")
if "application/zip" not in content_type:
raise ExportError(f"Получен не ZIP-архив (Content-Type: {content_type})")
if not response.content:
raise ExportError("Получены пустые данные при экспорте")
# [/DEF:SupersetClient._validate_export_response]
# [DEF:SupersetClient._resolve_export_filename:Function]
# @PURPOSE: Определяет имя файла для экспорта из заголовков или генерирует его.
# @PRE: response должен быть объектом requests.Response.
# @POST: Возвращает непустое имя файла.
# @PARAM: response (Response) - HTTP ответ.
# @PARAM: dashboard_id (int) - ID дашборда.
# @RETURN: str - Имя файла.
def _resolve_export_filename(self, response: Response, dashboard_id: int) -> str:
assert isinstance(response, Response), "[_resolve_export_filename][PRE] response must be a requests.Response object."
filename = get_filename_from_headers(dict(response.headers))
if not filename:
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%dT%H%M%S")
filename = f"dashboard_export_{dashboard_id}_{timestamp}.zip"
self.logger.warning("[_resolve_export_filename][Warning] Generated filename: %s", filename)
return filename
# [/DEF:SupersetClient._resolve_export_filename]
# [DEF:SupersetClient._validate_query_params:Function]
# @PURPOSE: Формирует корректный набор параметров запроса с пагинацией.
# @PARAM: query (Optional[Dict]) - Исходные параметры.
# @PRE: query, если предоставлен, должен быть словарем.
# @POST: Возвращает словарь, содержащий базовые параметры пагинации, объединенные с `query`.
# @RETURN: Dict - Валидированные параметры.
def _validate_query_params(self, query: Optional[Dict]) -> Dict:
assert query is None or isinstance(query, dict), "[_validate_query_params][PRE] query must be a dictionary or None."
base_query = {"page": 0, "page_size": 1000}
return {**base_query, **(query or {})}
# [/DEF:SupersetClient._validate_query_params]
# [DEF:SupersetClient._fetch_total_object_count:Function]
# @PURPOSE: Получает общее количество объектов по указанному эндпоинту для пагинации.
# @PARAM: endpoint (str) - API эндпоинт.
# @PRE: endpoint должен быть непустой строкой.
# @POST: Возвращает общее количество объектов (>= 0).
# @THROW: APIError - В случае ошибки сетевого запроса.
# @RETURN: int - Количество объектов.
def _fetch_total_object_count(self, endpoint: str) -> int:
assert endpoint and isinstance(endpoint, str), "[_fetch_total_object_count][PRE] endpoint must be a non-empty string."
return self.network.fetch_paginated_count(
endpoint=endpoint,
query_params={"page": 0, "page_size": 1},
count_field="count",
)
# [/DEF:SupersetClient._fetch_total_object_count]
# [DEF:SupersetClient._fetch_all_pages:Function]
# @PURPOSE: Итерируется по всем страницам пагинированного API и собирает все данные.
# @PARAM: endpoint (str) - API эндпоинт.
# @PARAM: pagination_options (Dict) - Опции пагинации.
# @PRE: endpoint должен быть непустой строкой, pagination_options - словарем.
# @POST: Возвращает полный список объектов.
# @THROW: APIError - В случае ошибки сетевого запроса.
# @RETURN: List[Dict] - Список всех объектов.
def _fetch_all_pages(self, endpoint: str, pagination_options: Dict) -> List[Dict]:
assert endpoint and isinstance(endpoint, str), "[_fetch_all_pages][PRE] endpoint must be a non-empty string."
assert isinstance(pagination_options, dict), "[_fetch_all_pages][PRE] pagination_options must be a dictionary."
return self.network.fetch_paginated_data(endpoint=endpoint, pagination_options=pagination_options)
# [/DEF:SupersetClient._fetch_all_pages]
# [DEF:SupersetClient._validate_import_file:Function]
# @PURPOSE: Проверяет, что файл существует, является ZIP-архивом и содержит `metadata.yaml`.
# @PRE: zip_path должен быть предоставлен.
# @POST: Проверка пройдена, если файл существует, является ZIP и содержит `metadata.yaml`.
# @THROW: FileNotFoundError - Если файл не найден.
# @THROW: InvalidZipFormatError - Если файл не является ZIP или не содержит `metadata.yaml`.
# @PARAM: zip_path (Union[str, Path]) - Путь к файлу.
def _validate_import_file(self, zip_path: Union[str, Path]) -> None:
assert zip_path, "[_validate_import_file][PRE] zip_path must be provided."
path = Path(zip_path)
assert path.exists(), f"Файл {zip_path} не существует"
assert zipfile.is_zipfile(path), f"Файл {zip_path} не является ZIP-архивом"
with zipfile.ZipFile(path, "r") as zf:
assert any(n.endswith("metadata.yaml") for n in zf.namelist()), f"Архив {zip_path} не содержит 'metadata.yaml'"
# [/DEF:SupersetClient._validate_import_file]
# [DEF:SupersetClient.get_datasets:Function]
# @PURPOSE: Получает полный список датасетов, автоматически обрабатывая пагинацию.
# @RELATION: CALLS -> self._fetch_total_object_count
# @RELATION: CALLS -> self._fetch_all_pages
# @PARAM: query (Optional[Dict]) - Дополнительные параметры запроса.
# @PRE: self.network должен быть инициализирован.
# @POST: Возвращаемый список содержит все датасеты, доступные по API.
# @THROW: APIError - В случае ошибки сетевого запроса.
# @RETURN: Tuple[int, List[Dict]] - Кортеж (общее количество, список датасетов).
def get_datasets(self, query: Optional[Dict] = None) -> Tuple[int, List[Dict]]:
assert self.network, "[get_datasets][PRE] Network client must be initialized."
self.logger.info("[get_datasets][Enter] Fetching datasets.")
validated_query = self._validate_query_params(query)
total_count = self._fetch_total_object_count(endpoint="/dataset/")
paginated_data = self._fetch_all_pages(
endpoint="/dataset/",
pagination_options={"base_query": validated_query, "total_count": total_count, "results_field": "result"},
)
self.logger.info("[get_datasets][Exit] Found %d datasets.", total_count)
return total_count, paginated_data
# [/DEF:SupersetClient.get_datasets]
# [DEF:SupersetClient.get_databases:Function]
# @PURPOSE: Получает полный список баз данных, автоматически обрабатывая пагинацию.
# @RELATION: CALLS -> self._fetch_total_object_count
# @RELATION: CALLS -> self._fetch_all_pages
# @PARAM: query (Optional[Dict]) - Дополнительные параметры запроса.
# @PRE: self.network должен быть инициализирован.
# @POST: Возвращаемый список содержит все базы данных, доступные по API.
# @THROW: APIError - В случае ошибки сетевого запроса.
# @RETURN: Tuple[int, List[Dict]] - Кортеж (общее количество, список баз данных).
def get_databases(self, query: Optional[Dict] = None) -> Tuple[int, List[Dict]]:
assert self.network, "[get_databases][PRE] Network client must be initialized."
self.logger.info("[get_databases][Enter] Fetching databases.")
validated_query = self._validate_query_params(query or {})
if 'columns' not in validated_query:
validated_query['columns'] = []
total_count = self._fetch_total_object_count(endpoint="/database/")
paginated_data = self._fetch_all_pages(
endpoint="/database/",
pagination_options={"base_query": validated_query, "total_count": total_count, "results_field": "result"},
)
self.logger.info("[get_databases][Exit] Found %d databases.", total_count)
return total_count, paginated_data
# [/DEF:SupersetClient.get_databases]
# [DEF:SupersetClient.get_dataset:Function]
# @PURPOSE: Получает информацию о конкретном датасете по его ID.
# @RELATION: CALLS -> self.network.request
# @PARAM: dataset_id (int) - ID датасета.
# @PRE: dataset_id должен быть положительным целым числом.
# @POST: Возвращает словарь с информацией о датасете.
# @THROW: APIError - В случае ошибки сетевого запроса или если датасет не найден.
# @RETURN: Dict - Информация о датасете.
def get_dataset(self, dataset_id: int) -> Dict:
assert isinstance(dataset_id, int) and dataset_id > 0, "[get_dataset][PRE] dataset_id must be a positive integer."
self.logger.info("[get_dataset][Enter] Fetching dataset %s.", dataset_id)
response = self.network.request(method="GET", endpoint=f"/dataset/{dataset_id}")
response = cast(Dict, response)
self.logger.info("[get_dataset][Exit] Got dataset %s.", dataset_id)
return response
# [/DEF:SupersetClient.get_dataset]
# [DEF:SupersetClient.get_database:Function]
# @PURPOSE: Получает информацию о конкретной базе данных по её ID.
# @RELATION: CALLS -> self.network.request
# @PARAM: database_id (int) - ID базы данных.
# @PRE: database_id должен быть положительным целым числом.
# @POST: Возвращает словарь с информацией о базе данных.
# @THROW: APIError - В случае ошибки сетевого запроса или если база данных не найдена.
# @RETURN: Dict - Информация о базе данных.
def get_database(self, database_id: int) -> Dict:
assert isinstance(database_id, int) and database_id > 0, "[get_database][PRE] database_id must be a positive integer."
self.logger.info("[get_database][Enter] Fetching database %s.", database_id)
response = self.network.request(method="GET", endpoint=f"/database/{database_id}")
response = cast(Dict, response)
self.logger.info("[get_database][Exit] Got database %s.", database_id)
return response
# [/DEF:SupersetClient.get_database]
# [DEF:SupersetClient.update_dataset:Function]
# @PURPOSE: Обновляет данные датасета по его ID.
# @RELATION: CALLS -> self.network.request
# @PARAM: dataset_id (int) - ID датасета.
# @PARAM: data (Dict) - Данные для обновления.
# @PRE: dataset_id должен быть положительным целым числом, data - непустым словарем.
# @POST: Датасет успешно обновлен, возвращен ответ API.
# @THROW: APIError - В случае ошибки сетевого запроса.
# @RETURN: Dict - Ответ API.
def update_dataset(self, dataset_id: int, data: Dict) -> Dict:
assert isinstance(dataset_id, int) and dataset_id > 0, "[update_dataset][PRE] dataset_id must be a positive integer."
assert isinstance(data, dict) and data, "[update_dataset][PRE] data must be a non-empty dictionary."
self.logger.info("[update_dataset][Enter] Updating dataset %s.", dataset_id)
response = self.network.request(
method="PUT",
endpoint=f"/dataset/{dataset_id}",
data=json.dumps(data),
headers={'Content-Type': 'application/json'}
)
response = cast(Dict, response)
self.logger.info("[update_dataset][Exit] Updated dataset %s.", dataset_id)
return response
# [/DEF:SupersetClient.update_dataset]
# [/DEF:SupersetClient]
# [/DEF:superset_tool.client]
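For reference, a minimal usage sketch of the client above, not part of the diff: the URL, credentials, slug and IDs are placeholders, and it assumes a reachable Superset instance; the filter shape mirrors the slug lookup the client itself performs.

```python
from superset_tool.models import SupersetConfig
from superset_tool.client import SupersetClient

# Placeholder connection details; base_url must include /api/v1 to pass validation.
config = SupersetConfig(
    env="dev",
    base_url="https://superset.example.com/api/v1",
    auth={"provider": "db", "username": "admin", "password": "secret", "refresh": "true"},
    verify_ssl=False,
    timeout=30,
)
client = SupersetClient(config)

# List dashboards; pagination is handled internally.
total, dashboards = client.get_dashboards()

# Narrow the listing with a column filter (same shape as the client's slug lookup).
_, matched = client.get_dashboards(
    query={"filters": [{"col": "slug", "op": "eq", "value": "sales-overview"}]}
)

# Export one dashboard and re-import it, overwriting the existing copy.
if matched:
    content, filename = client.export_dashboard(matched[0]["id"])
    with open(filename, "wb") as fh:
        fh.write(content)
    client.import_dashboard(filename, dash_id=matched[0]["id"])
```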

254
superset_tool/exceptions.py Normal file → Executable file
View File

@@ -1,128 +1,128 @@
# [DEF:superset_tool.exceptions:Module]
# @PURPOSE: Определяет иерархию пользовательских исключений для всего инструмента, обеспечивая единую точку обработки ошибок.
# @SEMANTICS: exception, error, hierarchy
# @LAYER: Infra
# [SECTION: IMPORTS]
from pathlib import Path
from typing import Optional, Dict, Any, Union
# [/SECTION]
# [DEF:SupersetToolError:Class]
# @PURPOSE: Базовый класс для всех ошибок, генерируемых инструментом.
# @RELATION: INHERITS_FROM -> Exception
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Optional[Dict[str, Any]]) - Дополнительный контекст ошибки.
class SupersetToolError(Exception):
def __init__(self, message: str, context: Optional[Dict[str, Any]] = None):
self.context = context or {}
super().__init__(f"{message} | Context: {self.context}")
# [/DEF:SupersetToolError]
# [DEF:AuthenticationError:Class]
# @PURPOSE: Ошибки, связанные с аутентификацией или авторизацией.
# @RELATION: INHERITS_FROM -> SupersetToolError
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class AuthenticationError(SupersetToolError):
def __init__(self, message: str = "Authentication failed", **context: Any):
super().__init__(f"[AUTH_FAILURE] {message}", context={"type": "authentication", **context})
# [/DEF:AuthenticationError]
# [DEF:PermissionDeniedError:Class]
# @PURPOSE: Ошибка, возникающая при отказе в доступе к ресурсу.
# @RELATION: INHERITS_FROM -> AuthenticationError
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: required_permission (Optional[str]) - Требуемое разрешение.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class PermissionDeniedError(AuthenticationError):
def __init__(self, message: str = "Permission denied", required_permission: Optional[str] = None, **context: Any):
full_message = f"Permission denied: {required_permission}" if required_permission else message
super().__init__(full_message, context={"required_permission": required_permission, **context})
# [/DEF:PermissionDeniedError]
# [DEF:SupersetAPIError:Class]
# @PURPOSE: Общие ошибки при взаимодействии с Superset API.
# @RELATION: INHERITS_FROM -> SupersetToolError
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class SupersetAPIError(SupersetToolError):
def __init__(self, message: str = "Superset API error", **context: Any):
super().__init__(f"[API_FAILURE] {message}", context={"type": "api_call", **context})
# [/DEF:SupersetAPIError]
# [DEF:ExportError:Class]
# @PURPOSE: Ошибки, специфичные для операций экспорта.
# @RELATION: INHERITS_FROM -> SupersetAPIError
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class ExportError(SupersetAPIError):
def __init__(self, message: str = "Dashboard export failed", **context: Any):
super().__init__(f"[EXPORT_FAILURE] {message}", context={"subtype": "export", **context})
# [/DEF:ExportError]
# [DEF:DashboardNotFoundError:Class]
# @PURPOSE: Ошибка, когда запрошенный дашборд или ресурс не найден (404).
# @RELATION: INHERITS_FROM -> SupersetAPIError
# @PARAM: dashboard_id_or_slug (Union[int, str]) - ID или slug дашборда.
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class DashboardNotFoundError(SupersetAPIError):
def __init__(self, dashboard_id_or_slug: Union[int, str], message: str = "Dashboard not found", **context: Any):
super().__init__(f"[NOT_FOUND] Dashboard '{dashboard_id_or_slug}' {message}", context={"subtype": "not_found", "resource_id": dashboard_id_or_slug, **context})
# [/DEF:DashboardNotFoundError]
# [DEF:DatasetNotFoundError:Class]
# @PURPOSE: Ошибка, когда запрашиваемый набор данных не существует (404).
# @RELATION: INHERITS_FROM -> SupersetAPIError
# @PARAM: dataset_id_or_slug (Union[int, str]) - ID или slug набора данных.
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class DatasetNotFoundError(SupersetAPIError):
def __init__(self, dataset_id_or_slug: Union[int, str], message: str = "Dataset not found", **context: Any):
super().__init__(f"[NOT_FOUND] Dataset '{dataset_id_or_slug}' {message}", context={"subtype": "not_found", "resource_id": dataset_id_or_slug, **context})
# [/DEF:DatasetNotFoundError]
# [DEF:InvalidZipFormatError:Class]
# @PURPOSE: Ошибка, указывающая на некорректный формат или содержимое ZIP-архива.
# @RELATION: INHERITS_FROM -> SupersetToolError
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: file_path (Optional[Union[str, Path]]) - Путь к файлу.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class InvalidZipFormatError(SupersetToolError):
def __init__(self, message: str = "Invalid ZIP format or content", file_path: Optional[Union[str, Path]] = None, **context: Any):
super().__init__(f"[FILE_ERROR] {message}", context={"type": "file_validation", "file_path": str(file_path) if file_path else "N/A", **context})
# [/DEF:InvalidZipFormatError]
# [DEF:NetworkError:Class]
# @PURPOSE: Ошибки, связанные с сетевым соединением.
# @RELATION: INHERITS_FROM -> SupersetToolError
# @PARAM: message (str) - Сообщение об ошибке.
# @PARAM: context (Any) - Дополнительный контекст ошибки.
class NetworkError(SupersetToolError):
def __init__(self, message: str = "Network connection failed", **context: Any):
super().__init__(f"[NETWORK_FAILURE] {message}", context={"type": "network", **context})
# [/DEF:NetworkError]
# [DEF:FileOperationError:Class]
# @PURPOSE: Общие ошибки файловых операций (I/O).
# @RELATION: INHERITS_FROM -> SupersetToolError
class FileOperationError(SupersetToolError):
pass
# [/DEF:FileOperationError]
# [DEF:InvalidFileStructureError:Class]
# @PURPOSE: Ошибка, указывающая на некорректную структуру файлов или директорий.
# @RELATION: INHERITS_FROM -> FileOperationError
class InvalidFileStructureError(FileOperationError):
pass
# [/DEF:InvalidFileStructureError]
# [DEF:ConfigurationError:Class]
# @PURPOSE: Ошибки, связанные с неверной конфигурацией инструмента.
# @RELATION: INHERITS_FROM -> SupersetToolError
class ConfigurationError(SupersetToolError):
pass
# [/DEF:ConfigurationError]
# [/DEF:superset_tool.exceptions]
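A hedged sketch of how this hierarchy is meant to be consumed: catch the most specific classes first and fall back to the common base. The client call and dashboard ID are placeholders.

```python
from superset_tool.exceptions import (
    AuthenticationError,
    ExportError,
    SupersetAPIError,
    SupersetToolError,
)

def export_safely(client, dashboard_id: int):
    # Most specific classes first; SupersetToolError is the common ancestor,
    # so the trailing clause still catches anything raised by the tool.
    try:
        return client.export_dashboard(dashboard_id)
    except AuthenticationError as exc:
        print(f"credentials rejected: {exc}")
    except ExportError as exc:
        print(f"export failed: {exc} | context={exc.context}")
    except SupersetAPIError as exc:
        print(f"other API error: {exc}")
    except SupersetToolError as exc:
        print(f"unexpected tool error: {exc}")
    return None
```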

168
superset_tool/models.py Normal file → Executable file
View File

@@ -1,84 +1,84 @@
# [DEF:superset_tool.models:Module]
#
# @SEMANTICS: pydantic, model, config, validation, data-structure
# @PURPOSE: Определяет Pydantic-модели для конфигурации инструмента, обеспечивая валидацию данных.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> pydantic
# @RELATION: DEPENDS_ON -> superset_tool.utils.logger
# @PUBLIC_API: SupersetConfig, DatabaseConfig
# [SECTION: IMPORTS]
import re
from typing import Optional, Dict, Any
from pydantic import BaseModel, validator, Field
from .utils.logger import SupersetLogger
# [/SECTION]
# [DEF:SupersetConfig:Class]
# @PURPOSE: Модель конфигурации для подключения к одному экземпляру Superset API.
# @RELATION: INHERITS_FROM -> pydantic.BaseModel
class SupersetConfig(BaseModel):
env: str = Field(..., description="Название окружения (например, dev, prod).")
base_url: str = Field(..., description="Базовый URL Superset API, включая /api/v1.")
auth: Dict[str, str] = Field(..., description="Словарь с данными для аутентификации (provider, username, password, refresh).")
verify_ssl: bool = Field(True, description="Флаг для проверки SSL-сертификатов.")
timeout: int = Field(30, description="Таймаут в секундах для HTTP-запросов.")
logger: Optional[SupersetLogger] = Field(None, description="Экземпляр логгера для логирования.")
# [DEF:SupersetConfig.validate_auth:Function]
# @PURPOSE: Проверяет, что словарь `auth` содержит все необходимые для аутентификации поля.
# @PRE: `v` должен быть словарем.
# @POST: Возвращает `v`, если все обязательные поля (`provider`, `username`, `password`, `refresh`) присутствуют.
# @THROW: ValueError - Если отсутствуют обязательные поля.
# @PARAM: v (Dict[str, str]) - Значение поля auth.
@validator('auth')
def validate_auth(cls, v: Dict[str, str]) -> Dict[str, str]:
required = {'provider', 'username', 'password', 'refresh'}
if not required.issubset(v.keys()):
raise ValueError(f"Словарь 'auth' должен содержать поля: {required}. Отсутствующие: {required - v.keys()}")
return v
# [/DEF:SupersetConfig.validate_auth]
# [DEF:SupersetConfig.check_base_url_format:Function]
# @PURPOSE: Проверяет, что `base_url` соответствует формату URL и содержит `/api/v1`.
# @PRE: `v` должна быть строкой.
# @POST: Возвращает очищенный `v`, если формат корректен.
# @THROW: ValueError - Если формат URL невалиден.
# @PARAM: v (str) - Значение поля base_url.
@validator('base_url')
def check_base_url_format(cls, v: str) -> str:
v = v.strip()
if not re.fullmatch(r'https?://.+/api/v1/?(?:.*)?', v):
raise ValueError(f"Invalid URL format: {v}. Must include '/api/v1'.")
return v
# [/DEF:SupersetConfig.check_base_url_format]
class Config:
arbitrary_types_allowed = True
# [/DEF:SupersetConfig]
# [DEF:DatabaseConfig:Class]
# @PURPOSE: Модель для параметров трансформации баз данных при миграции дашбордов.
# @RELATION: INHERITS_FROM -> pydantic.BaseModel
class DatabaseConfig(BaseModel):
database_config: Dict[str, Dict[str, Any]] = Field(..., description="Словарь, содержащий 'old' и 'new' конфигурации базы данных.")
logger: Optional[SupersetLogger] = Field(None, description="Экземпляр логгера для логирования.")
# [DEF:DatabaseConfig.validate_config:Function]
# @PURPOSE: Проверяет, что словарь `database_config` содержит ключи 'old' и 'new'.
# @PRE: `v` должен быть словарем.
# @POST: Возвращает `v`, если ключи 'old' и 'new' присутствуют.
# @THROW: ValueError - Если отсутствуют обязательные ключи.
# @PARAM: v (Dict[str, Dict[str, Any]]) - Значение поля database_config.
@validator('database_config')
def validate_config(cls, v: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
if not {'old', 'new'}.issubset(v.keys()):
raise ValueError("'database_config' должен содержать ключи 'old' и 'new'.")
return v
# [/DEF:DatabaseConfig.validate_config]
class Config:
arbitrary_types_allowed = True
# [/DEF:DatabaseConfig]
# [/DEF:superset_tool.models]
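A short validation sketch for the models above; all values are placeholders, and the second construction is expected to fail on the `base_url` validator because the URL lacks `/api/v1`.

```python
from pydantic import ValidationError
from superset_tool.models import SupersetConfig

ok = SupersetConfig(
    env="prod",
    base_url="https://bi.example.com/api/v1",
    auth={"provider": "db", "username": "svc", "password": "secret", "refresh": "true"},
)
assert ok.timeout == 30 and ok.verify_ssl is True  # defaults from the model

try:
    SupersetConfig(
        env="prod",
        base_url="https://bi.example.com",  # missing /api/v1
        auth={"provider": "db", "username": "svc", "password": "secret", "refresh": "true"},
    )
except ValidationError as exc:
    print(exc)  # check_base_url_format rejects the URL
```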

0
superset_tool/requirements.txt Normal file → Executable file
View File

10
superset_tool/utils/__init__.py Normal file → Executable file
View File

@@ -1,5 +1,5 @@
# [DEF:superset_tool.utils:Module]
# @SEMANTICS: package, utils
# @PURPOSE: Utility package for superset_tool.
# @LAYER: Infra
# [/DEF:superset_tool.utils]

458
superset_tool/utils/dataset_mapper.py Normal file → Executable file
View File

@@ -1,229 +1,229 @@
# [DEF:superset_tool.utils.dataset_mapper:Module]
#
# @SEMANTICS: dataset, mapping, postgresql, xlsx, superset
# @PURPOSE: Этот модуль отвечает за обновление метаданных (verbose_map) в датасетах Superset, извлекая их из PostgreSQL или XLSX-файлов.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> pandas
# @RELATION: DEPENDS_ON -> psycopg2
# @PUBLIC_API: DatasetMapper
# [SECTION: IMPORTS]
import pandas as pd # type: ignore
import psycopg2 # type: ignore
from superset_tool.client import SupersetClient
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger
from typing import Dict, List, Optional, Any
# [/SECTION]
# [DEF:DatasetMapper:Class]
# @PURPOSE: Класс для меппинга и обновления verbose_map в датасетах Superset.
class DatasetMapper:
def __init__(self, logger: SupersetLogger):
self.logger = logger
# [DEF:DatasetMapper.get_postgres_comments:Function]
# @PURPOSE: Извлекает комментарии к колонкам из системного каталога PostgreSQL.
# @PRE: `db_config` должен содержать валидные креды для подключения к PostgreSQL.
# @PRE: `table_name` и `table_schema` должны быть строками.
# @POST: Возвращается словарь с меппингом `column_name` -> `column_comment`.
# @THROW: Exception - При ошибках подключения или выполнения запроса к БД.
# @PARAM: db_config (Dict) - Конфигурация для подключения к БД.
# @PARAM: table_name (str) - Имя таблицы.
# @PARAM: table_schema (str) - Схема таблицы.
# @RETURN: Dict[str, str] - Словарь с комментариями к колонкам.
def get_postgres_comments(self, db_config: Dict, table_name: str, table_schema: str) -> Dict[str, str]:
self.logger.info("[get_postgres_comments][Enter] Fetching comments from PostgreSQL for %s.%s.", table_schema, table_name)
query = f"""
SELECT
cols.column_name,
CASE
WHEN pg_catalog.col_description(
(SELECT c.oid
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = cols.table_name
AND n.nspname = cols.table_schema),
cols.ordinal_position::int
) LIKE '%|%' THEN
split_part(
pg_catalog.col_description(
(SELECT c.oid
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = cols.table_name
AND n.nspname = cols.table_schema),
cols.ordinal_position::int
),
'|',
1
)
ELSE
pg_catalog.col_description(
(SELECT c.oid
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = cols.table_name
AND n.nspname = cols.table_schema),
cols.ordinal_position::int
)
END AS column_comment
FROM
information_schema.columns cols
WHERE cols.table_catalog = '{db_config.get('dbname')}' AND cols.table_name = '{table_name}' AND cols.table_schema = '{table_schema}';
"""
comments = {}
try:
with psycopg2.connect(**db_config) as conn, conn.cursor() as cursor:
cursor.execute(query)
for row in cursor.fetchall():
if row[1]:
comments[row[0]] = row[1]
self.logger.info("[get_postgres_comments][Success] Fetched %d comments.", len(comments))
except Exception as e:
self.logger.error("[get_postgres_comments][Failure] %s", e, exc_info=True)
raise
return comments
# [/DEF:DatasetMapper.get_postgres_comments]
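A hedged sketch of the inputs this method expects, assuming `mapper` is a `DatasetMapper` instance: `db_config` is forwarded verbatim to `psycopg2.connect`, so any standard connection keyword works (the values below are placeholders), and anything after the first `|` in a column comment is dropped.

```python
# Placeholder credentials; keys are standard psycopg2.connect() keywords.
db_config = {
    "dbname": "analytics",
    "user": "readonly",
    "password": "secret",
    "host": "db.example.com",
    "port": 5432,
}

# Given: COMMENT ON COLUMN sales.orders.amt IS 'Order amount|internal note';
# the result would be {"amt": "Order amount"} - the part after '|' is stripped.
comments = mapper.get_postgres_comments(db_config, table_name="orders", table_schema="sales")
```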
# [DEF:DatasetMapper.load_excel_mappings:Function]
# @PURPOSE: Загружает меппинги 'column_name' -> 'verbose_name' из XLSX файла.
# @PRE: `file_path` должен быть валидным путем к XLSX файлу с колонками 'column_name' и 'verbose_name'.
# @POST: Возвращается словарь с меппингами.
# @THROW: Exception - При ошибках чтения файла или парсинга.
# @PARAM: file_path (str) - Путь к XLSX файлу.
# @RETURN: Dict[str, str] - Словарь с меппингами.
def load_excel_mappings(self, file_path: str) -> Dict[str, str]:
self.logger.info("[load_excel_mappings][Enter] Loading mappings from %s.", file_path)
try:
df = pd.read_excel(file_path)
mappings = df.set_index('column_name')['verbose_name'].to_dict()
self.logger.info("[load_excel_mappings][Success] Loaded %d mappings.", len(mappings))
return mappings
except Exception as e:
self.logger.error("[load_excel_mappings][Failure] %s", e, exc_info=True)
raise
# [/DEF:DatasetMapper.load_excel_mappings]
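The expected spreadsheet layout, shown as a hedged sketch built with pandas (writing the file requires an Excel engine such as openpyxl; the file name is a placeholder and `mapper` is assumed to be a `DatasetMapper`):

```python
import pandas as pd

# Two columns are required: 'column_name' (physical name) and 'verbose_name' (display label).
pd.DataFrame(
    {
        "column_name": ["amt", "created_at"],
        "verbose_name": ["Order amount", "Created (UTC)"],
    }
).to_excel("mappings.xlsx", index=False)

mappings = mapper.load_excel_mappings("mappings.xlsx")
# -> {"amt": "Order amount", "created_at": "Created (UTC)"}
```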
# [DEF:DatasetMapper.run_mapping:Function]
# @PURPOSE: Основная функция для выполнения меппинга и обновления verbose_map датасета в Superset.
# @RELATION: CALLS -> self.get_postgres_comments
# @RELATION: CALLS -> self.load_excel_mappings
# @RELATION: CALLS -> superset_client.get_dataset
# @RELATION: CALLS -> superset_client.update_dataset
# @PARAM: superset_client (SupersetClient) - Клиент Superset.
# @PARAM: dataset_id (int) - ID датасета для обновления.
# @PARAM: source (str) - Источник данных ('postgres', 'excel', 'both').
# @PARAM: postgres_config (Optional[Dict]) - Конфигурация для подключения к PostgreSQL.
# @PARAM: excel_path (Optional[str]) - Путь к XLSX файлу.
# @PARAM: table_name (Optional[str]) - Имя таблицы в PostgreSQL.
# @PARAM: table_schema (Optional[str]) - Схема таблицы в PostgreSQL.
def run_mapping(self, superset_client: SupersetClient, dataset_id: int, source: str, postgres_config: Optional[Dict] = None, excel_path: Optional[str] = None, table_name: Optional[str] = None, table_schema: Optional[str] = None):
self.logger.info("[run_mapping][Enter] Starting dataset mapping for ID %d from source '%s'.", dataset_id, source)
mappings: Dict[str, str] = {}
try:
if source in ['postgres', 'both']:
assert postgres_config and table_name and table_schema, "Postgres config is required."
mappings.update(self.get_postgres_comments(postgres_config, table_name, table_schema))
if source in ['excel', 'both']:
assert excel_path, "Excel path is required."
mappings.update(self.load_excel_mappings(excel_path))
if source not in ['postgres', 'excel', 'both']:
self.logger.error("[run_mapping][Failure] Invalid source: %s.", source)
return
dataset_response = superset_client.get_dataset(dataset_id)
dataset_data = dataset_response['result']
original_columns = dataset_data.get('columns', [])
updated_columns = []
changes_made = False
for column in original_columns:
col_name = column.get('column_name')
new_column = {
"column_name": col_name,
"id": column.get("id"),
"advanced_data_type": column.get("advanced_data_type"),
"description": column.get("description"),
"expression": column.get("expression"),
"extra": column.get("extra"),
"filterable": column.get("filterable"),
"groupby": column.get("groupby"),
"is_active": column.get("is_active"),
"is_dttm": column.get("is_dttm"),
"python_date_format": column.get("python_date_format"),
"type": column.get("type"),
"uuid": column.get("uuid"),
"verbose_name": column.get("verbose_name"),
}
new_column = {k: v for k, v in new_column.items() if v is not None}
if col_name in mappings:
mapping_value = mappings[col_name]
if isinstance(mapping_value, str) and new_column.get('verbose_name') != mapping_value:
new_column['verbose_name'] = mapping_value
changes_made = True
updated_columns.append(new_column)
updated_metrics = []
for metric in dataset_data.get("metrics", []):
new_metric = {
"id": metric.get("id"),
"metric_name": metric.get("metric_name"),
"expression": metric.get("expression"),
"verbose_name": metric.get("verbose_name"),
"description": metric.get("description"),
"d3format": metric.get("d3format"),
"currency": metric.get("currency"),
"extra": metric.get("extra"),
"warning_text": metric.get("warning_text"),
"metric_type": metric.get("metric_type"),
"uuid": metric.get("uuid"),
}
updated_metrics.append({k: v for k, v in new_metric.items() if v is not None})
if changes_made:
payload_for_update = {
"database_id": dataset_data.get("database", {}).get("id"),
"table_name": dataset_data.get("table_name"),
"schema": dataset_data.get("schema"),
"columns": updated_columns,
"owners": [owner["id"] for owner in dataset_data.get("owners", [])],
"metrics": updated_metrics,
"extra": dataset_data.get("extra"),
"description": dataset_data.get("description"),
"sql": dataset_data.get("sql"),
"cache_timeout": dataset_data.get("cache_timeout"),
"catalog": dataset_data.get("catalog"),
"default_endpoint": dataset_data.get("default_endpoint"),
"external_url": dataset_data.get("external_url"),
"fetch_values_predicate": dataset_data.get("fetch_values_predicate"),
"filter_select_enabled": dataset_data.get("filter_select_enabled"),
"is_managed_externally": dataset_data.get("is_managed_externally"),
"is_sqllab_view": dataset_data.get("is_sqllab_view"),
"main_dttm_col": dataset_data.get("main_dttm_col"),
"normalize_columns": dataset_data.get("normalize_columns"),
"offset": dataset_data.get("offset"),
"template_params": dataset_data.get("template_params"),
}
payload_for_update = {k: v for k, v in payload_for_update.items() if v is not None}
superset_client.update_dataset(dataset_id, payload_for_update)
self.logger.info("[run_mapping][Success] Dataset %d columns' verbose_name updated.", dataset_id)
else:
self.logger.info("[run_mapping][State] No changes in columns' verbose_name, skipping update.")
        except Exception as e:  # Exception already covers AssertionError and FileNotFoundError
self.logger.error("[run_mapping][Failure] %s", e, exc_info=True)
return
# [/DEF:DatasetMapper.run_mapping]
# [/DEF:DatasetMapper]
# [/DEF:superset_tool.utils.dataset_mapper]
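# --- Editor's note: illustrative usage sketch (not part of this file) ---
# A minimal caller-side example of driving DatasetMapper end to end. It assumes a
# client dictionary produced by setup_clients(); the environment key, dataset id,
# table names and credentials below are placeholders.
from superset_tool.utils.dataset_mapper import DatasetMapper
from superset_tool.utils.init_clients import setup_clients
from superset_tool.utils.logger import SupersetLogger

logger = SupersetLogger(name="dataset_mapping")
clients = setup_clients(logger)
mapper = DatasetMapper(logger)
mapper.run_mapping(
    superset_client=clients["dev"],        # environment key as configured in setup_clients
    dataset_id=42,                         # placeholder dataset id
    source="both",                         # merge PostgreSQL comments with the Excel mapping
    postgres_config={"dbname": "dwh", "host": "localhost",
                     "user": "reader", "password": "secret"},
    excel_path="column_mappings.xlsx",     # expects 'column_name' and 'verbose_name' columns
    table_name="sales_fact",
    table_schema="public",
)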

916
superset_tool/utils/fileio.py Normal file → Executable file
View File

@@ -1,458 +1,458 @@
# [DEF:superset_tool.utils.fileio:Module]
#
# @SEMANTICS: file, io, zip, yaml, temp, archive, utility
# @PURPOSE: Предоставляет набор утилит для управления файловыми операциями, включая работу с временными файлами, архивами ZIP, файлами YAML и очистку директорий.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> superset_tool.exceptions
# @RELATION: DEPENDS_ON -> superset_tool.utils.logger
# @RELATION: DEPENDS_ON -> pyyaml
# @PUBLIC_API: create_temp_file, remove_empty_directories, read_dashboard_from_disk, calculate_crc32, RetentionPolicy, archive_exports, save_and_unpack_dashboard, update_yamls, create_dashboard_export, sanitize_filename, get_filename_from_headers, consolidate_archive_folders
# [SECTION: IMPORTS]
import os
import re
import zipfile
from pathlib import Path
from typing import Any, Optional, Tuple, Dict, List, Union, LiteralString, Generator
from contextlib import contextmanager
import tempfile
from datetime import date, datetime
import glob
import shutil
import zlib
from dataclasses import dataclass
import yaml
from superset_tool.exceptions import InvalidZipFormatError
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:create_temp_file:Function]
# @PURPOSE: Контекстный менеджер для создания временного файла или директории с гарантированным удалением.
# @PARAM: content (Optional[bytes]) - Бинарное содержимое для записи во временный файл.
# @PARAM: suffix (str) - Суффикс ресурса. Если `.dir`, создается директория.
# @PARAM: mode (str) - Режим записи в файл (e.g., 'wb').
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
# @YIELDS: Path - Путь к временному ресурсу.
# @THROW: IOError - При ошибках создания ресурса.
@contextmanager
def create_temp_file(content: Optional[bytes] = None, suffix: str = ".zip", mode: str = 'wb', dry_run = False, logger: Optional[SupersetLogger] = None) -> Generator[Path, None, None]:
logger = logger or SupersetLogger(name="fileio")
resource_path = None
is_dir = suffix.startswith('.dir')
try:
if is_dir:
with tempfile.TemporaryDirectory(suffix=suffix) as temp_dir:
resource_path = Path(temp_dir)
logger.debug("[create_temp_file][State] Created temporary directory: %s", resource_path)
yield resource_path
else:
fd, temp_path_str = tempfile.mkstemp(suffix=suffix)
resource_path = Path(temp_path_str)
os.close(fd)
if content:
resource_path.write_bytes(content)
logger.debug("[create_temp_file][State] Created temporary file: %s", resource_path)
yield resource_path
finally:
if resource_path and resource_path.exists() and not dry_run:
try:
if resource_path.is_dir():
shutil.rmtree(resource_path)
logger.debug("[create_temp_file][Cleanup] Removed temporary directory: %s", resource_path)
else:
resource_path.unlink()
logger.debug("[create_temp_file][Cleanup] Removed temporary file: %s", resource_path)
except OSError as e:
logger.error("[create_temp_file][Failure] Error during cleanup of %s: %s", resource_path, e)
# [/DEF:create_temp_file]
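# --- Editor's note: illustrative usage sketch (not part of this file) ---
# create_temp_file() yields a Path and guarantees cleanup on exit (unless dry_run=True);
# a suffix starting with ".dir" switches it to a temporary directory. Payloads and
# file names below are placeholders.
from superset_tool.utils.fileio import create_temp_file

with create_temp_file(content=b"example payload", suffix=".bin") as tmp_file:
    print(tmp_file.read_bytes())           # the file exists only inside the block

with create_temp_file(suffix=".dir") as tmp_dir:
    (tmp_dir / "metadata.yaml").write_text("version: 1.0.0")
# both resources have been removed once the blocks exit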
# [DEF:remove_empty_directories:Function]
# @PURPOSE: Рекурсивно удаляет все пустые поддиректории, начиная с указанного пути.
# @PARAM: root_dir (str) - Путь к корневой директории для очистки.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
# @RETURN: int - Количество удаленных директорий.
def remove_empty_directories(root_dir: str, logger: Optional[SupersetLogger] = None) -> int:
logger = logger or SupersetLogger(name="fileio")
logger.info("[remove_empty_directories][Enter] Starting cleanup of empty directories in %s", root_dir)
removed_count = 0
if not os.path.isdir(root_dir):
logger.error("[remove_empty_directories][Failure] Directory not found: %s", root_dir)
return 0
for current_dir, _, _ in os.walk(root_dir, topdown=False):
if not os.listdir(current_dir):
try:
os.rmdir(current_dir)
removed_count += 1
logger.info("[remove_empty_directories][State] Removed empty directory: %s", current_dir)
except OSError as e:
logger.error("[remove_empty_directories][Failure] Failed to remove %s: %s", current_dir, e)
logger.info("[remove_empty_directories][Exit] Removed %d empty directories.", removed_count)
return removed_count
# [/DEF:remove_empty_directories]
# [DEF:read_dashboard_from_disk:Function]
# @PURPOSE: Читает бинарное содержимое файла с диска.
# @PARAM: file_path (str) - Путь к файлу.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
# @RETURN: Tuple[bytes, str] - Кортеж (содержимое, имя файла).
# @THROW: FileNotFoundError - Если файл не найден.
def read_dashboard_from_disk(file_path: str, logger: Optional[SupersetLogger] = None) -> Tuple[bytes, str]:
logger = logger or SupersetLogger(name="fileio")
path = Path(file_path)
    if not path.is_file():
        raise FileNotFoundError(f"Dashboard file not found: {file_path}")
logger.info("[read_dashboard_from_disk][Enter] Reading file: %s", file_path)
content = path.read_bytes()
if not content:
logger.warning("[read_dashboard_from_disk][Warning] File is empty: %s", file_path)
return content, path.name
# [/DEF:read_dashboard_from_disk]
# [DEF:calculate_crc32:Function]
# @PURPOSE: Вычисляет контрольную сумму CRC32 для файла.
# @PARAM: file_path (Path) - Путь к файлу.
# @RETURN: str - 8-значное шестнадцатеричное представление CRC32.
# @THROW: IOError - При ошибках чтения файла.
def calculate_crc32(file_path: Path) -> str:
with open(file_path, 'rb') as f:
crc32_value = zlib.crc32(f.read())
return f"{crc32_value:08x}"
# [/DEF:calculate_crc32]
# [DEF:RetentionPolicy:DataClass]
# @PURPOSE: Определяет политику хранения для архивов (ежедневные, еженедельные, ежемесячные).
@dataclass
class RetentionPolicy:
daily: int = 7
weekly: int = 4
monthly: int = 12
# [/DEF:RetentionPolicy]
# [DEF:archive_exports:Function]
# @PURPOSE: Управляет архивом экспортированных файлов, применяя политику хранения и дедупликацию.
# @RELATION: CALLS -> apply_retention_policy
# @RELATION: CALLS -> calculate_crc32
# @PARAM: output_dir (str) - Директория с архивами.
# @PARAM: policy (RetentionPolicy) - Политика хранения.
# @PARAM: deduplicate (bool) - Флаг для включения удаления дубликатов по CRC32.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
def archive_exports(output_dir: str, policy: RetentionPolicy, deduplicate: bool = False, logger: Optional[SupersetLogger] = None) -> None:
logger = logger or SupersetLogger(name="fileio")
output_path = Path(output_dir)
if not output_path.is_dir():
logger.warning("[archive_exports][Skip] Archive directory not found: %s", output_dir)
return
logger.info("[archive_exports][Enter] Managing archive in %s", output_dir)
# 1. Collect all zip files
zip_files = list(output_path.glob("*.zip"))
if not zip_files:
logger.info("[archive_exports][State] No zip files found in %s", output_dir)
return
# 2. Deduplication
if deduplicate:
logger.info("[archive_exports][State] Starting deduplication...")
checksums = {}
files_to_remove = []
# Sort by modification time (newest first) to keep the latest version
zip_files.sort(key=lambda f: f.stat().st_mtime, reverse=True)
for file_path in zip_files:
try:
crc = calculate_crc32(file_path)
if crc in checksums:
files_to_remove.append(file_path)
logger.debug("[archive_exports][State] Duplicate found: %s (same as %s)", file_path.name, checksums[crc].name)
else:
checksums[crc] = file_path
except Exception as e:
logger.error("[archive_exports][Failure] Failed to calculate CRC32 for %s: %s", file_path, e)
for f in files_to_remove:
try:
f.unlink()
zip_files.remove(f)
logger.info("[archive_exports][State] Removed duplicate: %s", f.name)
except OSError as e:
logger.error("[archive_exports][Failure] Failed to remove duplicate %s: %s", f, e)
# 3. Retention Policy
files_with_dates = []
for file_path in zip_files:
# Try to extract date from filename
# Pattern: ..._YYYYMMDD_HHMMSS.zip or ..._YYYYMMDD.zip
        match = re.search(r'_(\d{8})[_.]', file_path.name)  # matches both ..._YYYYMMDD_HHMMSS.zip and ..._YYYYMMDD.zip
file_date = None
if match:
try:
date_str = match.group(1)
file_date = datetime.strptime(date_str, "%Y%m%d").date()
except ValueError:
pass
if not file_date:
# Fallback to modification time
file_date = datetime.fromtimestamp(file_path.stat().st_mtime).date()
files_with_dates.append((file_path, file_date))
files_to_keep = apply_retention_policy(files_with_dates, policy, logger)
for file_path, _ in files_with_dates:
if file_path not in files_to_keep:
try:
file_path.unlink()
logger.info("[archive_exports][State] Removed by retention policy: %s", file_path.name)
except OSError as e:
logger.error("[archive_exports][Failure] Failed to remove %s: %s", file_path, e)
# [/DEF:archive_exports]
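# --- Editor's note: illustrative usage sketch (not part of this file) ---
# A typical call after a batch export: drop byte-identical archives first, then
# prune the rest by age. The directory and policy values are placeholders.
from superset_tool.utils.fileio import RetentionPolicy, archive_exports

archive_exports(
    output_dir="exports/dashboards",
    policy=RetentionPolicy(daily=14, weekly=8, monthly=6),
    deduplicate=True,
)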
# [DEF:apply_retention_policy:Function]
# @PURPOSE: (Helper) Применяет политику хранения к списку файлов, возвращая те, что нужно сохранить.
# @PARAM: files_with_dates (List[Tuple[Path, date]]) - Список файлов с датами.
# @PARAM: policy (RetentionPolicy) - Политика хранения.
# @PARAM: logger (SupersetLogger) - Логгер.
# @RETURN: set - Множество путей к файлам, которые должны быть сохранены.
def apply_retention_policy(files_with_dates: List[Tuple[Path, date]], policy: RetentionPolicy, logger: SupersetLogger) -> set:
# Сортируем по дате (от новой к старой)
sorted_files = sorted(files_with_dates, key=lambda x: x[1], reverse=True)
# Словарь для хранения файлов по категориям
daily_files = []
weekly_files = []
monthly_files = []
today = date.today()
for file_path, file_date in sorted_files:
# Ежедневные
if (today - file_date).days < policy.daily:
daily_files.append(file_path)
# Еженедельные
elif (today - file_date).days < policy.weekly * 7:
weekly_files.append(file_path)
# Ежемесячные
elif (today - file_date).days < policy.monthly * 30:
monthly_files.append(file_path)
# Возвращаем множество файлов, которые нужно сохранить
files_to_keep = set()
files_to_keep.update(daily_files)
files_to_keep.update(weekly_files[:policy.weekly])
files_to_keep.update(monthly_files[:policy.monthly])
logger.debug("[apply_retention_policy][State] Keeping %d files according to retention policy", len(files_to_keep))
return files_to_keep
# [/DEF:apply_retention_policy]
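# --- Editor's note: illustrative worked example (not part of this file) ---
# With the default RetentionPolicy(daily=7, weekly=4, monthly=12) a file is bucketed
# by its age in days: <7 -> daily (all kept), <28 -> weekly (4 newest kept),
# <360 -> monthly (12 newest kept); anything older is discarded. File names are placeholders.
from datetime import date, timedelta
from pathlib import Path
from superset_tool.utils.fileio import RetentionPolicy, apply_retention_policy
from superset_tool.utils.logger import SupersetLogger

today = date.today()
candidates = [
    (Path("export_a.zip"), today - timedelta(days=2)),    # daily bucket -> kept
    (Path("export_b.zip"), today - timedelta(days=10)),   # weekly bucket
    (Path("export_c.zip"), today - timedelta(days=400)),  # outside every bucket -> dropped
]
kept = apply_retention_policy(candidates, RetentionPolicy(), SupersetLogger(name="demo"))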
# [DEF:save_and_unpack_dashboard:Function]
# @PURPOSE: Сохраняет бинарное содержимое ZIP-архива на диск и опционально распаковывает его.
# @PARAM: zip_content (bytes) - Содержимое ZIP-архива.
# @PARAM: output_dir (Union[str, Path]) - Директория для сохранения.
# @PARAM: unpack (bool) - Флаг, нужно ли распаковывать архив.
# @PARAM: original_filename (Optional[str]) - Исходное имя файла для сохранения.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
# @RETURN: Tuple[Path, Optional[Path]] - Путь к ZIP-файлу и, если применимо, путь к директории с распаковкой.
# @THROW: InvalidZipFormatError - При ошибке формата ZIP.
def save_and_unpack_dashboard(zip_content: bytes, output_dir: Union[str, Path], unpack: bool = False, original_filename: Optional[str] = None, logger: Optional[SupersetLogger] = None) -> Tuple[Path, Optional[Path]]:
logger = logger or SupersetLogger(name="fileio")
logger.info("[save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: %s", unpack)
try:
output_path = Path(output_dir)
output_path.mkdir(parents=True, exist_ok=True)
zip_name = sanitize_filename(original_filename) if original_filename else f"dashboard_export_{datetime.now().strftime('%Y%m%d_%H%M%S')}.zip"
zip_path = output_path / zip_name
zip_path.write_bytes(zip_content)
logger.info("[save_and_unpack_dashboard][State] Dashboard saved to: %s", zip_path)
if unpack:
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
zip_ref.extractall(output_path)
logger.info("[save_and_unpack_dashboard][State] Dashboard unpacked to: %s", output_path)
return zip_path, output_path
return zip_path, None
except zipfile.BadZipFile as e:
logger.error("[save_and_unpack_dashboard][Failure] Invalid ZIP archive: %s", e)
raise InvalidZipFormatError(f"Invalid ZIP file: {e}") from e
# [/DEF:save_and_unpack_dashboard]
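# --- Editor's note: illustrative usage sketch (not part of this file) ---
# zip_content would normally be the body of a Superset export response; here it is
# read from a placeholder file on disk.
from pathlib import Path
from superset_tool.utils.fileio import save_and_unpack_dashboard

zip_bytes = Path("downloads/sales_dashboard_20251220.zip").read_bytes()  # placeholder path
zip_path, unpacked_dir = save_and_unpack_dashboard(
    zip_content=zip_bytes,
    output_dir="exports/dashboards",
    unpack=True,
    original_filename="sales_dashboard_20251220.zip",
)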
# [DEF:update_yamls:Function]
# @PURPOSE: Обновляет конфигурации в YAML-файлах, заменяя значения или применяя regex.
# @RELATION: CALLS -> _update_yaml_file
# @THROW: FileNotFoundError - Если `path` не существует.
# @PARAM: db_configs (Optional[List[Dict]]) - Список конфигураций для замены.
# @PARAM: path (str) - Путь к директории с YAML файлами.
# @PARAM: regexp_pattern (Optional[LiteralString]) - Паттерн для поиска.
# @PARAM: replace_string (Optional[LiteralString]) - Строка для замены.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
def update_yamls(db_configs: Optional[List[Dict[str, Any]]] = None, path: str = "dashboards", regexp_pattern: Optional[LiteralString] = None, replace_string: Optional[LiteralString] = None, logger: Optional[SupersetLogger] = None) -> None:
logger = logger or SupersetLogger(name="fileio")
logger.info("[update_yamls][Enter] Starting YAML configuration update.")
dir_path = Path(path)
    if not dir_path.is_dir():
        raise FileNotFoundError(f"Path {path} does not exist or is not a directory")
configs: List[Dict[str, Any]] = db_configs or []
for file_path in dir_path.rglob("*.yaml"):
_update_yaml_file(file_path, configs, regexp_pattern, replace_string, logger)
# [/DEF:update_yamls]
# [DEF:_update_yaml_file:Function]
# @PURPOSE: (Helper) Обновляет один YAML файл.
# @PARAM: file_path (Path) - Путь к файлу.
# @PARAM: db_configs (List[Dict]) - Конфигурации.
# @PARAM: regexp_pattern (Optional[str]) - Паттерн.
# @PARAM: replace_string (Optional[str]) - Замена.
# @PARAM: logger (SupersetLogger) - Логгер.
def _update_yaml_file(file_path: Path, db_configs: List[Dict[str, Any]], regexp_pattern: Optional[str], replace_string: Optional[str], logger: SupersetLogger) -> None:
# Читаем содержимое файла
try:
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
except Exception as e:
logger.error("[_update_yaml_file][Failure] Failed to read %s: %s", file_path, e)
return
# Если задан pattern и replace_string, применяем замену по регулярному выражению
if regexp_pattern and replace_string:
try:
new_content = re.sub(regexp_pattern, replace_string, content)
if new_content != content:
with open(file_path, 'w', encoding='utf-8') as f:
f.write(new_content)
logger.info("[_update_yaml_file][State] Updated %s using regex pattern", file_path)
except Exception as e:
logger.error("[_update_yaml_file][Failure] Error applying regex to %s: %s", file_path, e)
# Если заданы конфигурации, заменяем значения (поддержка old/new)
if db_configs:
try:
# Прямой текстовый заменитель для старых/новых значений, чтобы сохранить структуру файла
modified_content = content
for cfg in db_configs:
# Ожидаем структуру: {'old': {...}, 'new': {...}}
old_cfg = cfg.get('old', {})
new_cfg = cfg.get('new', {})
for key, old_val in old_cfg.items():
if key in new_cfg:
new_val = new_cfg[key]
# Заменяем только точные совпадения старого значения в тексте YAML, используя ключ для контекста
if isinstance(old_val, str):
# Ищем паттерн: key: "value" или key: value
key_pattern = re.escape(key)
val_pattern = re.escape(old_val)
# Группы: 1=ключ+разделитель, 2=открывающая кавычка (опц), 3=значение, 4=закрывающая кавычка (опц)
pattern = rf'({key_pattern}\s*:\s*)(["\']?)({val_pattern})(["\']?)'
# Функция замены, сохраняющая кавычки если они были
def replacer(match):
prefix = match.group(1)
quote_open = match.group(2)
quote_close = match.group(4)
return f"{prefix}{quote_open}{new_val}{quote_close}"
modified_content = re.sub(pattern, replacer, modified_content)
logger.info("[_update_yaml_file][State] Replaced '%s' with '%s' for key %s in %s", old_val, new_val, key, file_path)
# Записываем обратно изменённый контент без парсинга YAML, сохраняем оригинальное форматирование
with open(file_path, 'w', encoding='utf-8') as f:
f.write(modified_content)
except Exception as e:
logger.error("[_update_yaml_file][Failure] Error performing raw replacement in %s: %s", file_path, e)
# [/DEF:_update_yaml_file]
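# --- Editor's note: illustrative usage sketch (not part of this file) ---
# db_configs entries follow the {'old': {...}, 'new': {...}} shape consumed by
# _update_yaml_file; the URIs and database names below are placeholders for a
# typical DEV -> PROD rewrite of exported dashboard YAMLs.
from superset_tool.utils.fileio import update_yamls

update_yamls(
    db_configs=[{
        "old": {"sqlalchemy_uri": "postgresql://dev-host/dwh", "database_name": "dwh_dev"},
        "new": {"sqlalchemy_uri": "postgresql://prod-host/dwh", "database_name": "dwh_prod"},
    }],
    path="dashboards",
)
# Or a plain regex rewrite across every *.yaml under the directory:
update_yamls(regexp_pattern=r"devta\.bi\.dwh\.rusal\.com", replace_string="prodta.bi.dwh.rusal.com", path="dashboards")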
# [DEF:create_dashboard_export:Function]
# @PURPOSE: Создает ZIP-архив из указанных исходных путей.
# @PARAM: zip_path (Union[str, Path]) - Путь для сохранения ZIP архива.
# @PARAM: source_paths (List[Union[str, Path]]) - Список исходных путей для архивации.
# @PARAM: exclude_extensions (Optional[List[str]]) - Список расширений для исключения.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
# @RETURN: bool - `True` при успехе, `False` при ошибке.
def create_dashboard_export(zip_path: Union[str, Path], source_paths: List[Union[str, Path]], exclude_extensions: Optional[List[str]] = None, logger: Optional[SupersetLogger] = None) -> bool:
logger = logger or SupersetLogger(name="fileio")
logger.info("[create_dashboard_export][Enter] Packing dashboard: %s -> %s", source_paths, zip_path)
try:
exclude_ext = [ext.lower() for ext in exclude_extensions or []]
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
for src_path_str in source_paths:
src_path = Path(src_path_str)
                assert src_path.exists(), f"Path not found: {src_path}"
for item in src_path.rglob('*'):
if item.is_file() and item.suffix.lower() not in exclude_ext:
arcname = item.relative_to(src_path.parent)
zipf.write(item, arcname)
logger.info("[create_dashboard_export][Exit] Archive created: %s", zip_path)
return True
except (IOError, zipfile.BadZipFile, AssertionError) as e:
logger.error("[create_dashboard_export][Failure] Error: %s", e, exc_info=True)
return False
# [/DEF:create_dashboard_export]
# [DEF:sanitize_filename:Function]
# @PURPOSE: Очищает строку от символов, недопустимых в именах файлов.
# @PARAM: filename (str) - Исходное имя файла.
# @RETURN: str - Очищенная строка.
def sanitize_filename(filename: str) -> str:
return re.sub(r'[\\/*?:"<>|]', "_", filename).strip()
# [/DEF:sanitize_filename]
# [DEF:get_filename_from_headers:Function]
# @PURPOSE: Извлекает имя файла из HTTP заголовка 'Content-Disposition'.
# @PARAM: headers (dict) - Словарь HTTP заголовков.
# @RETURN: Optional[str] - Имя файла или `None`.
def get_filename_from_headers(headers: dict) -> Optional[str]:
content_disposition = headers.get("Content-Disposition", "")
if match := re.search(r'filename="?([^"]+)"?', content_disposition):
return match.group(1).strip()
return None
# [/DEF:get_filename_from_headers]
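# --- Editor's note: illustrative example (not part of this file) ---
# Header parsing in practice; the file name below is a placeholder.
from superset_tool.utils.fileio import get_filename_from_headers

headers = {"Content-Disposition": 'attachment; filename="dashboard_export_20251220.zip"'}
assert get_filename_from_headers(headers) == "dashboard_export_20251220.zip"
assert get_filename_from_headers({}) is None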
# [DEF:consolidate_archive_folders:Function]
# @PURPOSE: Консолидирует директории архивов на основе общего слага в имени.
# @THROW: TypeError, ValueError - Если `root_directory` невалиден.
# @PARAM: root_directory (Path) - Корневая директория для консолидации.
# @PARAM: logger (Optional[SupersetLogger]) - Экземпляр логгера.
def consolidate_archive_folders(root_directory: Path, logger: Optional[SupersetLogger] = None) -> None:
logger = logger or SupersetLogger(name="fileio")
assert isinstance(root_directory, Path), "root_directory must be a Path object."
assert root_directory.is_dir(), "root_directory must be an existing directory."
logger.info("[consolidate_archive_folders][Enter] Consolidating archives in %s", root_directory)
# Собираем все директории с архивами
archive_dirs = []
for item in root_directory.iterdir():
if item.is_dir():
# Проверяем, есть ли в директории ZIP-архивы
if any(item.glob("*.zip")):
archive_dirs.append(item)
# Группируем по слагу (части имени до первого '_')
slug_groups = {}
for dir_path in archive_dirs:
dir_name = dir_path.name
slug = dir_name.split('_')[0] if '_' in dir_name else dir_name
if slug not in slug_groups:
slug_groups[slug] = []
slug_groups[slug].append(dir_path)
# Для каждой группы консолидируем
for slug, dirs in slug_groups.items():
if len(dirs) <= 1:
continue
# Создаем целевую директорию
target_dir = root_directory / slug
target_dir.mkdir(exist_ok=True)
logger.info("[consolidate_archive_folders][State] Consolidating %d directories under %s", len(dirs), target_dir)
# Перемещаем содержимое
for source_dir in dirs:
if source_dir == target_dir:
continue
for item in source_dir.iterdir():
dest_item = target_dir / item.name
try:
                    shutil.move(str(item), str(dest_item))  # shutil.move handles files and directories alike
except Exception as e:
logger.error("[consolidate_archive_folders][Failure] Failed to move %s to %s: %s", item, dest_item, e)
# Удаляем исходную директорию
try:
source_dir.rmdir()
logger.info("[consolidate_archive_folders][State] Removed source directory: %s", source_dir)
except Exception as e:
logger.error("[consolidate_archive_folders][Failure] Failed to remove source directory %s: %s", source_dir, e)
# [/DEF:consolidate_archive_folders]
# [/DEF:superset_tool.utils.fileio]

178
superset_tool/utils/init_clients.py Normal file → Executable file
View File

@@ -1,68 +1,110 @@
# [DEF:superset_tool.utils.init_clients:Module]
#
# @SEMANTICS: utility, factory, client, initialization, configuration
# @PURPOSE: Централизованно инициализирует клиенты Superset для различных окружений (DEV, PROD, SBX, PREPROD), используя `keyring` для безопасного доступа к паролям.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> superset_tool.models
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> keyring
# @PUBLIC_API: setup_clients
# [SECTION: IMPORTS]
import keyring
from typing import Dict
from superset_tool.models import SupersetConfig
from superset_tool.client import SupersetClient
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:setup_clients:Function]
# @PURPOSE: Инициализирует и возвращает словарь клиентов `SupersetClient` для всех предопределенных окружений.
# @PRE: `keyring` должен содержать пароли для систем "dev migrate", "prod migrate", "sbx migrate", "preprod migrate".
# @PRE: `logger` должен быть валидным экземпляром `SupersetLogger`.
# @POST: Возвращает словарь с инициализированными клиентами.
# @THROW: ValueError - Если пароль для окружения не найден в `keyring`.
# @THROW: Exception - При любых других ошибках инициализации.
# @RELATION: CREATES_INSTANCE_OF -> SupersetConfig
# @RELATION: CREATES_INSTANCE_OF -> SupersetClient
# @PARAM: logger (SupersetLogger) - Экземпляр логгера для записи процесса.
# @RETURN: Dict[str, SupersetClient] - Словарь, где ключ - имя окружения, значение - `SupersetClient`.
def setup_clients(logger: SupersetLogger) -> Dict[str, SupersetClient]:
logger.info("[setup_clients][Enter] Starting Superset clients initialization.")
clients = {}
environments = {
"dev": "https://devta.bi.dwh.rusal.com/api/v1",
"prod": "https://prodta.bi.dwh.rusal.com/api/v1",
"sbx": "https://sandboxta.bi.dwh.rusal.com/api/v1",
"preprod": "https://preprodta.bi.dwh.rusal.com/api/v1",
"uatta": "https://uatta.bi.dwh.rusal.com/api/v1",
"dev5":"https://dev.bi.dwh.rusal.com/api/v1"
}
try:
for env_name, base_url in environments.items():
logger.debug("[setup_clients][State] Creating config for environment: %s", env_name.upper())
password = keyring.get_password("system", f"{env_name} migrate")
if not password:
raise ValueError(f"Пароль для '{env_name} migrate' не найден в keyring.")
config = SupersetConfig(
env=env_name,
base_url=base_url,
auth={"provider": "db", "username": "migrate_user", "password": password, "refresh": True},
verify_ssl=False
)
clients[env_name] = SupersetClient(config, logger)
logger.debug("[setup_clients][State] Client for %s created successfully.", env_name.upper())
logger.info("[setup_clients][Exit] All clients (%s) initialized successfully.", ', '.join(clients.keys()))
return clients
except Exception as e:
logger.critical("[setup_clients][Failure] Critical error during client initialization: %s", e, exc_info=True)
raise
# [/DEF:setup_clients]
# [/DEF:superset_tool.utils.init_clients]
# [DEF:superset_tool.utils.init_clients:Module]
#
# @SEMANTICS: utility, factory, client, initialization, configuration
# @PURPOSE: Централизованно инициализирует клиенты Superset для различных окружений (DEV, PROD, SBX, PREPROD), используя `keyring` для безопасного доступа к паролям.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> superset_tool.models
# @RELATION: DEPENDS_ON -> superset_tool.client
# @RELATION: DEPENDS_ON -> keyring
# @PUBLIC_API: setup_clients
# [SECTION: IMPORTS]
import keyring
import os
from typing import Dict, List, Optional, Any
from superset_tool.models import SupersetConfig
from superset_tool.client import SupersetClient
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:setup_clients:Function]
# @PURPOSE: Инициализирует и возвращает словарь клиентов `SupersetClient`.
# @PRE: `logger` должен быть валидным экземпляром `SupersetLogger`.
# @POST: Возвращает словарь с инициализированными клиентами.
# @THROW: Exception - При любых других ошибках инициализации.
# @RELATION: CREATES_INSTANCE_OF -> SupersetConfig
# @RELATION: CREATES_INSTANCE_OF -> SupersetClient
# @PARAM: logger (SupersetLogger) - Экземпляр логгера для записи процесса.
# @PARAM: custom_envs (Optional[List[Any]]) - Optional list of custom environment settings (dicts or objects with name/url/username/password).
# @RETURN: Dict[str, SupersetClient] - Словарь, где ключ - имя окружения, значение - `SupersetClient`.
def setup_clients(logger: SupersetLogger, custom_envs: Optional[List[Any]] = None) -> Dict[str, SupersetClient]:
logger.info("[setup_clients][Enter] Starting Superset clients initialization.")
clients = {}
try:
# Try to load from ConfigManager if available
try:
from backend.src.dependencies import get_config_manager
config_manager = get_config_manager()
envs = config_manager.get_environments()
if envs:
logger.info("[setup_clients][Action] Loading environments from ConfigManager")
for env in envs:
logger.debug("[setup_clients][State] Creating config for environment: %s", env.name)
config = SupersetConfig(
env=env.name,
base_url=env.url,
auth={"provider": "db", "username": env.username, "password": env.password, "refresh": "true"},
verify_ssl=False,
timeout=30,
logger=logger
)
clients[env.name] = SupersetClient(config, logger)
return clients
        except Exception as e:
logger.debug(f"[setup_clients][State] ConfigManager not available or failed: {e}")
if custom_envs:
for env in custom_envs:
# Handle both dict and object (like Pydantic model)
env_name = str(getattr(env, 'name', env.get('name') if isinstance(env, dict) else "unknown"))
base_url = str(getattr(env, 'url', env.get('url') if isinstance(env, dict) else ""))
username = str(getattr(env, 'username', env.get('username') if isinstance(env, dict) else ""))
password = str(getattr(env, 'password', env.get('password') if isinstance(env, dict) else ""))
logger.debug("[setup_clients][State] Creating config for custom environment: %s", env_name)
config = SupersetConfig(
env=env_name,
base_url=base_url,
auth={"provider": "db", "username": username, "password": password, "refresh": "true"},
verify_ssl=False,
timeout=30,
logger=logger
)
clients[env_name] = SupersetClient(config, logger)
else:
# Fallback to hardcoded environments with keyring
environments = {
"dev": "https://devta.bi.dwh.rusal.com/api/v1",
"prod": "https://prodta.bi.dwh.rusal.com/api/v1",
"sbx": "https://sandboxta.bi.dwh.rusal.com/api/v1",
"preprod": "https://preprodta.bi.dwh.rusal.com/api/v1",
"uatta": "https://uatta.bi.dwh.rusal.com/api/v1",
"dev5":"https://dev.bi.dwh.rusal.com/api/v1"
}
for env_name, base_url in environments.items():
logger.debug("[setup_clients][State] Creating config for environment: %s", env_name.upper())
password = keyring.get_password("system", f"{env_name} migrate")
if not password:
logger.warning(f"Пароль для '{env_name} migrate' не найден в keyring. Пропускаем.")
continue
config = SupersetConfig(
env=env_name,
base_url=base_url,
auth={"provider": "db", "username": "migrate_user", "password": password, "refresh": "true"},
verify_ssl=False,
timeout=30,
logger=logger
)
clients[env_name] = SupersetClient(config, logger)
logger.info("[setup_clients][Exit] All clients (%s) initialized successfully.", ', '.join(clients.keys()))
return clients
except Exception as e:
logger.critical("[setup_clients][Failure] Critical error during client initialization: %s", e, exc_info=True)
raise
# [/DEF:setup_clients]
# [/DEF:superset_tool.utils.init_clients]
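For orientation, a minimal usage sketch of the reworked factory (an illustration, not code from this commit): the dictionary shape mirrors what the `custom_envs` branch reads via `getattr`/`.get`, and the URL and credentials below are placeholders.

from pathlib import Path
from superset_tool.utils.logger import SupersetLogger
from superset_tool.utils.init_clients import setup_clients

logger = SupersetLogger(name="migration", log_dir=Path("logs"))

# Passing custom_envs skips both the ConfigManager lookup and the keyring fallback.
custom_envs = [
    {
        "name": "dev",                                     # placeholder environment name
        "url": "https://superset.example.com/api/v1",      # placeholder base URL
        "username": "migrate_user",
        "password": "secret",                              # placeholder credential
    }
]

clients = setup_clients(logger, custom_envs=custom_envs)
for env_name, client in clients.items():
    logger.info("Client ready for %s: %s", env_name, client)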

206
superset_tool/utils/logger.py Normal file → Executable file
View File

@@ -1,103 +1,103 @@
# [DEF:superset_tool.utils.logger:Module]
#
# @SEMANTICS: logging, utility, infrastructure, wrapper
# @PURPOSE: Предоставляет универсальную обёртку над стандартным `logging.Logger` для унифицированного создания и управления логгерами с выводом в консоль и/или файл.
# @LAYER: Infra
# @RELATION: WRAPS -> logging.Logger
#
# @INVARIANT: Логгер всегда должен иметь имя.
# @PUBLIC_API: SupersetLogger
# [SECTION: IMPORTS]
import logging
import sys
from datetime import datetime
from pathlib import Path
from typing import Optional, Any, Mapping
# [/SECTION]
# [DEF:SupersetLogger:Class]
# @PURPOSE: Обёртка над `logging.Logger`, которая упрощает конфигурацию и использование логгеров.
# @RELATION: WRAPS -> logging.Logger
class SupersetLogger:
# [DEF:SupersetLogger.__init__:Function]
# @PURPOSE: Конфигурирует и инициализирует логгер, добавляя обработчики для файла и/или консоли.
# @PRE: Если log_dir указан, путь должен быть валидным (или создаваемым).
# @POST: `self.logger` готов к использованию с настроенными обработчиками.
# @PARAM: name (str) - Идентификатор логгера.
# @PARAM: log_dir (Optional[Path]) - Директория для сохранения лог-файлов.
# @PARAM: level (int) - Уровень логирования (e.g., `logging.INFO`).
# @PARAM: console (bool) - Флаг для включения вывода в консоль.
def __init__(self, name: str = "superset_tool", log_dir: Optional[Path] = None, level: int = logging.INFO, console: bool = True) -> None:
self.logger = logging.getLogger(name)
self.logger.setLevel(level)
self.logger.propagate = False
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
if self.logger.hasHandlers():
self.logger.handlers.clear()
if log_dir:
log_dir.mkdir(parents=True, exist_ok=True)
timestamp = datetime.now().strftime("%Y%m%d")
file_handler = logging.FileHandler(log_dir / f"{name}_{timestamp}.log", encoding="utf-8")
file_handler.setFormatter(formatter)
self.logger.addHandler(file_handler)
if console:
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setFormatter(formatter)
self.logger.addHandler(console_handler)
# [/DEF:SupersetLogger.__init__]
# [DEF:SupersetLogger._log:Function]
# @PURPOSE: (Helper) Универсальный метод для вызова соответствующего уровня логирования.
# @PARAM: level_method (Any) - Метод логгера (info, debug, etc).
# @PARAM: msg (str) - Сообщение.
# @PARAM: args (Any) - Аргументы форматирования.
# @PARAM: extra (Optional[Mapping[str, Any]]) - Дополнительные данные.
# @PARAM: exc_info (bool) - Добавлять ли информацию об исключении.
def _log(self, level_method: Any, msg: str, *args: Any, extra: Optional[Mapping[str, Any]] = None, exc_info: bool = False) -> None:
level_method(msg, *args, extra=extra, exc_info=exc_info)
# [/DEF:SupersetLogger._log]
# [DEF:SupersetLogger.info:Function]
# @PURPOSE: Записывает сообщение уровня INFO.
def info(self, msg: str, *args: Any, extra: Optional[Mapping[str, Any]] = None, exc_info: bool = False) -> None:
self._log(self.logger.info, msg, *args, extra=extra, exc_info=exc_info)
# [/DEF:SupersetLogger.info]
# [DEF:SupersetLogger.debug:Function]
# @PURPOSE: Записывает сообщение уровня DEBUG.
def debug(self, msg: str, *args: Any, extra: Optional[Mapping[str, Any]] = None, exc_info: bool = False) -> None:
self._log(self.logger.debug, msg, *args, extra=extra, exc_info=exc_info)
# [/DEF:SupersetLogger.debug]
# [DEF:SupersetLogger.warning:Function]
# @PURPOSE: Записывает сообщение уровня WARNING.
def warning(self, msg: str, *args: Any, extra: Optional[Mapping[str, Any]] = None, exc_info: bool = False) -> None:
self._log(self.logger.warning, msg, *args, extra=extra, exc_info=exc_info)
# [/DEF:SupersetLogger.warning]
# [DEF:SupersetLogger.error:Function]
# @PURPOSE: Записывает сообщение уровня ERROR.
def error(self, msg: str, *args: Any, extra: Optional[Mapping[str, Any]] = None, exc_info: bool = False) -> None:
self._log(self.logger.error, msg, *args, extra=extra, exc_info=exc_info)
# [/DEF:SupersetLogger.error]
# [DEF:SupersetLogger.critical:Function]
# @PURPOSE: Записывает сообщение уровня CRITICAL.
def critical(self, msg: str, *args: Any, extra: Optional[Mapping[str, Any]] = None, exc_info: bool = False) -> None:
self._log(self.logger.critical, msg, *args, extra=extra, exc_info=exc_info)
# [/DEF:SupersetLogger.critical]
# [DEF:SupersetLogger.exception:Function]
# @PURPOSE: Записывает сообщение уровня ERROR вместе с трассировкой стека текущего исключения.
def exception(self, msg: str, *args: Any, **kwargs: Any) -> None:
self.logger.exception(msg, *args, **kwargs)
# [/DEF:SupersetLogger.exception]
# [/DEF:SupersetLogger]
# [/DEF:superset_tool.utils.logger]
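A short usage sketch of the wrapper above, assuming an arbitrary `logs/` directory; it exercises the file handler, the printf-style arguments and the `exception` helper exactly as defined:

import logging
from pathlib import Path
from superset_tool.utils.logger import SupersetLogger

log = SupersetLogger(name="export_job", log_dir=Path("logs"), level=logging.DEBUG)

log.info("[export_job][Enter] Exporting %d dashboards.", 3)
try:
    raise RuntimeError("simulated failure")
except RuntimeError:
    # exception() logs at ERROR level and attaches the current traceback.
    log.exception("[export_job][Failure] Export failed.")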

464
superset_tool/utils/network.py Normal file → Executable file
View File

@@ -1,232 +1,232 @@
# [DEF:superset_tool.utils.network:Module]
#
# @SEMANTICS: network, http, client, api, requests, session, authentication
# @PURPOSE: Инкапсулирует низкоуровневую HTTP-логику для взаимодействия с Superset API, включая аутентификацию, управление сессией, retry-логику и обработку ошибок.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> superset_tool.exceptions
# @RELATION: DEPENDS_ON -> superset_tool.utils.logger
# @RELATION: DEPENDS_ON -> requests
# @PUBLIC_API: APIClient
# [SECTION: IMPORTS]
from typing import Optional, Dict, Any, List, Union, cast
import json
import io
from pathlib import Path
import requests
from requests.adapters import HTTPAdapter
import urllib3
from urllib3.util.retry import Retry
from superset_tool.exceptions import AuthenticationError, NetworkError, DashboardNotFoundError, SupersetAPIError, PermissionDeniedError
from superset_tool.utils.logger import SupersetLogger
# [/SECTION]
# [DEF:APIClient:Class]
# @PURPOSE: Инкапсулирует HTTP-логику для работы с API, включая сессии, аутентификацию, и обработку запросов.
class APIClient:
DEFAULT_TIMEOUT = 30
# [DEF:APIClient.__init__:Function]
# @PURPOSE: Инициализирует API клиент с конфигурацией, сессией и логгером.
# @PARAM: config (Dict[str, Any]) - Конфигурация.
# @PARAM: verify_ssl (bool) - Проверять ли SSL.
# @PARAM: timeout (int) - Таймаут запросов.
# @PARAM: logger (Optional[SupersetLogger]) - Логгер.
def __init__(self, config: Dict[str, Any], verify_ssl: bool = True, timeout: int = DEFAULT_TIMEOUT, logger: Optional[SupersetLogger] = None):
self.logger = logger or SupersetLogger(name="APIClient")
self.logger.info("[APIClient.__init__][Entry] Initializing APIClient.")
self.base_url: str = config.get("base_url", "")
self.auth = config.get("auth")
self.request_settings = {"verify_ssl": verify_ssl, "timeout": timeout}
self.session = self._init_session()
self._tokens: Dict[str, str] = {}
self._authenticated = False
self.logger.info("[APIClient.__init__][Exit] APIClient initialized.")
# [/DEF:APIClient.__init__]
# [DEF:APIClient._init_session:Function]
# @PURPOSE: Создает и настраивает `requests.Session` с retry-логикой.
# @RETURN: requests.Session - Настроенная сессия.
def _init_session(self) -> requests.Session:
session = requests.Session()
retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)
if not self.request_settings["verify_ssl"]:
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
self.logger.warning("[_init_session][State] SSL verification disabled.")
session.verify = self.request_settings["verify_ssl"]
return session
# [/DEF:APIClient._init_session]
# [DEF:APIClient.authenticate:Function]
# @PURPOSE: Выполняет аутентификацию в Superset API и получает access и CSRF токены.
# @POST: `self._tokens` заполнен, `self._authenticated` установлен в `True`.
# @RETURN: Dict[str, str] - Словарь с токенами.
# @THROW: AuthenticationError, NetworkError - при ошибках.
def authenticate(self) -> Dict[str, str]:
self.logger.info("[authenticate][Enter] Authenticating to %s", self.base_url)
try:
login_url = f"{self.base_url}/security/login"
response = self.session.post(login_url, json=self.auth, timeout=self.request_settings["timeout"])
response.raise_for_status()
access_token = response.json()["access_token"]
csrf_url = f"{self.base_url}/security/csrf_token/"
csrf_response = self.session.get(csrf_url, headers={"Authorization": f"Bearer {access_token}"}, timeout=self.request_settings["timeout"])
csrf_response.raise_for_status()
self._tokens = {"access_token": access_token, "csrf_token": csrf_response.json()["result"]}
self._authenticated = True
self.logger.info("[authenticate][Exit] Authenticated successfully.")
return self._tokens
except requests.exceptions.HTTPError as e:
raise AuthenticationError(f"Authentication failed: {e}") from e
except (requests.exceptions.RequestException, KeyError) as e:
raise NetworkError(f"Network or parsing error during authentication: {e}") from e
# [/DEF:APIClient.authenticate]
@property
def headers(self) -> Dict[str, str]:
# [DEF:APIClient.headers:Function]
# @PURPOSE: Возвращает HTTP-заголовки для аутентифицированных запросов.
if not self._authenticated: self.authenticate()
return {
"Authorization": f"Bearer {self._tokens['access_token']}",
"X-CSRFToken": self._tokens.get("csrf_token", ""),
"Referer": self.base_url,
"Content-Type": "application/json"
}
# [/DEF:APIClient.headers]
# [DEF:APIClient.request:Function]
# @PURPOSE: Выполняет универсальный HTTP-запрос к API.
# @RETURN: `requests.Response` если `raw_response=True`, иначе `dict`.
# @THROW: SupersetAPIError, NetworkError и их подклассы.
# @PARAM: method (str) - HTTP метод.
# @PARAM: endpoint (str) - API эндпоинт.
# @PARAM: headers (Optional[Dict]) - Дополнительные заголовки.
# @PARAM: raw_response (bool) - Возвращать ли сырой ответ.
def request(self, method: str, endpoint: str, headers: Optional[Dict] = None, raw_response: bool = False, **kwargs) -> Union[requests.Response, Dict[str, Any]]:
full_url = f"{self.base_url}{endpoint}"
_headers = self.headers.copy()
if headers: _headers.update(headers)
try:
response = self.session.request(method, full_url, headers=_headers, **kwargs)
response.raise_for_status()
return response if raw_response else response.json()
except requests.exceptions.HTTPError as e:
self._handle_http_error(e, endpoint)
except requests.exceptions.RequestException as e:
self._handle_network_error(e, full_url)
# [/DEF:APIClient.request]
# [DEF:APIClient._handle_http_error:Function]
# @PURPOSE: (Helper) Преобразует HTTP ошибки в кастомные исключения.
# @PARAM: e (requests.exceptions.HTTPError) - Ошибка.
# @PARAM: endpoint (str) - Эндпоинт.
def _handle_http_error(self, e: requests.exceptions.HTTPError, endpoint: str):
status_code = e.response.status_code
if status_code == 404: raise DashboardNotFoundError(endpoint) from e
if status_code == 403: raise PermissionDeniedError() from e
if status_code == 401: raise AuthenticationError() from e
raise SupersetAPIError(f"API Error {status_code}: {e.response.text}") from e
# [/DEF:APIClient._handle_http_error]
# [DEF:APIClient._handle_network_error:Function]
# @PURPOSE: (Helper) Преобразует сетевые ошибки в `NetworkError`.
# @PARAM: e (requests.exceptions.RequestException) - Ошибка.
# @PARAM: url (str) - URL.
def _handle_network_error(self, e: requests.exceptions.RequestException, url: str):
if isinstance(e, requests.exceptions.Timeout): msg = "Request timeout"
elif isinstance(e, requests.exceptions.ConnectionError): msg = "Connection error"
else: msg = f"Unknown network error: {e}"
raise NetworkError(msg, url=url) from e
# [/DEF:APIClient._handle_network_error]
# [DEF:APIClient.upload_file:Function]
# @PURPOSE: Загружает файл на сервер через multipart/form-data.
# @RETURN: Ответ API в виде словаря.
# @THROW: SupersetAPIError, NetworkError, TypeError.
# @PARAM: endpoint (str) - Эндпоинт.
# @PARAM: file_info (Dict[str, Any]) - Информация о файле.
# @PARAM: extra_data (Optional[Dict]) - Дополнительные данные.
# @PARAM: timeout (Optional[int]) - Таймаут.
def upload_file(self, endpoint: str, file_info: Dict[str, Any], extra_data: Optional[Dict] = None, timeout: Optional[int] = None) -> Dict:
full_url = f"{self.base_url}{endpoint}"
_headers = self.headers.copy(); _headers.pop('Content-Type', None)
file_obj, file_name, form_field = file_info.get("file_obj"), file_info.get("file_name"), file_info.get("form_field", "file")
files_payload = {}
if isinstance(file_obj, (str, Path)):
with open(file_obj, 'rb') as f:
files_payload = {form_field: (file_name, f.read(), 'application/x-zip-compressed')}
elif isinstance(file_obj, io.BytesIO):
files_payload = {form_field: (file_name, file_obj.getvalue(), 'application/x-zip-compressed')}
else:
raise TypeError(f"Unsupported file_obj type: {type(file_obj)}")
return self._perform_upload(full_url, files_payload, extra_data, _headers, timeout)
# [/DEF:APIClient.upload_file]
# [DEF:APIClient._perform_upload:Function]
# @PURPOSE: (Helper) Выполняет POST запрос с файлом.
# @PARAM: url (str) - URL.
# @PARAM: files (Dict) - Файлы.
# @PARAM: data (Optional[Dict]) - Данные.
# @PARAM: headers (Dict) - Заголовки.
# @PARAM: timeout (Optional[int]) - Таймаут.
# @RETURN: Dict - Ответ.
def _perform_upload(self, url: str, files: Dict, data: Optional[Dict], headers: Dict, timeout: Optional[int]) -> Dict:
try:
response = self.session.post(url, files=files, data=data or {}, headers=headers, timeout=timeout or self.request_settings["timeout"])
response.raise_for_status()
# Добавляем логирование для отладки
if response.status_code == 200:
try:
return response.json()
except Exception as json_e:
self.logger.debug(f"[_perform_upload][Debug] Response is not valid JSON: {response.text[:200]}...")
raise SupersetAPIError(f"API error during upload: Response is not valid JSON: {json_e}") from json_e
return response.json()
except requests.exceptions.HTTPError as e:
raise SupersetAPIError(f"API error during upload: {e.response.text}") from e
except requests.exceptions.RequestException as e:
raise NetworkError(f"Network error during upload: {e}", url=url) from e
# [/DEF:APIClient._perform_upload]
# [DEF:APIClient.fetch_paginated_count:Function]
# @PURPOSE: Получает общее количество элементов для пагинации.
# @PARAM: endpoint (str) - Эндпоинт.
# @PARAM: query_params (Dict) - Параметры запроса.
# @PARAM: count_field (str) - Поле с количеством.
# @RETURN: int - Количество.
def fetch_paginated_count(self, endpoint: str, query_params: Dict, count_field: str = "count") -> int:
response_json = cast(Dict[str, Any], self.request("GET", endpoint, params={"q": json.dumps(query_params)}))
return response_json.get(count_field, 0)
# [/DEF:APIClient.fetch_paginated_count]
# [DEF:APIClient.fetch_paginated_data:Function]
# @PURPOSE: Автоматически собирает данные со всех страниц пагинированного эндпоинта.
# @PARAM: endpoint (str) - Эндпоинт.
# @PARAM: pagination_options (Dict[str, Any]) - Опции пагинации.
# @RETURN: List[Any] - Список данных.
def fetch_paginated_data(self, endpoint: str, pagination_options: Dict[str, Any]) -> List[Any]:
base_query, total_count = pagination_options["base_query"], pagination_options["total_count"]
results_field, page_size = pagination_options["results_field"], base_query.get('page_size')
assert page_size and page_size > 0, "'page_size' must be a positive number."
results = []
for page in range((total_count + page_size - 1) // page_size):
query = {**base_query, 'page': page}
response_json = cast(Dict[str, Any], self.request("GET", endpoint, params={"q": json.dumps(query)}))
results.extend(response_json.get(results_field, []))
return results
# [/DEF:APIClient.fetch_paginated_data]
# [/DEF:APIClient]
# [/DEF:superset_tool.utils.network]
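A hedged usage sketch of `APIClient`: the config keys match what `__init__` reads, while the `/dashboard/` endpoint and the `result` field are illustrative of Superset's list API rather than something this module guarantees.

from superset_tool.utils.network import APIClient
from superset_tool.utils.logger import SupersetLogger

logger = SupersetLogger(name="api_demo")
client = APIClient(
    config={
        "base_url": "https://superset.example.com/api/v1",  # placeholder URL
        "auth": {"provider": "db", "username": "migrate_user",
                 "password": "secret", "refresh": True},
    },
    verify_ssl=False,
    logger=logger,
)

# The first access to client.headers triggers authenticate() lazily.
query = {"page_size": 100}
total = client.fetch_paginated_count("/dashboard/", query_params=query)  # assumed endpoint
rows = client.fetch_paginated_data(
    "/dashboard/",
    {"base_query": query, "total_count": total, "results_field": "result"},
)
logger.info("Fetched %d of %d dashboards.", len(rows), total)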

208
superset_tool/utils/whiptail_fallback.py Normal file → Executable file
View File

@@ -1,104 +1,104 @@
# [DEF:superset_tool.utils.whiptail_fallback:Module]
#
# @SEMANTICS: ui, fallback, console, utility, interactive
# @PURPOSE: Предоставляет плотный консольный UI-fallback для интерактивных диалогов, имитируя `whiptail` для систем, где он недоступен.
# @LAYER: UI
# @PUBLIC_API: menu, checklist, yesno, msgbox, inputbox, gauge
# [SECTION: IMPORTS]
import sys
from typing import List, Tuple, Optional, Any
# [/SECTION]
# [DEF:menu:Function]
# @PURPOSE: Отображает меню выбора и возвращает выбранный элемент.
# @PARAM: title (str) - Заголовок меню.
# @PARAM: prompt (str) - Приглашение к вводу.
# @PARAM: choices (List[str]) - Список вариантов для выбора.
# @RETURN: Tuple[int, Optional[str]] - Кортеж (код возврата, выбранный элемент). rc=0 - успех.
def menu(title: str, prompt: str, choices: List[str], **kwargs) -> Tuple[int, Optional[str]]:
print(f"\n=== {title} ===\n{prompt}")
for idx, item in enumerate(choices, 1):
print(f"{idx}) {item}")
try:
raw = input("\nВведите номер (0 отмена): ").strip()
sel = int(raw)
return (0, choices[sel - 1]) if 0 < sel <= len(choices) else (1, None)
except (ValueError, IndexError):
return 1, None
# [/DEF:menu]
# [DEF:checklist:Function]
# @PURPOSE: Отображает список с возможностью множественного выбора.
# @PARAM: title (str) - Заголовок.
# @PARAM: prompt (str) - Приглашение к вводу.
# @PARAM: options (List[Tuple[str, str]]) - Список кортежей (значение, метка).
# @RETURN: Tuple[int, List[str]] - Кортеж (код возврата, список выбранных значений).
def checklist(title: str, prompt: str, options: List[Tuple[str, str]], **kwargs) -> Tuple[int, List[str]]:
print(f"\n=== {title} ===\n{prompt}")
for idx, (val, label) in enumerate(options, 1):
print(f"{idx}) [{val}] {label}")
raw = input("\nВведите номера через запятую (пустой ввод → отказ): ").strip()
if not raw: return 1, []
try:
indices = {int(x.strip()) for x in raw.split(",") if x.strip()}
selected_values = [options[i - 1][0] for i in indices if 0 < i <= len(options)]
return 0, selected_values
except (ValueError, IndexError):
return 1, []
# [/DEF:checklist]
# [DEF:yesno:Function]
# @PURPOSE: Задает вопрос с ответом да/нет.
# @PARAM: title (str) - Заголовок.
# @PARAM: question (str) - Вопрос для пользователя.
# @RETURN: bool - `True`, если пользователь ответил "да".
def yesno(title: str, question: str, **kwargs) -> bool:
ans = input(f"\n=== {title} ===\n{question} (y/n): ").strip().lower()
return ans in ("y", "yes", "да", "д")
# [/DEF:yesno]
# [DEF:msgbox:Function]
# @PURPOSE: Отображает информационное сообщение.
# @PARAM: title (str) - Заголовок.
# @PARAM: msg (str) - Текст сообщения.
def msgbox(title: str, msg: str, **kwargs) -> None:
print(f"\n=== {title} ===\n{msg}\n")
# [/DEF:msgbox]
# [DEF:inputbox:Function]
# @PURPOSE: Запрашивает у пользователя текстовый ввод.
# @PARAM: title (str) - Заголовок.
# @PARAM: prompt (str) - Приглашение к вводу.
# @RETURN: Tuple[int, Optional[str]] - Кортеж (код возврата, введенная строка).
def inputbox(title: str, prompt: str, **kwargs) -> Tuple[int, Optional[str]]:
print(f"\n=== {title} ===")
val = input(f"{prompt}\n")
return (0, val) if val else (1, None)
# [/DEF:inputbox]
# [DEF:_ConsoleGauge:Class]
# @PURPOSE: Контекстный менеджер для имитации `whiptail gauge` в консоли.
class _ConsoleGauge:
def __init__(self, title: str, **kwargs):
self.title = title
def __enter__(self):
print(f"\n=== {self.title} ===")
return self
def __exit__(self, exc_type, exc_val, exc_tb):
sys.stdout.write("\n"); sys.stdout.flush()
def set_text(self, txt: str) -> None:
sys.stdout.write(f"\r{txt} "); sys.stdout.flush()
def set_percent(self, percent: int) -> None:
sys.stdout.write(f"{percent}%"); sys.stdout.flush()
# [/DEF:_ConsoleGauge]
# [DEF:gauge:Function]
# @PURPOSE: Создает и возвращает экземпляр `_ConsoleGauge`.
# @PARAM: title (str) - Заголовок для индикатора прогресса.
# @RETURN: _ConsoleGauge - Экземпляр контекстного менеджера.
def gauge(title: str, **kwargs) -> _ConsoleGauge:
return _ConsoleGauge(title, **kwargs)
# [/DEF:gauge]
# [/DEF:superset_tool.utils.whiptail_fallback]
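A usage sketch of the fallback dialogs, wiring the pieces together the way an interactive migration script might (the environment names here are examples only):

from superset_tool.utils import whiptail_fallback as wt

rc, env = wt.menu("Migration", "Choose target environment:", ["dev", "prod", "sbx"])
if rc == 0 and wt.yesno("Confirm", f"Migrate dashboards to {env}?"):
    with wt.gauge("Uploading") as progress:
        progress.set_text("dashboard_1.zip ")
        progress.set_percent(50)
    wt.msgbox("Done", "Migration finished.")
else:
    wt.msgbox("Cancelled", "No environment selected.")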

126
test_update_yamls.py Normal file → Executable file
View File

@@ -1,63 +1,63 @@
# [DEF:test_update_yamls:Module]
#
# @SEMANTICS: test, yaml, update, script
# @PURPOSE: Test script to verify update_yamls behavior.
# @LAYER: Test
# @RELATION: DEPENDS_ON -> superset_tool.utils.fileio
# @PUBLIC_API: main
# [SECTION: IMPORTS]
import tempfile
import os
from pathlib import Path
import yaml
from superset_tool.utils.fileio import update_yamls
# [/SECTION]
# [DEF:main:Function]
# @PURPOSE: Main test function.
# @RELATION: CALLS -> update_yamls
def main():
# Create a temporary directory structure
with tempfile.TemporaryDirectory() as tmpdir:
tmp_path = Path(tmpdir)
# Create a mock dashboard directory structure
dash_dir = tmp_path / "dashboard"
dash_dir.mkdir()
# Create a mock metadata.yaml file
metadata_file = dash_dir / "metadata.yaml"
metadata_content = {
"dashboard_uuid": "12345",
"database_name": "Prod Clickhouse",
"slug": "test-dashboard"
}
with open(metadata_file, 'w') as f:
yaml.dump(metadata_content, f)
print("Original metadata.yaml:")
with open(metadata_file, 'r') as f:
print(f.read())
# Test update_yamls
db_configs = [
{
"old": {"database_name": "Prod Clickhouse"},
"new": {"database_name": "DEV Clickhouse"}
}
]
update_yamls(db_configs=db_configs, path=str(dash_dir))
print("\nAfter update_yamls:")
with open(metadata_file, 'r') as f:
print(f.read())
print("Test completed.")
# [/DEF:main]
if __name__ == "__main__":
main()
# [/DEF:test_update_yamls]
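The script above only prints the file before and after, so nothing fails if the substitution stops working. A pytest-style sketch is given below; it assumes, as the sample data implies, that `update_yamls` rewrites matching `database_name` values inside YAML files under `path`:

from pathlib import Path
import yaml
from superset_tool.utils.fileio import update_yamls

def test_update_yamls_replaces_database_name(tmp_path: Path) -> None:
    dash_dir = tmp_path / "dashboard"
    dash_dir.mkdir()
    metadata_file = dash_dir / "metadata.yaml"
    metadata_file.write_text(
        yaml.dump({"database_name": "Prod Clickhouse", "slug": "test-dashboard"}),
        encoding="utf-8",
    )

    update_yamls(
        db_configs=[{"old": {"database_name": "Prod Clickhouse"},
                     "new": {"database_name": "DEV Clickhouse"}}],
        path=str(dash_dir),
    )

    updated = yaml.safe_load(metadata_file.read_text(encoding="utf-8"))
    # Assumed behaviour: the old database_name is replaced in place.
    assert updated["database_name"] == "DEV Clickhouse"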