Models¶
This page documents the core database models in Sapari. All models use SQLAlchemy with async support and follow a consistent pattern with UUIDs, timestamps, and soft deletion.
Core Entities¶
Sapari has six main entities that form the video editing workflow:
erDiagram
User ||--o{ Project : owns
User ||--o{ UserAsset : uploads
User ||--o{ AssetGroup : creates
Project ||--o{ Clip : contains
Project ||--o{ Edit : has
Project ||--o{ Draft : saves
Project ||--o{ Export : renders
Clip }o--|| ClipFile : references
Draft }o--o{ Edit : overrides
Export }o--o| Draft : "based on"
UserAsset }o--|| AssetFile : references
UserAsset }o--o{ AssetGroup : "belongs to (many-to-many)"
User¶
class User:
id: int
name: str # 2-30 chars
username: str # 2-20 chars, lowercase alphanumeric, unique
email: str # unique
hashed_password: str
profile_image_url: str | None
tier_id: int | None # FK to Tier
is_superuser: bool
email_verified: bool # Must be True for password users to login
google_id: str | None # Google OAuth user ID (unique)
github_id: str | None # GitHub OAuth user ID (unique)
oauth_provider: str | None # "google" or "github"
stripe_customer_id: str | None # Stripe customer reference
storage_used_bytes: int # Cached counter of upload storage used (default 0)
onboarding_seen: dict | None # Tour keys dismissed, e.g. {"desktop_pipeline": true} (JSON, nullable)
Auth flow: Password-based users start with email_verified=False. A verification email is sent on signup. Login returns 403 until verified. OAuth users (Google/GitHub) get email_verified set from the provider and skip the verification step.
Storage quota: storage_used_bytes is a cached counter, incremented on upload confirm (clip/asset) and decremented on delete (non-YouTube files only, on the last reference). A daily reconciliation cron (reconcile_storage_usage, 3 AM) corrects drift. The quota is checked at presign time; exceeding the tier quota returns 422.
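The presign-time quota check can be sketched as follows. This is a minimal illustration: the function name, the exception type, and the mapping to HTTP 422 are assumptions, not the actual service code.

```python
class QuotaExceededError(Exception):
    """Hypothetical error type; assumed to map to an HTTP 422 response."""

def check_storage_quota(storage_used_bytes: int,
                        incoming_bytes: int,
                        tier_quota_bytes: int) -> None:
    # Uses the cached counter on User. Small drift is tolerated because
    # the daily reconcile_storage_usage cron corrects it.
    if storage_used_bytes + incoming_bytes > tier_quota_bytes:
        raise QuotaExceededError(
            f"upload of {incoming_bytes} bytes would exceed the tier quota"
        )

# An upload that fits within the quota passes silently.
check_storage_quota(storage_used_bytes=900, incoming_bytes=50, tier_quota_bytes=1000)
```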
Project¶
A project is the top-level container for a video editing session. Users create projects, upload clips to them, and export edited videos.
class Project:
uuid: UUID
user_id: int
name: str
status: ProjectStatus # created, analyzing, analyzed, rendering, complete, failed
settings: dict # pacing_level, silence_threshold_ms, language, etc.
transcript: str | None # Full transcript text (copy from active run)
active_run_id: UUID | None # Currently active AnalysisRun
error_message: str | None
Status Flow:
stateDiagram-v2
[*] --> CREATED
CREATED --> ANALYZING: Trigger analysis
ANALYZING --> ANALYZED: Analysis complete
ANALYZING --> FAILED: Analysis error
ANALYZED --> RENDERING: Trigger render
RENDERING --> COMPLETE: Render complete
RENDERING --> FAILED: Render error
COMPLETE --> [*]
FAILED --> ANALYZING: Retry analysis
FAILED --> RENDERING: Retry render
Clip¶
A clip is a video segment within a project. Projects can have multiple clips that get concatenated during render.
class Clip:
uuid: UUID
project_id: UUID
clip_file_id: UUID # Reference to shared ClipFile
display_order: int # Position in timeline
status: ClipStatus # pending, processing, ready, failed
no_trim: bool # Skip silence removal for this clip
no_assets: bool # Skip asset overlay
no_subtitles: bool # Skip subtitle generation
ClipFile¶
ClipFile represents the actual media file, separate from Clip. This allows the same video file to be used in multiple projects without re-downloading or re-processing.
class ClipFile:
uuid: UUID
youtube_video_id: str | None # For metadata display (source tracking)
sha256_hash: str | None # For integrity verification
storage_key: str # R2 path to original video
audio_key: str | None # R2 path to extracted audio
proxy_key: str | None # R2 path to web-compatible proxy
sprite_key: str | None # R2 path to timeline scrub sprite (10x20 grid)
sprite_seconds_per_tile: int | None # Density chosen at sprite generation time
waveform_json: list[float] # Peak amplitudes for timeline
duration_ms: int
status: ClipFileStatus # pending, processing, ready, failed
Each YouTube import creates a new ClipFile (per-user storage, no deduplication). The youtube_video_id field is kept for metadata and source tracking only.
flowchart TB
A[Import YouTube URL] --> B[Download video]
B --> C[Create new ClipFile]
C --> D[Create Clip referencing ClipFile]
Edit¶
An edit represents a region to modify in the video: a silence, false start, or profanity region detected by analysis; a manual edit created by the user; or a keep region marking content to preserve.
class Edit:
uuid: UUID
project_id: UUID
type: EditType # silence, false_start, profanity, manual, asset, keep
action: EditAction # cut (remove video+audio), mute (silence audio), insert (asset), keep (preserve region)
start_ms: int # Edit start time (ms into combined project timeline)
end_ms: int # Edit end time (enforced end_ms <= SUM(clip_file.duration_ms) at service layer)
active: bool # Whether to apply this edit
confidence: float # Detection confidence (0-1)
reason: str # Full explanation (for debugging/logs)
reason_tag: str # Short tag for UI display (e.g., "Word gap", "Serial repetition")
# Asset fields (only when type='asset')
asset_file_id: UUID | None # FK to UserAsset
insert_source: InsertSource # fixed, ai_directed, manual
fixed_position: FixedPosition # intro, outro, watermark, background_audio
visual_mode: VisualMode # replace, overlay, insert, none
audio_mode: AudioMode # original_only, asset_only, mix, none
overlay_position: OverlayPosition # 9-position grid (top_left, center, etc.)
overlay_size_percent: int # Overlay size as % of video (1-100)
overlay_opacity_percent: int # Overlay opacity (10-100)
overlay_x: float # Custom X position (0-1, center-anchored)
overlay_y: float # Custom Y position (0-1, center-anchored)
overlay_flip_h: bool # Flip overlay horizontally
overlay_flip_v: bool # Flip overlay vertically
overlay_rotation_deg: int # Rotation in degrees (0-360)
audio_volume_percent: int # Audio volume (0-100)
audio_duck_main: bool # Duck main audio during asset
asset_offset_ms: int # Offset into asset source (ms)
EditAction determines how the edit is applied:
| Action | Video | Audio | Use Case |
|---|---|---|---|
| CUT | Removed | Removed | Silence, false starts; skip entirely |
| MUTE | Keeps playing | Silenced/bleeped | Profanity; video continues, audio censored |
The action field is set by the backend during analysis - the frontend just uses it without needing to know the business logic. Profanity edits use MUTE, all others use CUT.
The reason field contains the full LLM explanation and is useful for debugging. The reason_tag is a short, formatted tag shown in the UI (converted from snake_case to "Sentence case" by the API).
Users can toggle active to include/exclude specific edits before rendering. Each edit belongs to the AnalysisRun that created it (including manual cuts, which belong to the active run at creation time).
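A hedged sketch of the two behaviors described above, the cut/mute decision and the reason_tag formatting, using the enum values from the model comments. The function names are illustrative, not the actual backend helpers, and asset/keep edits (which carry their own actions) are out of scope here.

```python
from enum import Enum

class EditType(str, Enum):
    SILENCE = "silence"
    FALSE_START = "false_start"
    PROFANITY = "profanity"
    MANUAL = "manual"

class EditAction(str, Enum):
    CUT = "cut"
    MUTE = "mute"

def action_for(edit_type: EditType) -> EditAction:
    # Profanity keeps the video playing with censored audio;
    # silence, false starts, and manual edits are cut outright.
    return EditAction.MUTE if edit_type is EditType.PROFANITY else EditAction.CUT

def format_reason_tag(tag: str) -> str:
    # snake_case -> "Sentence case", as the API does for UI display.
    return tag.replace("_", " ").capitalize()
```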
AnalysisRun¶
Tracks a single analysis pipeline execution. Each re-analysis creates a new run — old runs' edits/captions/transcript are preserved.
class AnalysisRun:
uuid: UUID
project_id: UUID
user_id: int
status: AnalysisRunStatus # pending, running, completed, failed
pacing_level: int # Settings used for this run
false_start_sensitivity: int
language: str | None
transcript: str | None # Owned by this run, copied to project when active
transcript_words: list[dict] | None
edit_count: int # Snapshot counts from analysis time
silence_count: int
false_start_count: int
credits_charged: int
duration_ms: int | None
Run switching: project.active_run_id points to the current run. Edit, caption, and draft queries filter by this. POST /analysis-runs/{run_uuid}/activate switches runs — copies transcript to project, invalidates frontend caches.
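The activation step can be sketched in-memory like this; plain dicts stand in for the ORM models, and cache invalidation is omitted.

```python
def activate_run(project: dict, run: dict) -> None:
    """Sketch of POST /analysis-runs/{run_uuid}/activate: repoint the
    project at the run and copy its transcript across."""
    if run["project_id"] != project["uuid"]:
        raise ValueError("run does not belong to this project")
    project["active_run_id"] = run["uuid"]
    project["transcript"] = run["transcript"]

project = {"uuid": "p1", "active_run_id": "r1", "transcript": "old text"}
new_run = {"uuid": "r2", "project_id": "p1", "transcript": "new text"}
activate_run(project, new_run)
```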
Draft¶
A draft saves a specific configuration of edits for a project. Think of it as a "save state" that users can restore later.
class Draft:
uuid: UUID
project_id: UUID
name: str
edit_overrides: dict[str, EditOverride] # Per-edit adjustments
export_settings: dict # Resolution, aspect ratio, etc.
is_default: bool # Auto-load this draft
The edit_overrides field stores adjustments to individual edits without modifying the original Edit records:
flowchart LR
subgraph Original["Original Edit Records"]
E1["Edit 1: 1000-2500ms, active"]
E2["Edit 2: 5000-6200ms, active"]
end
subgraph Draft["Draft Overrides"]
O1["Edit 1: start_ms=1200"]
O2["Edit 2: active=false"]
end
subgraph Result["Applied State"]
R1["Edit 1: 1200-2500ms, active"]
R2["Edit 2: 5000-6200ms, inactive"]
end
E1 --> O1 --> R1
E2 --> O2 --> R2
class EditOverride:
active: bool # Override active state
start_ms: int | None # Override start time
end_ms: int | None # Override end time
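Applying overrides on read can be sketched as a non-destructive merge. Field names follow EditOverride above; the helper itself is illustrative.

```python
def apply_overrides(edits: list[dict], overrides: dict[str, dict]) -> list[dict]:
    """Merge draft edit_overrides onto edits without mutating the originals."""
    applied = []
    for edit in edits:
        merged = dict(edit)
        override = overrides.get(str(edit["uuid"]), {})
        for field in ("active", "start_ms", "end_ms"):
            if override.get(field) is not None:
                merged[field] = override[field]
        applied.append(merged)
    return applied

edits = [
    {"uuid": "e1", "start_ms": 1000, "end_ms": 2500, "active": True},
    {"uuid": "e2", "start_ms": 5000, "end_ms": 6200, "active": True},
]
overrides = {"e1": {"start_ms": 1200}, "e2": {"active": False}}
applied = apply_overrides(edits, overrides)
```

This mirrors the diagram: Edit 1 becomes 1200-2500ms, Edit 2 is deactivated, and the stored Edit records stay unchanged.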
Export¶
An export is a rendered video. Each export captures a snapshot of the edits at render time.
class Export:
uuid: UUID
project_id: UUID
draft_id: UUID | None # Source draft (optional)
name: str
edit_snapshot: dict # Frozen copy of active edits
settings_snapshot: dict # Frozen export settings
storage_key: str | None # R2 path to rendered video
status: ExportStatus # pending, processing, complete, failed
duration_ms: int | None # Final video duration
file_size_bytes: int | None
The snapshots mean you can keep editing while a render is in progress - it uses the frozen state.
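The freeze can be sketched as a deep copy of the active edits at render time. This is a simplified illustration of edit_snapshot; the real snapshot is stored as JSON on the Export row.

```python
import copy

def snapshot_edits(edits: list[dict]) -> list[dict]:
    # Deep-copy the active edits so later changes to the project's
    # edits cannot affect an in-progress render.
    return copy.deepcopy([e for e in edits if e["active"]])

edits = [
    {"uuid": "e1", "start_ms": 1000, "end_ms": 2500, "active": True},
    {"uuid": "e2", "start_ms": 5000, "end_ms": 6200, "active": False},
]
frozen = snapshot_edits(edits)
edits[0]["start_ms"] = 999  # keep editing while the render runs
```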
Supporting Entities¶
UserAsset & AssetFile¶
User-uploaded media assets (images, videos, audio) for overlays and b-roll:
class AssetFile:
uuid: UUID
storage_key: str | None # R2 path to file
original_filename: str
content_type: str
size_bytes: int | None
status: AssetFileStatus # pending, uploaded, failed
youtube_video_id: str | None # For metadata display (source tracking)
duration_ms: int | None # Video/audio duration
waveform_json: list[float] | None # Peak amplitudes for timeline visualization
metadata_json: dict | None # Additional metadata (title, uploader, etc.)
class UserAsset:
uuid: UUID
user_id: int
asset_file_id: UUID # FK with CASCADE delete
display_name: str
tags: list[str]
# Note: Groups via UserAssetGroupMembership (many-to-many)
Each asset import creates a new AssetFile (per-user storage, no deduplication). Deleting a UserAsset cascades to delete the underlying AssetFile and storage object. The youtube_video_id field is kept for metadata display only.
AssetGroup¶
Organizational folders for assets with AI instructions:
class AssetGroup:
uuid: UUID
user_id: int
name: str
description: str | None
default_instructions: str | None # AI hint for when to use assets
is_default: bool # Auto-include in projects
is_pinned: bool # Show at top of list
display_order: int
UserAssetGroupMembership¶
Junction table for many-to-many asset-group relationships:
class UserAssetGroupMembership:
user_asset_id: UUID # FK to UserAsset (CASCADE delete)
group_id: UUID # FK to AssetGroup (CASCADE delete)
added_at: datetime
This enables assets to belong to multiple groups simultaneously. "Copy to group" creates a membership link rather than duplicating the asset.
flowchart TB
A[Asset: Logo.png] --> M1[Membership]
A --> M2[Membership]
M1 --> G1[Brand Assets]
M2 --> G2[Intro Templates]
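"Copy to group" can be sketched as inserting a membership row. Here a set of (asset, group) pairs stands in for the junction table, and the uniqueness check is an assumption about its constraints.

```python
def copy_to_group(memberships: set[tuple[str, str]],
                  asset_uuid: str, group_uuid: str) -> bool:
    """Link an asset into a group without duplicating the asset.
    Returns True if a new membership was created."""
    key = (asset_uuid, group_uuid)
    if key in memberships:
        return False  # already a member; nothing to duplicate
    memberships.add(key)
    return True

memberships: set[tuple[str, str]] = set()
copy_to_group(memberships, "logo.png", "brand-assets")
copy_to_group(memberships, "logo.png", "intro-templates")
```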
CaptionLine¶
Editable caption/subtitle lines generated from transcript:
class CaptionLine:
uuid: UUID
project_id: UUID # FK to Project
sequence: int # Order in transcript
text: str # Editable caption text
original_text: str # Original from transcript
start_ms: int # Start time
end_ms: int # End time
Caption lines are generated from transcript words with configurable max_words per line (3-5 for vertical video, 7-10 for horizontal). Users can edit text while original_text preserves the original for comparison.
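Line generation can be sketched as chunking the word list. This is a simplified illustration; the word dict shape (text/start_ms/end_ms) is an assumption based on transcript_words.

```python
def build_caption_lines(words: list[dict], max_words: int) -> list[dict]:
    """Group transcript words into caption lines of at most max_words each."""
    lines = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["text"] for w in chunk)
        lines.append({
            "sequence": len(lines),
            "text": text,            # editable copy
            "original_text": text,   # preserved for comparison
            "start_ms": chunk[0]["start_ms"],
            "end_ms": chunk[-1]["end_ms"],
        })
    return lines

words = [{"text": f"w{i}", "start_ms": i * 100, "end_ms": i * 100 + 90}
         for i in range(7)]
lines = build_caption_lines(words, max_words=3)  # vertical-video density
```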
AnalysisPreset¶
Saved analysis settings that users can reuse across projects:
class AnalysisPreset:
uuid: UUID
user_id: int
name: str
is_default: bool # Auto-apply this preset
pacing_level: int # 0-100, silence removal aggressiveness
false_start_sensitivity: int # 0-100, false start detection
language: str | None # ISO code (en, pt-BR) or null for auto
audio_clean: bool # Enable noise reduction + LUFS normalization
audio_censorship: str # 'none', 'mute', 'bleep' - profanity handling
caption_censorship: bool # Replace profanity with asterisks in captions
director_notes: str | None # Free-text AI instructions
Audio Clean Processing:
When audio_clean is enabled, exports include two audio processing stages:
| Stage | FFmpeg Filter | Purpose |
|---|---|---|
| Noise Reduction | afftdn=nf=-25 | Removes background noise (AC, fans, room tone) |
| LUFS Normalization | loudnorm=I=-14:TP=-1.5:LRA=11 | Adjusts loudness to -14 LUFS (YouTube/Spotify standard) |
Noise reduction runs first to avoid amplifying background noise during normalization.
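The resulting filter chain can be sketched as a single string suitable for FFmpeg's -af option. The helper name is illustrative; the filters and their order come from the table above.

```python
def audio_clean_filters(audio_clean: bool) -> str:
    # Denoise first, then normalize, so loudnorm does not
    # amplify background noise it would otherwise boost.
    if not audio_clean:
        return ""
    return "afftdn=nf=-25,loudnorm=I=-14:TP=-1.5:LRA=11"
```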
Audio Censorship Options:
| Mode | Effect |
|---|---|
| none | No audio censorship; profanity plays normally |
| mute | Silence audio during profanity regions |
| bleep | Play 1 kHz tone during profanity regions |
Users can have up to 5 presets. One can be marked as default.
PreviewPreset¶
Saved preview/export styling settings:
class PreviewPreset:
uuid: UUID
user_id: int
name: str
is_default: bool
# Format settings
format: str | None # "W:H" aspect ratio (e.g. "16:9", "9:16", "1:1", "4:3", "5:4"). None = original.
background: str # Letterbox color (#000000)
# Caption styling
caption_style: CaptionStyle # default, minimal, bold
caption_font: str # sans, serif, mono (generic family ids)
caption_size: int # Font size in px (12-144)
caption_position: CaptionPosition # top, center, bottom
caption_color: str # Text color (#FFFFFF)
caption_length: CaptionLength # short, medium, long (words per line)
# Main video transforms
video_flip_h: bool # Flip main video horizontally
video_flip_v: bool # Flip main video vertically
Allows users to save and quickly switch between different export configurations.
format field semantics. Stored as a "W:H" string (e.g. "16:9", "9:16", "1:1", "4:3", "5:4") or None meaning "Original" (keep the source video's native ratio). Validated at the API boundary against FORMAT_RATIO_PATTERN = r"^[1-9]\d{0,1}:[1-9]\d{0,1}$" in modules/preview_preset/constants.py. The pattern rejects zero sides (prevents a 0:0 reaching any future downstream consumer that would divide by it) and caps each side at 99 so stored values stay ≤ 5 characters (no log pollution) and within the range of real video aspect ratios.
Was previously a 3-value enum (FormatRatio = {LANDSCAPE, PORTRAIT, SQUARE}). Widened because the mobile FormatPanel lets users pick custom ratios, and saving a preset on a custom ratio would 422. The enum and its PG type formatratio have been removed.
Security boundary. This field is UI metadata only — it round-trips between the DB and React state to restore previewAspectRatio on preset load. It does NOT reach FFmpeg arguments, shell commands, filesystem paths, or any other external consumer. If a future change adds such a consumer, the FORMAT_RATIO_PATTERN validation must be re-evaluated for that context.
Billing & Entitlements¶
Payment¶
Tracks Stripe Checkout Sessions and payment records.
| Field | Type | Description |
|---|---|---|
| user_id | FK → User | User who made the payment |
| price_id | FK → Price | Price being purchased |
| payment_type | Enum | ONE_TIME or SUBSCRIPTION |
| status | Enum | PENDING, SUCCEEDED, FAILED |
| amount | int | Amount in cents |
| stripe_checkout_session_id | str | Stripe Checkout Session ID |
| stripe_payment_intent_id | str | Stripe Payment Intent ID (used for chargeback matching) |
| stripe_customer_id | str | Stripe Customer ID |
| stripe_subscription_id | str | For subscription payments |
UserEntitlement¶
Flexible access control — grants credits, tier access, or features to users.
| Field | Type | Description |
|---|---|---|
| user_id | FK → User | User who owns this entitlement |
| entitlement_type | Enum | SUBSCRIPTION, CREDIT_GRANT, TIER_ACCESS, FEATURE_UNLOCK |
| grant_reason | Enum | PURCHASE, SUBSCRIPTION, TRIAL, BONUS, etc. |
| tier_id | FK → Tier | For TIER_ACCESS entitlements |
| credit_type | str | For CREDIT_GRANT (e.g., "ai_minutes") |
| quantity_granted | int | Total credits granted |
| quantity_used | int | Credits consumed |
| consumption_type | Enum | NONE, DECREMENTAL, RENEWABLE |
| expires_at | datetime | When entitlement expires (null = permanent) |
| status | Enum | ACTIVE, INACTIVE, EXPIRED, SUSPENDED |
EntitlementTransaction¶
Authoritative append-only ledger for all credit events. quantity_used on UserEntitlement is a denormalized cache rebuildable from this table via rebuild_user_balance().
| Field | Type | Description |
|---|---|---|
| user_id | FK → User | User |
| entitlement_id | FK → UserEntitlement | Which entitlement was affected |
| credit_type | Enum | AI_MINUTES, API_CALLS, etc. |
| amount | int | Positive for grants, negative for usage, 0 for resets |
| transaction_type | Enum | PURCHASE, GRANT, USAGE, RESET, TRIAL, BETA, REFUND |
| balance_before | int | Per-entitlement balance before transaction |
| balance_after | int | Per-entitlement balance after transaction |
| period | str | Period start date for RESET transactions (e.g., "2026-04-05") |
| pool_identifier | Enum | Credit pool (subscription_allowance, beta_program, etc.) |
| transaction_metadata | JSON | Extra context (e.g., {"usage_before_reset": 75}) |
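A hedged sketch of rebuilding the cached quantity_used for one entitlement from its ledger rows. The treatment of RESET (zeroing the period's usage) is inferred from the table above, and the function name mirrors, but is not, the real rebuild_user_balance().

```python
def rebuild_quantity_used(transactions: list[dict]) -> int:
    """Recompute the denormalized quantity_used cache from the append-only
    ledger. USAGE rows carry negative amounts; RESET rows carry 0."""
    used = 0
    for tx in transactions:  # assumed to be in chronological order
        if tx["transaction_type"] == "USAGE":
            used += -tx["amount"]
        elif tx["transaction_type"] == "RESET":
            used = 0
    return used

ledger = [
    {"transaction_type": "GRANT", "amount": 100},
    {"transaction_type": "USAGE", "amount": -30},
    {"transaction_type": "USAGE", "amount": -45},
]
```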
Common Patterns¶
UUIDs Everywhere¶
All entities use UUIDs as public identifiers. Internal integer IDs exist but are never exposed via the API:
# Good - use UUID in API responses
{"uuid": "550e8400-e29b-41d4-a716-446655440000", ...}
# Bad - never expose internal IDs
{"id": 42, ...}
Soft Deletion¶
Entities aren't truly deleted. They're marked with deleted_at and is_deleted, and queries filter out deleted records by default.
Timestamps¶
All entities track creation and update times.
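A minimal sketch, assuming the conventional created_at/updated_at column names; the actual mixin and any DB-side onupdate defaults are not shown here.

```python
from datetime import datetime, timezone

class TimestampMixin:
    def __init__(self) -> None:
        now = datetime.now(timezone.utc)
        self.created_at = now
        self.updated_at = now

    def touch(self) -> None:
        # In SQLAlchemy this would typically be an onupdate= column default.
        self.updated_at = datetime.now(timezone.utc)
```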
Key Files¶
| Component | Location |
|---|---|
| Project model | backend/src/modules/project/models.py |
| Clip model | backend/src/modules/clip/models.py |
| Edit model | backend/src/modules/edit/models.py |
| Export model | backend/src/modules/export/models.py |
| Draft model | backend/src/modules/draft/models.py |
| Asset models | backend/src/modules/asset/models.py |
| CaptionLine model | backend/src/modules/caption_line/models.py |
| AnalysisRun model | backend/src/modules/analysis_run/models.py |
| AnalysisPreset model | backend/src/modules/preset/models.py |
| PreviewPreset model | backend/src/modules/preview_preset/models.py |
| Common schemas | backend/src/modules/common/schemas.py |