Standardized SQLite storage that all media analysis plugins should use. Defines the canonical schema for the analysis_jobs table with file hashing for traceability and config-based caching.
Schema:
CREATE TABLE IF NOT EXISTS analysis_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_path TEXT NOT NULL,
    file_hash TEXT NOT NULL,
    config_hash TEXT NOT NULL,
    ranges JSON,
    metadata JSON,
    created_at REAL NOT NULL,
    UNIQUE(file_path, config_hash)
);
The UNIQUE(file_path, config_hash) constraint enables result caching — re-running the same file with the same config replaces the previous result. Different configs for the same file are stored separately.
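A minimal sketch of how that constraint yields upsert-style caching, using `sqlite3` directly against the schema above; the `save` helper here is illustrative, not the module's actual implementation:

```python
import sqlite3, json, time

# Set up the analysis_jobs schema from above in an in-memory database.
con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE IF NOT EXISTS analysis_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_path TEXT NOT NULL,
    file_hash TEXT NOT NULL,
    config_hash TEXT NOT NULL,
    ranges JSON,
    metadata JSON,
    created_at REAL NOT NULL,
    UNIQUE(file_path, config_hash)
)""")

def save(file_path, file_hash, config_hash, ranges, metadata):
    # ON CONFLICT on the unique (file_path, config_hash) pair replaces
    # the cached result in place instead of appending a new row.
    con.execute(
        """INSERT INTO analysis_jobs
           (file_path, file_hash, config_hash, ranges, metadata, created_at)
           VALUES (?, ?, ?, ?, ?, ?)
           ON CONFLICT(file_path, config_hash) DO UPDATE SET
             file_hash=excluded.file_hash,
             ranges=excluded.ranges,
             metadata=excluded.metadata,
             created_at=excluded.created_at""",
        (file_path, file_hash, config_hash,
         json.dumps(ranges), json.dumps(metadata), time.time()),
    )

save("/tmp/a.mp3", "sha256:" + "a" * 64, "sha256:" + "c" * 64,
     [{"start": 0.0, "end": 3.0}], {})
save("/tmp/a.mp3", "sha256:" + "a" * 64, "sha256:" + "c" * 64,
     [{"start": 0.0, "end": 5.0}], {})
(count,) = con.execute("SELECT COUNT(*) FROM analysis_jobs").fetchone()
print(count)  # 1 — the second save replaced the first row
```

The `excluded.` prefix refers to the row that failed to insert, which is what makes the conflict clause a clean replace of every cached field.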
MediaAnalysisStorage
def MediaAnalysisStorage(
    db_path: str,  # Absolute path to the SQLite database file
)
Standardized SQLite storage for media analysis results.
Cached: /tmp/test_audio.mp3
Ranges: 2 segments
File hash: sha256:aaaaaaaaaaaaa...
Cache miss for different config: OK
# Save with same file+config replaces (upsert)
storage.save(
    file_path="/tmp/test_audio.mp3",
    file_hash="sha256:" + "a" * 64,
    config_hash="sha256:" + "c" * 64,
    ranges=[{"start": 0.0, "end": 3.0, "label": "speech"}],
    metadata={"segment_count": 1, "total_speech": 3.0},
)
updated = storage.get_cached("/tmp/test_audio.mp3", "sha256:" + "c" * 64)
assert len(updated.ranges) == 1  # Updated to 1 range
assert updated.metadata["segment_count"] == 1

# Only 1 row total (replaced, not appended)
all_jobs = storage.list_jobs()
assert len(all_jobs) == 1
print("Upsert replaced existing row: OK")
Upsert replaced existing row: OK
# Different config for same file creates separate row
storage.save(
    file_path="/tmp/test_audio.mp3",
    file_hash="sha256:" + "a" * 64,
    config_hash="sha256:" + "e" * 64,  # Different config
    ranges=[{"start": 0.5, "end": 2.0, "label": "speech"}],
    metadata={"segment_count": 1},
)
all_jobs = storage.list_jobs()
assert len(all_jobs) == 2
print(f"Two configs for same file: {len(all_jobs)} rows")
A dataclass representing a single row in the standardized processing_jobs table. Tracks input/output file pairs with hashes for full traceability of media transformations.
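A hypothetical sketch of that dataclass's shape; the field names mirror the `processing_jobs` columns below, while the class name and type choices here are illustrative assumptions, not the module's actual definition:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ProcessingJob:  # name is illustrative
    id: int
    job_id: str
    action: str
    input_path: str
    input_hash: str       # "algo:hexdigest" of the source file
    output_path: str
    output_hash: str      # "algo:hexdigest" of the produced file
    parameters: dict[str, Any]   # JSON column, decoded
    metadata: dict[str, Any]     # JSON column, decoded
    created_at: float

job = ProcessingJob(
    id=1, job_id="demo-job", action="transcode",
    input_path="/tmp/in.wav", input_hash="sha256:" + "a" * 64,
    output_path="/tmp/out.mp3", output_hash="sha256:" + "b" * 64,
    parameters={"bitrate": "192k"}, metadata={}, created_at=0.0,
)
```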
Standardized SQLite storage that all media processing plugins should use. Defines the canonical schema for the processing_jobs table, tracking input/output file pairs with content hashes.
Schema:
CREATE TABLE IF NOT EXISTS processing_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT UNIQUE NOT NULL,
    action TEXT NOT NULL,
    input_path TEXT NOT NULL,
    input_hash TEXT NOT NULL,
    output_path TEXT NOT NULL,
    output_hash TEXT NOT NULL,
    parameters JSON,
    metadata JSON,
    created_at REAL NOT NULL
);
Both input_hash and output_hash use the self-describing "algo:hexdigest" format, enabling verification of both source integrity (“is this the same file we converted?”) and output integrity (“has the output been modified since conversion?”).
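A minimal sketch of producing and checking the "algo:hexdigest" format with `hashlib`; the `hash_file`/`verify_file` helper names are illustrative, not part of the documented API:

```python
import hashlib, os, tempfile

def hash_file(path: str, algo: str = "sha256") -> str:
    # Stream the file in 1 MiB chunks so large media files don't
    # need to fit in memory.
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return f"{algo}:{h.hexdigest()}"

def verify_file(path: str, expected: str) -> bool:
    # The algorithm prefix is self-describing, so verification
    # needs no out-of-band knowledge of which hash was used.
    algo = expected.split(":", 1)[0]
    return hash_file(path, algo) == expected

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"converted media bytes")
    path = f.name

digest = hash_file(path)
ok = verify_file(path, digest)  # True while the output is unmodified
os.unlink(path)
```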
MediaProcessingStorage
def MediaProcessingStorage(
    db_path: str,  # Absolute path to the SQLite database file
)
Standardized SQLite storage for media processing results.
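An illustrative sketch of recording and retrieving one processing job against the `processing_jobs` schema above, using `sqlite3` directly rather than the `MediaProcessingStorage` API; paths, action name, and parameters here are made up for the example:

```python
import sqlite3, json, time, uuid

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE IF NOT EXISTS processing_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id TEXT UNIQUE NOT NULL,
    action TEXT NOT NULL,
    input_path TEXT NOT NULL,
    input_hash TEXT NOT NULL,
    output_path TEXT NOT NULL,
    output_hash TEXT NOT NULL,
    parameters JSON,
    metadata JSON,
    created_at REAL NOT NULL
)""")

# Record one input/output pair with content hashes for traceability.
job_id = str(uuid.uuid4())
con.execute(
    "INSERT INTO processing_jobs "
    "(job_id, action, input_path, input_hash, output_path, output_hash, "
    " parameters, metadata, created_at) VALUES (?,?,?,?,?,?,?,?,?)",
    (job_id, "transcode", "/tmp/in.wav", "sha256:" + "a" * 64,
     "/tmp/out.mp3", "sha256:" + "b" * 64,
     json.dumps({"bitrate": "192k"}), json.dumps({}), time.time()),
)

# job_id is UNIQUE, so lookup by it returns at most one row.
row = con.execute(
    "SELECT action, input_hash, output_hash "
    "FROM processing_jobs WHERE job_id = ?",
    (job_id,),
).fetchone()
print(row[0])  # transcode
```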