cjm-transcript-source-select
Install
pip install cjm_transcript_source_select
Project Structure
nbs/
├── components/ (6)
│ ├── helpers.ipynb # Shared helper functions for the selection module
│ ├── local_files.ipynb # Local files browser for importing external .db files
│ ├── preview_panel.ipynb # Collapsible preview panel for displaying selected content
│ ├── selection_queue.ipynb # Selection queue component with drag-drop reordering
│ ├── source_browser.ipynb # Source browser components for displaying and filtering transcription sources
│ └── step_renderer.ipynb # Phase 1 step renderer: Source Selection & Ordering with two-column layout and collapsible preview
├── routes/ (7)
│ ├── core.ipynb # Selection step state management helpers
│ ├── filtering.ipynb # Filtering, grouping, and keyboard navigation route handlers
│ ├── init.ipynb # Router assembly for Phase 1 selection routes
│ ├── local_files.ipynb # Local files browser route handlers
│ ├── queue.ipynb # Selection queue route handlers for Phase 1
│ ├── source_browser.ipynb # Source browser virtual collection router for Phase 1 selection
│ └── tabs.ipynb # Tab switching route handlers
├── services/ (2)
│ ├── source.ipynb # Source service for federated transcription queries via DuckDB
│ └── source_utils.ipynb # Source record operations for metadata extraction, grouping, and validation
├── html_ids.ipynb # HTML ID constants for Phase 1: Source Selection & Ordering
├── models.ipynb # Data models and URL bundles for Phase 1: Source Selection & Ordering
└── utils.ipynb # Display formatting and word counting utilities for the selection step
Total: 18 notebooks across 3 directories
Module Dependencies
graph LR
components_helpers[components.helpers<br/>helpers]
components_local_files[components.local_files<br/>local_files]
components_preview_panel[components.preview_panel<br/>preview_panel]
components_selection_queue[components.selection_queue<br/>selection_queue]
components_source_browser[components.source_browser<br/>source_browser]
components_step_renderer[components.step_renderer<br/>step_renderer]
html_ids[html_ids<br/>html_ids]
models[models<br/>models]
routes_core[routes.core<br/>core]
routes_filtering[routes.filtering<br/>filtering]
routes_init[routes.init<br/>init]
routes_local_files[routes.local_files<br/>local_files]
routes_queue[routes.queue<br/>queue]
routes_source_browser[routes.source_browser<br/>source_browser]
routes_tabs[routes.tabs<br/>tabs]
services_source[services.source<br/>source]
services_source_utils[services.source_utils<br/>source_utils]
utils[utils<br/>utils]
components_helpers --> models
components_local_files --> html_ids
components_local_files --> components_helpers
components_preview_panel --> html_ids
components_source_browser --> utils
components_source_browser --> services_source_utils
components_source_browser --> html_ids
components_step_renderer --> html_ids
components_step_renderer --> components_source_browser
components_step_renderer --> components_local_files
components_step_renderer --> components_selection_queue
components_step_renderer --> models
components_step_renderer --> components_preview_panel
components_step_renderer --> utils
routes_core --> components_step_renderer
routes_core --> models
routes_core --> components_selection_queue
routes_core --> services_source
routes_core --> html_ids
routes_filtering --> services_source_utils
routes_filtering --> routes_core
routes_filtering --> models
routes_filtering --> services_source
routes_init --> routes_queue
routes_init --> routes_core
routes_init --> routes_tabs
routes_init --> routes_local_files
routes_init --> models
routes_init --> services_source
routes_init --> routes_source_browser
routes_init --> routes_filtering
routes_local_files --> components_local_files
routes_local_files --> models
routes_local_files --> services_source
routes_local_files --> routes_core
routes_queue --> services_source_utils
routes_queue --> routes_core
routes_queue --> models
routes_queue --> components_preview_panel
routes_queue --> services_source
routes_source_browser --> html_ids
routes_source_browser --> components_source_browser
routes_source_browser --> routes_core
routes_source_browser --> models
routes_source_browser --> components_preview_panel
routes_source_browser --> services_source
routes_source_browser --> services_source_utils
routes_tabs --> routes_core
routes_tabs --> models
routes_tabs --> components_step_renderer
routes_tabs --> services_source
routes_tabs --> services_source_utils
52 cross-module dependencies detected
CLI Reference
No CLI commands found in this project.
Module Overview
Detailed documentation for each module in the project:
core (core.ipynb)
Selection step state management helpers
Import
from cjm_transcript_source_select.routes.core import (
DEBUG_SELECTION_STATE,
WorkflowStateStore
)
Functions
def _get_step_state(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
session_id: str # Session identifier string
) -> Dict[str, Any]: # Step state dictionary
"Get the selection step state from the workflow state store."
def _find_duplicate_media_source(
source_service: SourceService, # Source service for lookups
record_id: str, # Candidate record ID
provider_id: str, # Candidate provider ID
selected_sources: List[Dict[str, str]], # Current selections
) -> Optional[Dict[str, str]]: # Conflicting source dict or None
"Find an already-selected source that shares the same audio file."
def _render_duplicate_flash(
candidate_row_id: str, # DOM element ID of the candidate row
existing_row_id: Optional[str] = None, # DOM element ID of the conflicting row (None if off-screen)
) -> Div: # OOB Div with flash script
"Render a flash animation on one or two rows to indicate duplicate rejection."
def _get_active_source_tab(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
session_id: str # Session identifier string
) -> str: # Active tab: "db" or "files"
"Get the currently active source tab from workflow state."
def _build_queue_response(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for querying transcriptions
session_id: str, # Session identifier string
selected_sources: List[Dict[str, str]], # Current selected sources after mutation
urls: SelectionUrls, # URL bundle for rendering
include_stats: bool = True, # Include OOB stats swap
include_checkbox_oobs: bool = True, # Include OOB checkbox cells for visible rows
) -> Union[Any, Tuple]: # Single component or tuple of components with OOB swaps
"Build the standard response for queue-mutating handlers."
def _update_step_state(
"Update the selection step state in the workflow state store."
Variables
DEBUG_SELECTION_STATE = False
_rebuild_and_render_ref: list
_sync_items_ref: list
_get_checkbox_oobs_ref: list
_get_checkbox_oob_for_ref: list
_get_vc_row_id_for_ref: list
_activate_toggle_ref: list
filtering (filtering.ipynb)
Filtering, grouping, and keyboard navigation route handlers
Import
from cjm_transcript_source_select.routes.filtering import (
init_filtering_router
)
Functions
def _handle_source_filter(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
search: str, # Search term from input
urls: SelectionUrls, # URL bundle for rendering
): # VC content wrapper (direct swap, not OOB)
"Filter transcription sources by search term."
def _handle_grouping_change(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
grouping_mode: str, # New grouping mode: "media_path" or "batch_id"
urls: SelectionUrls, # URL bundle for rendering
): # VC content wrapper (direct swap, not OOB)
"Change the grouping mode and re-render the VC content."
def _handle_selection_toggle_focused(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Job ID from focused row (via hx-include)
provider_id: str, # Plugin name from focused row (via hx-include)
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats, optionally with OOB source list
"Toggle selection of the focused row (keyboard shortcut handler)."
def _handle_keyboard_reorder(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Record ID of item to move
provider_id: str, # Provider ID of item to move
direction: str, # Direction to move: "up" or "down"
urls: SelectionUrls, # URL bundle for rendering
): # Queue component, optionally with OOB source list
"Move an item up or down in the selection queue via keyboard."
def init_filtering_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
prefix: str, # Route prefix (e.g., "/workflow/selection/filtering")
urls: SelectionUrls, # URL bundle for rendering
) -> Tuple[APIRouter, Dict[str, Callable]]: # (router, route_dict)
"Initialize filtering and keyboard navigation routes."
helpers (helpers.ipynb)
Shared helper functions for the selection module
Import
from cjm_transcript_source_select.components.helpers import *
Functions
def _get_selection_state(
ctx: InteractionContext # Interaction context with state
) -> SelectionStepState: # Typed selection step state
"Get the full selection step state from context."
def _get_selected_sources(
ctx: InteractionContext # Interaction context with state
) -> List[SelectedSource]: # List of selected source dicts
"Get the list of selected sources from step state."
def _get_grouping_mode(
ctx: InteractionContext # Interaction context with state
) -> str: # Grouping mode: "media_path" or "batch_id"
"Get the current grouping mode from step state."
html_ids (html_ids.ipynb)
HTML ID constants for Phase 1: Source Selection & Ordering
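As a rough illustration of how the ID helpers listed in this module might be used, the sketch below builds a row ID from a record and provider identifier and converts it to a CSS selector. `SketchHtmlIds` and the `source-row-...` format are assumptions made for this example; the README does not show the actual ID strings that `SelectionHtmlIds` produces.

```python
class SketchHtmlIds:
    """Hypothetical stand-in for SelectionHtmlIds; the ID format is assumed."""

    @staticmethod
    def as_selector(id_str: str) -> str:
        # Convert an HTML ID to a CSS selector by adding the # prefix.
        return f"#{id_str}"

    @staticmethod
    def source_row(record_id: str, provider_id: str) -> str:
        # Combine both identifiers so rows from different providers do not collide.
        return f"source-row-{provider_id}-{record_id}"


row_id = SketchHtmlIds.source_row("job42", "whisper")
selector = SketchHtmlIds.as_selector(row_id)
```

Deriving IDs from both identifiers in one place lets route handlers target the same element that a component rendered, for example in an out-of-band swap.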
Import
from cjm_transcript_source_select.html_ids import (
SelectionHtmlIds
)
Classes
class SelectionHtmlIds:
"HTML ID constants for Phase 1: Source Selection & Ordering."
def as_selector(
id_str:str # The HTML ID to convert
) -> str: # CSS selector with # prefix
"Convert an ID to a CSS selector format."
def source_checkbox(
record_id:str, # Record identifier
provider_id:str # Provider identifier
) -> str: # HTML ID for the source checkbox
"Generate HTML ID for a source selection checkbox."
def source_row(
record_id:str, # Record identifier
provider_id:str # Provider identifier
) -> str: # HTML ID for the source row
"Generate HTML ID for a source browser row."
def queue_item(
record_id:str, # Record identifier
provider_id:str # Provider identifier
) -> str: # HTML ID for the queue item
"Generate HTML ID for a queue item."
init (init.ipynb)
Router assembly for Phase 1 selection routes
Import
from cjm_transcript_source_select.routes.init import (
init_selection_routers
)
Functions
def init_selection_routers(
state_store: WorkflowStateStore, # The workflow state store
source_service: SourceService, # The source service for queries
workflow_id: str, # The workflow identifier
prefix: str, # Base prefix for selection routes (e.g., "/workflow/selection")
) -> SelectionResult: # Selection router result with routers, urls, routes, and restore
"Initialize and return all selection routers with URL bundle."
local_files (local_files.ipynb)
Local files browser for importing external .db files
Import
from cjm_transcript_source_select.components.local_files import *
Functions
def _get_external_db_paths(
ctx: InteractionContext # Interaction context with state
) -> List[str]: # List of external database paths
"Get the list of external database paths from step state."
def _get_current_browse_path(
ctx: InteractionContext # Interaction context with state
) -> str: # Current browse path
"Get the current browse path from step state."
def _get_file_browser_state(
step_state: Dict[str, Any], # Selection step state dictionary
default_path: Optional[str] = None # Default path if no state exists
) -> BrowserState: # BrowserState for file browser
"Get or create BrowserState from step state."
def _create_db_browser_config() -> FileBrowserConfig: # Configured FileBrowserConfig for .db file selection
"Create file browser config for .db file selection."
def _render_external_sources_list(
external_paths: List[str], # List of added external database paths
remove_url: str, # URL for removing external source
oob: bool = False, # Whether to render as OOB swap
) -> Any: # External sources section component (always rendered for OOB targeting)
"Render the list of added external database sources with scrollable paths."
def _render_error_alert(
error_message: Optional[str] = None, # Error message to display (None = clear)
oob: bool = False, # Whether to render as OOB swap
) -> Any: # Error alert container (always present for OOB targeting)
"Render the error alert container for the local files browser."
def _render_local_files_browser(
render_fn: Optional[Callable] = None, # FileBrowserRouters.render callable
external_paths: Optional[List[str]] = None, # List of added external database paths
remove_url: str = "", # URL for removing external source
error_message: Optional[str] = None, # Error message to display
) -> Any: # Local files browser component
"Render the local files browser for adding external .db files."
local_files (local_files.ipynb)
Local files browser route handlers
Import
from cjm_transcript_source_select.routes.local_files import (
init_local_files_router
)
Functions
def _get_local_files_provider() -> LocalFileSystemProvider:
"Get or create the local files provider singleton."
def _handle_remove_external_source(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for external db ops
sess, # FastHTML session object
db_path: str, # Path to the .db file to remove
external_db_paths_ref: List[str], # Shared external paths list (mutated in place)
fb_routers: FileBrowserRouters, # File browser routers (for targeted OOB)
remove_url: str, # URL for remove button in external sources list
urls: SelectionUrls, # Full URL bundle for queue re-rendering
): # Tuple of OOB elements (external sources list + checkbox cells + queue + stats)
"Remove an external database source and clean up orphaned queue items."
def init_local_files_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for external db ops
prefix: str, # Route prefix (e.g., "/workflow/selection/local_files")
urls: SelectionUrls, # URL bundle for rendering
) -> LocalFilesResult: # Router result with routers, routes, render, restore, and reset
"Initialize local files browser routes with new file browser API."
Variables
_local_files_provider: Optional[LocalFileSystemProvider] = None
models (models.ipynb)
Data models and URL bundles for Phase 1: Source Selection & Ordering
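The URL bundle documented below defaults every field to an empty string, and the queue router notes that the bundle is populated after all routers are created. A minimal sketch of why that works, assuming handlers keep a reference to one shared, mutable bundle: `SketchUrls`, `make_row_renderer`, and the route path are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SketchUrls:
    """Trimmed, hypothetical copy of two SelectionUrls fields."""
    add: str = ''
    toggle: str = ''


def make_row_renderer(urls: SketchUrls):
    """Return a renderer that reads the URL at call time, not at creation time."""
    def render_row(record_id: str) -> str:
        return f'<input type="checkbox" hx-post="{urls.toggle}" value="{record_id}">'
    return render_row


urls = SketchUrls()                                # created empty
render_row = make_row_renderer(urls)               # renderer holds a reference
urls.toggle = "/workflow/selection/queue/toggle"   # filled in later (assumed path)
html = render_row("job42")
```

Because the renderer looks up `urls.toggle` when it runs, routers can be created in any order as long as the bundle is filled in before the first request is served.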
Import
from cjm_transcript_source_select.models import (
SelectionStepState,
SelectionUrls,
LocalFilesResult,
SelectionResult
)
Functions
def _no_op_restore(session_id: str) -> None:
"Default no-op for restore_state."
def _no_op_reset() -> None:
"Default no-op for reset_state."
Classes
class SelectionStepState(TypedDict):
"State for Phase 1: Source Selection & Ordering."
@dataclass
class SelectionUrls:
"URL bundle for Phase 1 selection route handlers and renderers."
add: str = '' # Add source to queue
remove: str = '' # Remove source from queue
toggle: str = '' # Toggle source selection (add/remove based on current state)
reorder: str = '' # Reorder queue items
clear: str = '' # Clear all from queue
select_all: str = '' # Select all in a group
preview: str = '' # Preview source content
toggle_focused: str = '' # Toggle focused row selection
keyboard_reorder: str = '' # Keyboard reorder (Shift+Up/Down)
filter: str = '' # Filter source list
grouping_change: str = '' # Change grouping mode
browse_directory: str = '' # Browse directory
add_external: str = '' # Add external .db source
remove_external: str = '' # Remove external .db source
tab_switch: str = '' # Switch source tabs
@dataclass
class LocalFilesResult:
"Return type from init_local_files_router."
routers: List[APIRouter] # Routers to register (custom + file browser + VC)
routes: Dict[str, Callable] # Named route handlers
render_panel: Callable # (error_message?, session_id?) -> rendered panel
restore_state: Callable = field(...) # (session_id) -> None, restore persisted state
reset_state: Callable = field(...) # () -> None, reset in-memory caches
@dataclass
class SelectionResult:
"Return type from init_selection_routers."
routers: List[APIRouter] # All selection routers to register
urls: 'SelectionUrls' = field(...) # URL bundle
routes: Dict[str, Callable] = field(...) # All named route handlers
render_local_files_panel: Optional[Callable] # Render fn for local files tab
sb_state: Any # SourceBrowserRouterState
restore_state: Callable = field(...) # (session_id) -> None, restore persisted state
reset_state: Callable = field(...) # () -> None, reset in-memory caches
preview_panel (preview_panel.ipynb)
Collapsible preview panel for displaying selected content
Import
from cjm_transcript_source_select.components.preview_panel import *
Functions
def _render_preview_panel(
preview_record_id: Optional[str] = None, # Job ID being previewed
preview_text: Optional[str] = None, # Text content to preview
is_open: bool = False, # Whether the collapse should be open
) -> Any: # Preview panel component (collapsible, full-width)
"Render the collapsible preview panel for displaying selected content."
queue (queue.ipynb)
Selection queue route handlers for Phase 1
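The toggle handler listed below adds a source if it is absent and removes it if it is present. A minimal sketch of that rule as a pure function, assuming selections are dicts keyed by the `(record_id, provider_id)` pair; `toggle_selection` is a hypothetical helper, not a function from this package.

```python
from typing import Dict, List


def toggle_selection(
    selected: List[Dict[str, str]],  # Current queue, in order
    record_id: str,                  # Record to toggle
    provider_id: str,                # Provider that owns the record
) -> List[Dict[str, str]]:           # New queue (input is not mutated)
    """Remove the source if it is queued, otherwise append it to the end."""
    key = (record_id, provider_id)
    remaining = [s for s in selected if (s["record_id"], s["provider_id"]) != key]
    if len(remaining) < len(selected):
        return remaining  # it was present, so the toggle removes it
    return selected + [{"record_id": record_id, "provider_id": provider_id}]


queue: List[Dict[str, str]] = []
queue = toggle_selection(queue, "1", "a")   # add
queue = toggle_selection(queue, "2", "a")   # add
queue = toggle_selection(queue, "1", "a")   # remove
```

Returning a new list keeps queue order stable for the remaining items, which matters because the queue order is what the user arranges in this step.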
Import
from cjm_transcript_source_select.routes.queue import (
init_queue_router
)
Functions
def _handle_selection_toggle(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Job ID to toggle
provider_id: str, # Plugin name for the source
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats (no checkbox OOBs -- checkbox already correct)
"Toggle a source's selection state (add if absent, remove if present)."
def _handle_selection_add(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Job ID to add
provider_id: str, # Plugin name for the source
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats and visible checkbox OOBs
"Add a source to the selection queue."
def _handle_selection_remove(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
key: str, # Item key (record_id) to remove
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats and visible checkbox OOBs
"Remove a source from the selection queue by key."
async def _handle_selection_reorder(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
urls: SelectionUrls, # URL bundle for rendering
): # Updated queue component
"Reorder items in the selection queue based on SortableJS result."
def _handle_selection_clear(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats, optionally with OOB source list
"Clear all items from the selection queue."
def _handle_selection_select_all(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
group_key: str, # Group key to select all transcriptions for
grouping_mode: str, # Current grouping mode: "media_path" or "batch_id"
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats, optionally with OOB source list
"Select all transcriptions for a given group, skipping duplicate audio sources."
def _handle_selection_preview(
source_service: SourceService, # The source service for queries
request, # FastHTML request object
record_id: str, # Job ID to preview
provider_id: str, # Plugin name for the source
): # Full preview panel component (collapsible, open with content)
"Get preview panel for a selected source."
def init_queue_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
prefix: str, # Route prefix (e.g., "/workflow/selection/queue")
urls: SelectionUrls, # URL bundle for rendering (populated after all routers created)
) -> Tuple[APIRouter, Dict[str, Callable]]: # (router, route_dict)
"Initialize queue management routes."
selection_queue (selection_queue.ipynb)
Selection queue component with drag-drop reordering
Import
from cjm_transcript_source_select.components.selection_queue import (
SD_QUEUE_PREFIX,
SD_QUEUE_CONFIG,
SD_QUEUE_IDS
)
Functions
def _render_queue_content(
item: dict, # Source dict with record_id and provider_id
index: int, # 0-based position in queue
) -> Any: # Custom content for the queue item
"Render the job ID display as custom content for each queue item."
def _render_queue_empty() -> Any: # Empty state element
"Render the custom empty state for the source selection queue."
def _render_selection_queue(
selected_sources: List[Dict[str, str]], # List of selected sources in order
remove_url: str, # URL for removing from queue
reorder_url: str, # URL for reordering queue
clear_url: str, # URL for clearing all
) -> Any: # Queue panel component
"Render the selection queue panel via cjm-fasthtml-sortable-queue."
Variables
SD_QUEUE_PREFIX = 'sd'
SD_QUEUE_CONFIG
SD_QUEUE_IDS
source (source.ipynb)
Source service for federated transcription queries via DuckDB
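The service documented below keeps a registry of providers keyed by ID and fans queries out across them. The sketch that follows illustrates only that registry pattern, using in-memory lists in place of DuckDB and database files; `InMemoryProvider`, `SketchSourceService`, and the record keys are assumptions made for this example.

```python
from typing import Any, Dict, List, Optional


class InMemoryProvider:
    """Hypothetical provider that serves records from a list, not a database."""

    def __init__(self, provider_id: str, name: str, records: List[Dict[str, Any]]):
        self.provider_id = provider_id
        self.provider_name = name
        self._records = records

    def query_records(self, limit: int = 100) -> List[Dict[str, Any]]:
        return self._records[:limit]


class SketchSourceService:
    """Registry of providers with a fan-out query, loosely mirroring SourceService."""

    def __init__(self) -> None:
        self._providers: Dict[str, InMemoryProvider] = {}

    def add_provider(self, provider: InMemoryProvider) -> bool:
        # True if added, False if the ID is already registered.
        if provider.provider_id in self._providers:
            return False
        self._providers[provider.provider_id] = provider
        return True

    def query_transcriptions(
        self, provider_name: Optional[str] = None, limit: int = 100
    ) -> List[Dict[str, Any]]:
        # The limit applies per provider; each record is tagged with its owner.
        results: List[Dict[str, Any]] = []
        for provider in self._providers.values():
            if provider_name is not None and provider.provider_name != provider_name:
                continue
            for record in provider.query_records(limit):
                results.append({**record, "provider_id": provider.provider_id})
        return results


service = SketchSourceService()
added_a = service.add_provider(
    InMemoryProvider("a", "Plugin A", [{"record_id": "1"}, {"record_id": "2"}]))
added_b = service.add_provider(InMemoryProvider("b", "External", [{"record_id": "9"}]))
added_dup = service.add_provider(InMemoryProvider("a", "Duplicate", []))
all_records = service.query_transcriptions(limit=1)
```

Tagging each record with its `provider_id` is what allows a later lookup by the `(record_id, provider_id)` pair, as `get_transcription_by_id` requires.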
Import
from cjm_transcript_source_select.services.source import (
VALID_DB_EXTENSIONS,
TranscriptionDBProvider,
SourceService,
validate_and_toggle_external_db
)
Functions
def validate_and_toggle_external_db(
source_service: SourceService, # Source service for duplicate detection
path: str, # Path to the .db file
external_paths: List[str], # Current external database paths
valid_extensions: List[str] = None, # Valid file extensions (default: VALID_DB_EXTENSIONS)
) -> Tuple[List[str], Optional[str]]: # (updated_paths, error_message or None)
"Validate and toggle an external database path in the external paths list."
Classes
class TranscriptionDBProvider:
def __init__(
self,
db_path: str, # Path to SQLite database file
name: str, # Display name for this provider
provider_id: Optional[str] = None # Unique ID (defaults to db_path)
)
"SourceProvider for transcription SQLite databases."
def __init__(
self,
db_path: str, # Path to SQLite database file
name: str, # Display name for this provider
provider_id: Optional[str] = None # Unique ID (defaults to db_path)
)
"Initialize provider for a transcription database."
def provider_id(self) -> str: # Unique identifier
"Unique identifier for this provider instance."
def provider_name(self) -> str: # Display name
"Human-readable name for display."
def provider_type(self) -> str: # Provider category
"Provider type category."
def db_path(self) -> Path: # Database file path
"Path to the underlying database file."
def is_available(self) -> bool: # Whether database exists and is accessible
"Check if the database file exists and is accessible."
def validate_schema(self) -> Tuple[bool, str]: # (is_valid, error_message)
"Check if database has valid transcription schema."
def query_records(
self,
limit: int = 100 # Maximum records to return
) -> List[SourceRecord]: # List of source records
"Query transcription records from the database."
def get_source_block(
self,
record_id: str # Job ID to fetch
) -> Optional[SourceBlock]: # SourceBlock or None if not found
"Fetch a specific transcription as a SourceBlock."
def from_plugin(
cls,
meta: PluginMeta # Plugin metadata with manifest containing db_path
) -> Optional["TranscriptionDBProvider"]: # Provider or None if no valid db_path
"Create provider from plugin metadata."
def from_external_path(
cls,
path: str # Path to external database file
) -> Optional["TranscriptionDBProvider"]: # Provider or None if path invalid
"Create provider from an external database path."
class SourceService:
def __init__(
self,
plugin_manager: PluginManager, # Plugin manager for discovering plugin sources
source_categories: List[str] = None, # Plugin categories to query (default: ['transcription'])
external_paths: List[str] = None # External database paths
)
"Service for federated access to content sources via providers."
def __init__(
self,
plugin_manager: PluginManager, # Plugin manager for discovering plugin sources
source_categories: List[str] = None, # Plugin categories to query (default: ['transcription'])
external_paths: List[str] = None # External database paths
)
"Initialize the source service."
def add_provider(
self,
provider: SourceProvider # Provider instance to add
) -> bool: # True if added, False if ID already exists
"Add a source provider."
def remove_provider(
self,
provider_id: str # ID of provider to remove
) -> bool: # True if removed, False if not found
"Remove a source provider by ID."
def get_provider(
self,
provider_id: str # ID of provider to get
) -> Optional[SourceProvider]: # Provider or None if not found
"Get a provider by ID."
def get_providers(self) -> List[SourceProvider]: # List of all providers
"Get all registered providers."
def get_provider_by_name(
self,
name: str # Provider name to search for
) -> Optional[SourceProvider]: # Provider or None if not found
"Find a provider by its display name."
def has_provider_for_path(
self,
path: str # Path to check
) -> Tuple[bool, Optional[str]]: # (has_duplicate, existing_provider_name)
"Check if any provider uses the same resolved database path."
def add_plugin_providers(self) -> int: # Number of providers added
"Discover and add providers from loaded plugins."
def set_external_paths(
self,
paths: List[str] # List of external database paths to set
) -> None
"Set external database paths (replaces existing external providers)."
def add_external_path(
self,
path: str # External database path to add
) -> bool: # True if added, False if already exists or invalid
"Add an external database as a provider."
def remove_external_path(
self,
path: str # External database path to remove
) -> bool: # True if removed, False if not found
"Remove an external database provider."
def get_external_paths(self) -> List[str]: # List of external database paths
"Get list of external database paths."
def get_available_sources(self) -> List[Dict[str, Any]]: # List of source info dicts
"Get list of available sources (for UI display)."
def query_transcriptions(
self,
provider_name: Optional[str] = None, # Filter by provider name (None for all)
limit: int = 100 # Maximum number of results per provider
) -> List[Dict[str, Any]]: # List of transcription records
"Query records from all providers (or a specific one)."
def get_transcription_by_id(
self,
record_id: str, # Record ID to fetch
provider_id: str # Provider ID that owns this record
) -> Optional[SourceBlock]: # SourceBlock or None if not found
"Get a specific transcription as a SourceBlock."
def get_source_blocks(
self,
selections: List[Dict[str, str]] # List of {record_id, provider_id} dicts
) -> List[SourceBlock]: # Ordered list of SourceBlocks
"Fetch multiple records as SourceBlocks in order."
Variables
VALID_DB_EXTENSIONS = [3 items]
source_browser (source_browser.ipynb)
Source browser components for displaying and filtering transcription sources
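The browser feeds its virtual collection a flat list in which each group header is followed by that group's records. A rough sketch of that flattening, assuming each record dict carries `record_id`, `provider_id`, and the field named by the grouping mode: `SketchItem` and `build_items` are hypothetical stand-ins for `SourceBrowserItem` and `build_source_items`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SketchItem:
    """Trimmed, hypothetical version of SourceBrowserItem."""
    item_type: str                            # "header" or "record"
    group_key: str = ''
    group_count: int = 0
    record: Optional[Dict[str, Any]] = None
    is_selected: bool = False


def build_items(
    transcriptions: List[Dict[str, Any]],
    selected_sources: List[Dict[str, str]],
    grouping_mode: str = "media_path",
) -> List[SketchItem]:
    """Group records by the chosen field, then emit header + records per group."""
    selected = {(s["record_id"], s["provider_id"]) for s in selected_sources}
    groups: Dict[str, List[Dict[str, Any]]] = {}
    for rec in transcriptions:
        groups.setdefault(str(rec.get(grouping_mode, "")), []).append(rec)

    items: List[SketchItem] = []
    for key, recs in groups.items():  # dicts preserve first-seen group order
        items.append(SketchItem("header", group_key=key, group_count=len(recs)))
        for rec in recs:
            is_sel = (rec["record_id"], rec["provider_id"]) in selected
            items.append(SketchItem("record", group_key=key, record=rec, is_selected=is_sel))
    return items


records = [
    {"record_id": "1", "provider_id": "a", "media_path": "/m/x.mp3"},
    {"record_id": "2", "provider_id": "b", "media_path": "/m/x.mp3"},
    {"record_id": "3", "provider_id": "a", "media_path": "/m/y.mp3"},
]
items = build_items(records, [{"record_id": "2", "provider_id": "b"}])
item_types = [i.item_type for i in items]
```

A flat list keeps the virtual collection simple: it only needs an index range to render, and a predicate such as `is_source_item_skippable` can tell the keyboard cursor to pass over header rows.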
Import
from cjm_transcript_source_select.components.source_browser import (
SOURCE_BROWSER_COLUMNS,
SB_SYSTEM_ID,
SourceBrowserItem,
build_source_items,
is_source_item_skippable,
create_source_cell_renderer,
render_source_empty
)
Functions
def _render_grouping_selector(
grouping_mode: str, # Current grouping mode: "media_path" or "batch_id"
grouping_change_url: str, # URL for changing grouping mode
) -> Any: # Grouping selector component
"Render the dropdown for selecting grouping mode."
def build_source_items(
transcriptions: List[Dict[str, Any]], # Available transcription records
selected_sources: List[Dict[str, str]], # Currently selected sources
grouping_mode: str = "media_path", # Grouping mode: "media_path" or "batch_id"
) -> List[SourceBrowserItem]: # Flat list with interleaved headers and records
"Build the items list for the source browser virtual collection."
def is_source_item_skippable(
item: SourceBrowserItem, # Item to check
) -> bool: # True if item is a group header (cursor should skip)
"Predicate for virtual collection is_skippable parameter."
def _render_header_cell(
item: SourceBrowserItem, # Header item
ctx: CellRenderContext, # Cell render context
select_all_url: str = "", # URL for selecting all in group
) -> Any: # Cell content for a group header row
"Render cell content for a group header item."
def _render_record_cell(
item: SourceBrowserItem, # Record item
ctx: CellRenderContext, # Cell render context
toggle_url: str = "", # URL for toggling source selection
) -> Any: # Cell content for a data record row
"Render cell content for a data record item."
def create_source_cell_renderer(
toggle_url: str = "", # URL for toggling source selection
select_all_url: str = "", # URL for selecting all in a group
) -> Callable: # render_cell(item: SourceBrowserItem, ctx: CellRenderContext) -> Any
"Create a render_cell callback for the source browser virtual collection."
def render_source_empty() -> Any: # Empty state component
"Render empty state when no transcription sources are available."
def _render_source_browser_vc_content(
sb_state: Any, # SourceBrowserRouterState from routes.source_browser
) -> Any: # VC content wrapper (without search/grouping header)
"Render the VC content portion of the source browser."
def _render_source_browser_vc(
sb_state: Any, # SourceBrowserRouterState from routes.source_browser
filter_url: str = "", # URL for filtering sources
grouping_mode: str = "media_path", # Current grouping mode
grouping_change_url: str = "", # URL for changing grouping mode
) -> Any: # Source browser component with virtual collection
"Render the full source browser panel (header + VC content)."
Classes
@dataclass
class SourceBrowserItem:
"Item in the source browser virtual collection (header or record)."
item_type: str # "header" or "record"
group_key: str = '' # Group key (media_path or batch_id value)
group_display: str = '' # Formatted display text for group header
group_count: int = 0 # Number of records in this group
grouping_mode: str = '' # Grouping mode used ("media_path" or "batch_id")
record: Optional[Dict[str, Any]] = None # Original transcription record dict
is_selected: bool = False # Whether currently in queue
Variables
SOURCE_BROWSER_COLUMNS
_SB_CONTENT_ID = 'sb-content'
_SB_VC_WRAPPER_ID = 'sb-vc-wrapper'
SB_SYSTEM_ID = 'sb-collection'
source_browser (source_browser.ipynb)
Source browser virtual collection router for Phase 1 selection
Import
from cjm_transcript_source_select.routes.source_browser import (
SourceBrowserRouterState,
init_source_browser_router
)
Functions
def init_source_browser_router(
source_service: SourceService, # Source service for querying transcriptions
urls: SelectionUrls, # URL bundle (toggle, select_all, filter, grouping_change)
prefix: str = "/browser", # Route prefix for VC routes
) -> SourceBrowserRouterState: # Router state with all VC objects and helpers
"Initialize the source browser virtual collection router."
Classes
@dataclass
class SourceBrowserRouterState:
"Return value from init_source_browser_router."
router: APIRouter # VC routes (nav, focus, activate, sort, viewport)
urls: VirtualCollectionUrls # VC URL bundle
ids: VirtualCollectionHtmlIds # VC HTML IDs
btn_ids: VirtualCollectionButtonIds # VC keyboard button IDs
config: VirtualCollectionConfig # VC config
state: VirtualCollectionState # VC state (mutable)
items: List[SourceBrowserItem] # Shared items list (mutable)
render_cell: Callable # Cell render callback
rebuild_and_render: Callable # (transcriptions, selected_sources, grouping_mode, content_only) -> Div
rebuild_items: Callable # (transcriptions, selected_sources, grouping_mode) -> None
sync_items_selection: Callable # (selected_sources) -> None
get_visible_checkbox_oobs: Callable # () -> tuple of OOB elements
get_checkbox_oob_for: Callable # (record_id, provider_id) -> OOB element or None
get_vc_row_id_for: Callable # (record_id, provider_id) -> str or None
source_utils (source_utils.ipynb)
Source record operations for metadata extraction, grouping, and validation
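The selection helpers in this module work on an ordered list of `{record_id, provider_id}` dicts. The sketch below illustrates how toggling and neighbor-swap reordering on such a list might look. `toggle_sketch` and `move_sketch` are hypothetical names, not the package's functions, and two behaviors are assumptions: a new list is returned rather than mutating the input, and a move past either end is a no-op.

```python
from typing import Dict, List

Selection = Dict[str, str]  # {"record_id": ..., "provider_id": ...}


def _index_of(selected: List[Selection], record_id: str, provider_id: str) -> int:
    """Return the position of the (record_id, provider_id) pair, or -1."""
    for i, s in enumerate(selected):
        if s.get("record_id") == record_id and s.get("provider_id") == provider_id:
            return i
    return -1


def toggle_sketch(record_id: str, provider_id: str, selected: List[Selection]) -> List[Selection]:
    """Remove the pair if present, otherwise append it to the end."""
    idx = _index_of(selected, record_id, provider_id)
    if idx >= 0:
        return selected[:idx] + selected[idx + 1:]
    return selected + [{"record_id": record_id, "provider_id": provider_id}]


def move_sketch(
    selected: List[Selection], record_id: str, provider_id: str, direction: str
) -> List[Selection]:
    """Swap the item with its neighbor; unknown items or edge moves change nothing."""
    idx = _index_of(selected, record_id, provider_id)
    target = idx - 1 if direction == "up" else idx + 1
    if idx < 0 or direction not in ("up", "down") or not 0 <= target < len(selected):
        return list(selected)
    result = list(selected)
    result[idx], result[target] = result[target], result[idx]
    return result


queue: List[Selection] = []
queue = toggle_sketch("r1", "p1", queue)
queue = toggle_sketch("r2", "p1", queue)
queue = move_sketch(queue, "r2", "p1", "up")
print([s["record_id"] for s in queue])
```

Matching on the pair rather than on `record_id` alone matters when two providers can issue the same record ID.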
Import
from cjm_transcript_source_select.services.source_utils import (
extract_batch_id,
extract_model_name,
group_transcriptions,
group_transcriptions_by_audio,
is_source_selected,
get_selected_media_paths,
filter_transcriptions,
select_all_in_group,
toggle_source_selection,
reorder_item,
reorder_sources,
calculate_next_tab,
check_audio_exists,
validate_browse_path
)
Functions
def extract_batch_id(
metadata: Any # Metadata dict or JSON string
) -> str: # Batch ID or "No Batch ID"
"Extract batch_id from transcription metadata."
def extract_model_name(
metadata: Any # Metadata dict or JSON string
) -> str: # Formatted model name for display
"Extract and format model name from transcription metadata."
def group_transcriptions(
transcriptions: List[Dict[str, Any]], # List of transcription records
group_by: str = "media_path" # Grouping mode: "media_path" or "batch_id"
) -> Dict[str, List[Dict[str, Any]]]: # Grouped transcriptions
"Group transcription records by the specified field."
def group_transcriptions_by_audio(
transcriptions: List[Dict[str, Any]] # List of transcription records
) -> Dict[str, List[Dict[str, Any]]]: # Grouped by media_path
"Group transcription records by their source audio file."
def is_source_selected(
record_id: str, # Record ID to check
provider_id: str, # Provider ID to check
selected_sources: List[Dict[str, str]] # List of selected sources
) -> bool: # True if source is selected
"Check if a source is in the selected list by (record_id, provider_id) pair."
def get_selected_media_paths(
selected_sources: List[Dict[str, str]], # Current selections (record_id, provider_id)
all_transcriptions: List[Dict[str, Any]], # All available transcription records
) -> Set[str]: # Media paths already represented in selections
"Get the set of media_paths for currently selected sources."
def filter_transcriptions(
transcriptions: List[Dict[str, Any]], # List of transcription records to filter
search_text: str, # Search term for case-insensitive substring matching
) -> List[Dict[str, Any]]: # Filtered transcription records
"Filter transcriptions by substring match across record_id, media_path, and text fields."
def select_all_in_group(
transcriptions: List[Dict[str, Any]], # All transcription records
group_key: str, # Group key to match against
grouping_mode: str, # Grouping mode: "media_path" or "batch_id"
selected_sources: List[Dict[str, str]], # Current selections
excluded_media_paths: Optional[Set[str]] = None, # Media paths to skip (already selected)
) -> List[Dict[str, str]]: # Updated selections with new items appended
"Add all transcriptions matching a group key to the selection list, skipping duplicates."
def toggle_source_selection(
record_id: str, # Record ID to toggle
provider_id: str, # Plugin name for the source
selected_sources: List[Dict[str, str]], # Current selections
) -> List[Dict[str, str]]: # Updated selections
"Toggle a source in or out of the selection list by (record_id, provider_id) pair."
def reorder_item(
selected_sources: List[Dict[str, str]], # Current selections
record_id: str, # Record ID of item to move
provider_id: str, # Provider ID of item to move
direction: str, # Direction: "up" or "down"
) -> List[Dict[str, str]]: # Reordered selections
"Move an item up or down in the selection list by swapping with its neighbor."
def reorder_sources(
selected_sources: List[Dict[str, str]], # Current selections
new_order_ids: List[str], # Job IDs in desired order
) -> List[Dict[str, str]]: # Reordered selections
"Reorder sources to match the given job ID order."
def calculate_next_tab(
direction: str, # Direction: "prev", "next", or a direct tab name
current_tab: str, # Currently active tab name
tabs: List[str], # Available tab names in order
) -> str: # New active tab name
"Calculate the next tab based on direction or direct selection."
def check_audio_exists(
media_path: str # Path to audio file
) -> bool: # True if file exists
"Check if the audio file exists at the given path."
def validate_browse_path(
path: str # Path to validate
) -> str: # Validated and resolved path, or home directory on error
"Validate a browse path for security. Returns home directory on invalid input."
step_renderer (step_renderer.ipynb)
Phase 1 step renderer: Source Selection & Ordering with two-column layout and collapsible preview
Import
from cjm_transcript_source_select.components.step_renderer import (
SD_TAB_PREV_BTN,
SD_TAB_NEXT_BTN,
SD_PREVIEW_BTN,
FB_SYSTEM_ID,
render_selection_step
)
Functions
def _create_parent_keyboard_manager() -> ZoneManager: # Parent keyboard manager for hierarchy
"Create the parent keyboard manager with two ghost zones for column switching."
def _render_selection_stats(
selected_sources: List[Dict[str, str]], # Selected sources
transcriptions: List[Dict[str, Any]], # All transcriptions (for word count)
oob: bool = False, # Whether to render as OOB swap
) -> Any: # Stats component
"Render the selection statistics (word count and source count)."
def _render_selection_footer(
selected_sources: List[Dict[str, str]], # Selected sources
transcriptions: List[Dict[str, Any]], # All transcriptions (for word count)
) -> Any: # Footer component
"Render the footer with statistics and continue button."
def _render_tab_headers(
active_tab: str, # Currently active tab ('db' or 'files')
tab_switch_url: str = "", # URL for switching tabs via HTMX
oob: bool = False, # Whether to render as OOB swap
) -> Any: # Tab headers container
"Render the tab header radio inputs."
def _render_source_tabs(
active_tab: str, # Currently active tab ('db' or 'files')
active_content: Any, # Content for the currently active tab
tab_switch_url: str = "", # URL for switching tabs via HTMX
) -> Any: # Tabs header + separate content container
"Render source type tabs with a single shared content container."
def _generate_hierarchy_js(
active_tab: str, # Active tab: "db" or "files"
) -> Script: # Script element with hierarchy wiring and activation logic
"Generate JavaScript for keyboard system hierarchy and child activation."
def render_selection_step(
sources: List[Dict[str, Any]], # Available source plugins
transcriptions: List[Dict[str, Any]], # Available transcription records
selected_sources: List[Dict[str, str]], # Ordered selection
grouping_mode: str, # Grouping mode: "media_path" or "batch_id"
active_tab: str, # Active tab: "db" or "files"
urls: SelectionUrls, # URL bundle for selection routes
render_local_files_panel: Optional[Callable] = None, # Render fn for Files tab content
sb_state: Any = None, # SourceBrowserRouterState for DB tab VC rendering
) -> Any: # FastHTML component
"Render Phase 1: Source Selection & Ordering step with two-column layout."
Variables
SD_TAB_PREV_BTN = 'sd-tab-prev-btn'
SD_TAB_NEXT_BTN = 'sd-tab-next-btn'
SD_PREVIEW_BTN = 'sd-preview-btn'
FB_SYSTEM_ID = 'lfb-collection'
_ZONE_FOCUS_CLASSES
_VIEWPORT_FIT_CONFIG
tabs (tabs.ipynb)
Tab switching route handlers
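The tab handlers accept a direction of "prev", "next", or a direct tab name such as "db" or "files". The sketch below shows one way such a direction could be resolved to a tab name. `next_tab_sketch` is a hypothetical function, not the package's `calculate_next_tab`, and wrap-around at either end and the fallback to the current tab on unrecognized input are assumptions.

```python
from typing import List


def next_tab_sketch(direction: str, current_tab: str, tabs: List[str]) -> str:
    """Resolve a direction ("prev"/"next") or a direct tab name to a tab."""
    if direction in tabs:
        # Direct selection: the direction is itself a tab name.
        return direction
    if direction not in ("prev", "next") or current_tab not in tabs:
        # Unrecognized input: stay where we are.
        return current_tab
    step = 1 if direction == "next" else -1
    # Assumption: stepping past either end wraps around.
    return tabs[(tabs.index(current_tab) + step) % len(tabs)]


demo_tabs = ["db", "files"]
print(next_tab_sketch("next", "db", demo_tabs))
print(next_tab_sketch("files", "db", demo_tabs))
```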
Import
from cjm_transcript_source_select.routes.tabs import (
init_tabs_router
)
Functions
def _handle_tab_switch(
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
direction: str, # Direction: "prev", "next", "db", or "files"
urls: SelectionUrls, # URL bundle for rendering
current_tab_ref: List[str], # Mutable ref [current_tab] for closure-based tracking
render_local_files_panel: Optional[Callable] = None, # Render fn for Files tab
sb_state: Any = None, # SourceBrowserRouterState for DB tab VC rendering
state_store: WorkflowStateStore = None, # State store (for reading step state)
workflow_id: str = "", # Workflow ID (for reading step state)
): # Tuple of inner content, OOB tab headers, and tab switch script
"Switch between Plugin DB and Local Files tabs."
def init_tabs_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
prefix: str, # Route prefix (e.g., "/workflow/selection/tabs")
urls: SelectionUrls, # URL bundle for rendering
render_local_files_panel: Optional[Callable] = None, # Render fn for Files tab content
sb_state: Any = None, # SourceBrowserRouterState for DB tab VC rendering
) -> Tuple[APIRouter, Dict[str, Callable]]: # (router, route_dict)
"Initialize tab switching routes."
utils (utils.ipynb)
Display formatting and word counting utilities for the selection step
Import
from cjm_transcript_source_select.utils import (
count_words,
format_date,
format_audio_filename
)
Functions
def count_words(
text: str # Text to count words in
) -> int: # Word count
"Count the number of whitespace-delimited words in text."
def format_date(
created_at: str # ISO date string, Unix timestamp, or similar
) -> str: # Formatted date for display
"Format a date string for human-readable display (e.g., 'Jan 20, 2026')."
def format_audio_filename(
audio_path: str # Full path to audio file
) -> str: # Shortened filename for display
"Extract and format the filename from a path."
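For orientation, the three utilities above could behave roughly as in the following standard-library sketch. The `*_sketch` functions are hypothetical and not the package's code. Several behaviors are assumptions: numeric strings are read as Unix timestamps in seconds (UTC), unparseable dates are returned unchanged, and the "shortened" filename is simply the path's final component.

```python
from datetime import datetime, timezone
from pathlib import Path


def count_words_sketch(text: str) -> int:
    """Count whitespace-delimited words; str.split() collapses runs of whitespace."""
    return len(text.split())


def format_date_sketch(created_at: str) -> str:
    """Format an ISO date string or a numeric Unix timestamp as e.g. 'Jan 20, 2026'."""
    try:
        # Assumption: numeric strings are Unix timestamps in seconds, read as UTC.
        dt = datetime.fromtimestamp(float(created_at), tz=timezone.utc)
    except ValueError:
        try:
            dt = datetime.fromisoformat(created_at)
        except ValueError:
            # Assumption: unparseable input is shown as-is.
            return created_at
    return dt.strftime("%b %d, %Y")


def format_audio_filename_sketch(audio_path: str) -> str:
    """Return only the final path component."""
    return Path(audio_path).name


print(count_words_sketch("hello  world\nagain"))
print(format_date_sketch("2026-01-20T10:30:00"))
print(format_audio_filename_sketch("/recordings/ep1.mp3"))
```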