Resource Validation

Validate resource availability before job execution and determine the appropriate action.

Validation Actions

The validation system returns different actions based on resource availability:


ValidationAction

 ValidationAction (value, names=None, module=None, qualname=None,
                   type=None, start=1, boundary=None)

Actions that can be taken based on validation results.
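A minimal sketch of how a caller might branch on an action. The enum below is a stand-in defined only for this illustration: `PROCEED`, `ABORT`, and `WAIT_FOR_JOB` are the members used in the examples in this section, the `"proceed"` value matches the printed output of the first example, and the other two values are assumptions. The real `ValidationAction` may define more members.

```python
from enum import Enum


class SketchAction(Enum):
    """Stand-in for ValidationAction (hypothetical, not the library's definition)."""
    PROCEED = "proceed"
    ABORT = "abort"                # assumed value
    WAIT_FOR_JOB = "wait_for_job"  # assumed value


def describe(action: SketchAction) -> str:
    """Map an action to a short description of what the caller does next."""
    if action is SketchAction.PROCEED:
        return "start the job"
    if action is SketchAction.WAIT_FOR_JOB:
        return "retry once the running job finishes"
    if action is SketchAction.ABORT:
        return "stop and report the problem"
    return "unhandled action"


for action in SketchAction:
    print(f"{action.value}: {describe(action)}")
```

Comparing enum members with `is` keeps the branching independent of the string values, which matters here because two of those values are guesses.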

Validation Result

The result object contains all information needed to make decisions about job execution:


ValidationResult

 ValidationResult (action:__main__.ValidationAction, can_proceed:bool,
                   message:str,
                   conflict:Optional[cjm_fasthtml_resources.core.manager.ResourceConflict]=None,
                   current_worker:Optional[cjm_fasthtml_resources.core.manager.WorkerState]=None,
                   plugin_name_to_reload:Optional[str]=None,
                   new_config:Optional[Dict[str,Any]]=None)

Result of resource validation.

# Example: Creating a validation result
result = ValidationResult(
    action=ValidationAction.PROCEED,
    can_proceed=True,
    message="GPU available. Proceeding with job."
)

print(f"Action: {result.action.value}")
print(f"Can proceed: {result.can_proceed}")
print(f"Message: {result.message}")

Action: proceed
Can proceed: True
Message: GPU available. Proceeding with job.

Resource Validation Function

This is the main validation function that coordinates resource checks and returns appropriate actions.

The validation logic handles several scenarios:

  1. CPU-based plugins: Check system memory availability
  2. API-based plugins: Skip resource validation
  3. GPU-based plugins: Check GPU availability and handle conflicts
  4. Plugin switching: Detect when plugins need to be reloaded
  5. Resource conflicts: Identify conflicts with app workers or external processes
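The scenarios above can be pictured as a small decision function. This is a simplified sketch and not the library's implementation: `SketchState`, `sketch_decide`, every field name, and the returned strings are hypothetical, and the choice to abort when memory is short is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchState:
    """Hypothetical snapshot of the inputs a validator would inspect."""
    kind: str                              # "cpu", "api", or "gpu"
    free_memory_mb: int = 0                # system memory currently free
    needed_memory_mb: int = 0              # memory the plugin asks for
    gpu_owner: Optional[str] = None        # None, "app_worker", or "external"
    worker_busy: bool = False              # is the app worker running a job?
    loaded_resource: Optional[str] = None  # resource already loaded in the worker
    wanted_resource: Optional[str] = None  # resource the new job needs


def sketch_decide(s: SketchState) -> str:
    """Return a short label for the decision (labels are illustrative only)."""
    if s.kind == "api":
        return "proceed: no local resources to check"
    if s.kind == "cpu":
        if s.free_memory_mb >= s.needed_memory_mb:
            return "proceed: enough system memory"
        return "abort: not enough system memory"  # assumed outcome
    # GPU-based plugin from here on
    if s.gpu_owner == "external":
        return "conflict: GPU held by an external process"
    if s.gpu_owner == "app_worker":
        if s.worker_busy:
            return "wait: GPU in use by a running job"
        if s.loaded_resource != s.wanted_resource:
            return "reload: worker must switch plugins"
        return "proceed: resource already loaded"
    return "proceed: GPU is free"


print(sketch_decide(SketchState(kind="api")))
print(sketch_decide(SketchState(kind="gpu", gpu_owner="app_worker", worker_busy=True)))
```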

validate_resources_for_job

 validate_resources_for_job (resource_manager, plugin_registry,
                             get_plugin_resource_requirements,
                             compare_plugin_resources,
                             get_plugin_resource_identifier,
                             plugin_id:str,
                             plugin_config:Optional[Dict[str,Any]]=None,
                             worker_pid:Optional[int]=None,
                             worker_type:str='transcription',
                             verbose:bool=False)

Validate whether resources are available to run a job with the specified plugin. The helper functions are passed in as arguments (dependency injection), so the validator is not tied to a specific plugin registry implementation.

| Parameter | Type | Default | Details |
|---|---|---|---|
| resource_manager | | | ResourceManager instance |
| plugin_registry | | | Plugin registry protocol (has get_plugin, load_plugin_config methods) |
| get_plugin_resource_requirements | | | Function: (plugin_id, config) -> Dict with requirements |
| compare_plugin_resources | | | Function: (config1, config2) -> bool (same resource?) |
| get_plugin_resource_identifier | | | Function: (config) -> str (resource ID) |
| plugin_id | str | | Unique plugin ID |
| plugin_config | Optional | None | Plugin configuration (will load if not provided) |
| worker_pid | Optional | None | PID of the worker that will run the job (if known) |
| worker_type | str | transcription | Type of worker (e.g., "transcription", "llm", "ollama") |
| verbose | bool | False | Whether to print verbose logging |
| Returns | ValidationResult | | ValidationResult with action to take |

The validation function uses dependency injection to avoid tight coupling. You provide helper functions that know how to work with your specific plugin registry implementation.
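As a rough illustration, the three injected helpers could look like the sketch below. The call shapes follow the parameter descriptions for `validate_resources_for_job`; everything else is an assumption: the `"model_id"` key is borrowed from the `WorkerState` example later in this section, while the `"device"` key and the keys of the returned requirements dict are hypothetical.

```python
from typing import Any, Dict, Optional

Config = Optional[Dict[str, Any]]


def get_plugin_resource_identifier(config: Config) -> str:
    """Return a string naming the resource a config would load.

    Assumption: the resource is identified by a "model_id" key.
    """
    return str((config or {}).get("model_id", ""))


def compare_plugin_resources(config1: Config, config2: Config) -> bool:
    """True when both configs point at the same resource."""
    return get_plugin_resource_identifier(config1) == get_plugin_resource_identifier(config2)


def get_plugin_resource_requirements(plugin_id: str, config: Config = None) -> Dict[str, Any]:
    """Describe what the plugin needs (the returned key names are assumptions)."""
    device = str((config or {}).get("device", "cpu"))
    return {
        "plugin_id": plugin_id,
        "requires_gpu": device.startswith("cuda"),
        "resource": get_plugin_resource_identifier(config),
    }


large = {"model_id": "openai/whisper-large-v3", "device": "cuda"}
base = {"model_id": "openai/whisper-base", "device": "cpu"}

print(compare_plugin_resources(large, large))
print(compare_plugin_resources(large, base))
print(get_plugin_resource_requirements("whisper_large", large))
```

Helpers shaped like these would be passed as the `get_plugin_resource_requirements`, `compare_plugin_resources`, and `get_plugin_resource_identifier` arguments of `validate_resources_for_job`.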

Error Handling Integration

When the cjm-error-handling library is installed, you can convert ValidationResult objects into structured exceptions. This is useful when you want to raise errors instead of returning result objects.


validation_result_to_error

 validation_result_to_error (result:__main__.ValidationResult,
                             plugin_id:Optional[str]=None,
                             job_id:Optional[str]=None,
                             worker_pid:Optional[int]=None,
                             **extra_context)

Convert a ValidationResult into a structured error. Requires cjm-error-handling library.

| Parameter | Type | Default | Details |
|---|---|---|---|
| result | ValidationResult | | Validation result to convert |
| plugin_id | Optional | None | Plugin ID for error context |
| job_id | Optional | None | Job ID for error context |
| worker_pid | Optional | None | Worker PID for error context |
| extra_context | VAR_KEYWORD | | |
| Returns | Optional | | Structured error based on validation action, or None if no error needed |

Example: Converting ValidationResult to Errors

The following examples show how to use the error handling integration:

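# Example 1: ABORT case (plugin not found)
# `_has_error_handling` is assumed to be the module-level flag that is True
# when the cjm-error-handling library could be imported.
if _has_error_handling: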
    abort_result = ValidationResult(
        action=ValidationAction.ABORT,
        can_proceed=False,
        message="Plugin transcription_whisper_huge not found."
    )
    
    error = validation_result_to_error(
        abort_result,
        plugin_id="transcription_whisper_huge",
        job_id="job-123"
    )
    
    print("Example 1: ABORT -> ValidationError")
    print(f"  Error type: {type(error).__name__}")
    print(f"  Message: {error.get_user_message()}")
    print(f"  Retryable: {error.is_retryable}")
    print(f"  Severity: {error.severity.value}")
    print(f"  Context: plugin_id={error.context.plugin_id}, job_id={error.context.job_id}")
else:
    print("cjm-error-handling not installed - skipping example")

Example 1: ABORT -> ValidationError
  Error type: ValidationError
  Message: Plugin transcription_whisper_huge not found.
  Retryable: False
  Severity: error
  Context: plugin_id=transcription_whisper_huge, job_id=job-123

# Example 2: WAIT_FOR_JOB case (GPU busy with same worker type)
if _has_error_handling:
    # Simulate a worker state
    from cjm_fasthtml_resources.core.manager import WorkerState
    
    busy_worker = WorkerState(
        pid=54321,
        worker_type="transcription",
        job_id="job-current",
        plugin_id="whisper_large",
        plugin_name="whisper_large",
        loaded_plugin_resource="openai/whisper-large-v3",
        config={"model_id": "openai/whisper-large-v3"},
        status="running"
    )
    
    wait_result = ValidationResult(
        action=ValidationAction.WAIT_FOR_JOB,
        can_proceed=False,
        message="GPU in use by running job (PID 54321). Wait for completion or cancel job.",
        current_worker=busy_worker
    )
    
    error = validation_result_to_error(
        wait_result,
        plugin_id="whisper_base",
        job_id="job-456"
    )
    
    print("\nExample 2: WAIT_FOR_JOB -> ResourceError")
    print(f"  Error type: {type(error).__name__}")
    print(f"  Message: {error.get_user_message()}")
    print(f"  Retryable: {error.is_retryable}")
    print(f"  Resource type: {error.resource_type}")
    print(f"  Suggested action: {error.suggested_action}")
    print(f"  Context worker PID: {error.context.worker_pid}")
else:
    print("cjm-error-handling not installed - skipping example")

Example 2: WAIT_FOR_JOB -> ResourceError
  Error type: ResourceError
  Message: GPU in use by running job (PID 54321). Wait for completion or cancel job.
  Retryable: True
  Resource type: GPU
  Suggested action: Wait for current job to complete or cancel it
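  Context worker PID: 54321

# Example 3: Practical usage pattern - raise error if validation fails
# ValidationError and ResourceError are the cjm-error-handling exception classes;
# they are assumed to be imported already when `_has_error_handling` is True.
if _has_error_handling: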
    print("\nExample 3: Practical Usage Pattern")
    print("="*60)
    
    def start_job_with_validation(plugin_id, job_id, validation_result):
        """Example function showing how to use validation + errors together."""
        # Check if we can proceed
        if not validation_result.can_proceed:
            # Convert to error and raise
            error = validation_result_to_error(
                validation_result,
                plugin_id=plugin_id,
                job_id=job_id
            )
            raise error
        
        # Validation passed, proceed with job
        return f"Job {job_id} started successfully with {plugin_id}"
    
    # Test with ABORT case
    try:
        result = ValidationResult(
            action=ValidationAction.ABORT,
            can_proceed=False,
            message="Plugin not found"
        )
        start_job_with_validation("whisper_huge", "job-789", result)
    except ValidationError as e:
        print(f"Caught ValidationError: {e.get_user_message()}")
        print("  Action: Don't retry, fix the plugin ID")
    
    # Test with WAIT_FOR_JOB case (retryable)
    try:
        result = ValidationResult(
            action=ValidationAction.WAIT_FOR_JOB,
            can_proceed=False,
            message="GPU busy"
        )
        start_job_with_validation("whisper_base", "job-999", result)
    except ResourceError as e:
        print(f"\nCaught ResourceError: {e.get_user_message()}")
        print(f"  Retryable: {e.is_retryable}")
        print("  Action: Retry after GPU becomes available")
    
    print("\n" + "="*60)
else:
    print("cjm-error-handling not installed - skipping example")

Example 3: Practical Usage Pattern
============================================================
Caught ValidationError: Plugin not found
  Action: Don't retry, fix the plugin ID

Caught ResourceError: GPU busy
  Retryable: True
  Action: Retry after GPU becomes available

============================================================
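The "retry after GPU becomes available" step in Example 3 could be wrapped in a small retry loop. The sketch below is self-contained and does not use cjm-error-handling: `SketchRetryableError` is a hypothetical stand-in that only mirrors the `is_retryable` attribute printed in the examples, and `run_with_retry` is an illustrative helper, not part of either library.

```python
import time
from typing import Callable


class SketchRetryableError(Exception):
    """Stand-in for a structured error exposing `is_retryable` (hypothetical)."""

    def __init__(self, message: str, is_retryable: bool):
        super().__init__(message)
        self.is_retryable = is_retryable


def run_with_retry(start_job: Callable[[], str],
                   max_attempts: int = 3,
                   delay_s: float = 0.0) -> str:
    """Call start_job, retrying only while the raised error is retryable."""
    for attempt in range(1, max_attempts + 1):
        try:
            return start_job()
        except SketchRetryableError as err:
            # Give up on non-retryable errors or when attempts run out.
            if not err.is_retryable or attempt == max_attempts:
                raise
            time.sleep(delay_s)
    raise RuntimeError("max_attempts must be at least 1")


calls = {"n": 0}


def flaky_job() -> str:
    """Fails twice with a retryable error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise SketchRetryableError("GPU busy", is_retryable=True)
    return "job started"


print(run_with_retry(flaky_job))
```

In real use, a delay greater than zero (or a backoff schedule) would give the running job time to release the GPU; the zero default only keeps the demonstration fast.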