Empirical Resource Tracking

Persistent store for empirically-observed resource usage per (instance_id, config_hash) pair. CR-7’s data foundation — record_sample is called from PluginManager.execute_plugin* finally blocks; aggregates feed eviction-candidate selection + future UI hints + cost-aware retry decisions.

compute_config_hash

Canonical hash of a plugin instance’s effective config — the second component of the (instance_id, config_hash) compound key. Two distinct configs for the same instance get two distinct empirical records, because their resource profiles differ (e.g. Whisper at model=‘base’ vs Whisper at model=‘large-v3’). Reuses hash_dict_canonical from cjm_plugin_system.utils.hashing, the same primitive CR-8 uses for compute_config_schema_hash — so stability rules are uniform across the substrate.


compute_config_hash


def compute_config_hash(
    config:Optional, # Plugin's effective config dict (or None / empty)
)->str: # Hash string in "algo:hexdigest" format

CR-7: hash a plugin instance’s effective config for empirical-record keying.

Same canonicalization as CR-8’s compute_config_schema_hash — sorted keys, no whitespace, "sha256:hex" shape. None / empty configs hash deterministically to the canonical-empty value so plugins with no config still get a single record per instance rather than scattering across hash-of-None edge cases.

ResourceSample

One observation captured at the end of an execute call. Frozen because it’s an immutable record of what happened. Substrate aggregates samples online into a running EmpiricalResourceRecord and discards each sample after — no per-sample retention in v1.

Important v1 limitation: cpu_percent / memory_mb_peak / gpu_memory_mb_peak come from proxy.get_stats() at the moment we measure, not from continuous sampling during execute. They’re a proxy for peak (“what was the worker using when execute returned”), not the true peak (“what was the worker using at any moment during execute”). Worker-side peak tracking is a future enhancement when needed.


ResourceSample


def ResourceSample(
    cpu_percent:float, memory_mb_peak:float, gpu_memory_mb_peak:float, duration_seconds:float, success:bool,
    observed_at:datetime, api_usage:Optional=None
)->None:

Single observation captured after an execute call completes.

Frozen — substrate aggregates online via Welford’s algorithm; no need to keep raw samples around. observed_at is tz-aware per the CR-5 convention.

EmpiricalResourceRecord

Aggregated profile for a (instance_id, config_hash) pair. Welford-mean for CPU + duration; max-of-peaks AND running mean for memory metrics (the worst observation drives eviction-candidate selection; the mean smooths over noise for UI hints / capacity-planning).

success_rate = success_count / sample_count — exposed pre-computed; substrate doesn’t surface raw success_count / failure_count.


EmpiricalResourceRecord


def EmpiricalResourceRecord(
    instance_id:str, plugin_name:str, config_hash:str, sample_count:int, cpu_percent_mean:float,
    memory_mb_peak_max:float, memory_mb_peak_mean:float, gpu_memory_mb_peak_max:float, gpu_memory_mb_peak_mean:float,
    duration_seconds_mean:float, success_rate:float, last_observed:datetime, api_usage_totals:Dict=<factory>
)->None:

Aggregated empirical resource profile for a (instance_id, config_hash) pair.

EmpiricalResourceStore Protocol

Protocol for persisting empirical resource records. Mirrors PluginConfigStore from CR-2 / OQ-4 — substrate ships LocalEmpiricalResourceStore (SQLite) as the default; hosts pass an explicit implementation to constrain to in-memory (tests), workflow-scoped storage, in-process Redis, etc.

@runtime_checkable so isinstance(x, EmpiricalResourceStore) works for substrate code that wants to assert the seam without importing the concrete class.


EmpiricalResourceStore


def EmpiricalResourceStore(
    args:VAR_POSITIONAL, kwargs:VAR_KEYWORD
):

Protocol for persisting empirically-observed resource usage.

Implementations aggregate online (Welford for means, max-of-peaks for memory). No raw-sample retention required — v1 is one row per (instance_id, config_hash) pair with running aggregates. A future implementation can add a samples table if time-series queries become necessary.

LocalEmpiricalResourceStore

SQLite-backed default. Mirrors LocalPluginConfigStore — DB created lazily on first write, reads against a non-existent DB return None / empty rather than raising. Default path is ~/.cjm/empirical_resources.db (override via db_path=... in the constructor or via per-project cjm.yaml data_dir).

Welford’s algorithm: for each new sample x, update running mean via mean += (x - mean) / n where n is the new sample count. Trivial cost; numerically stable across long sample runs. We don’t track the variance accumulator M2 in v1 since no consumer exposes variance — adding it later is a schema-additive change.


LocalEmpiricalResourceStore


def LocalEmpiricalResourceStore(
    db_path:Optional=None
):

SQLite-backed default implementation of EmpiricalResourceStore.

Online Welford aggregation for means; max-of-peaks for memory metrics. success_rate computed at read time from success_count / sample_count. DB + schema created lazily on first write.


LocalEmpiricalResourceStore.record_sample


def record_sample(
    instance_id:str, # PluginInstance.instance_id
    plugin_name:str, # PluginInstance.plugin_name (denormalized for filtering)
    config_hash:str, # compute_config_hash(inst.config)
    sample:ResourceSample, # One observation
)->None:

Fold a sample into the running aggregate. Creates a new row on first call.

Welford update for each mean. Max-of-peaks for memory metrics. success_count incremented by 1 if sample.success else 0.


LocalEmpiricalResourceStore.get_record


def get_record(
    instance_id:str, config_hash:str
)->Optional:

Fetch the aggregated record for (instance_id, config_hash), or None.


LocalEmpiricalResourceStore.list_records


def list_records(
    plugin_name:Optional=None
)->List:

List all records, optionally filtered to a plugin.


LocalEmpiricalResourceStore.delete_record


def delete_record(
    instance_id:str, config_hash:str
)->bool:

Remove a record. Returns True if a row was deleted.

Tests