Manifest Format (v2.0)

Typed parser and writer for the nested v2.0 manifest layout, per CR-8 of the 2026-05-19 substrate audit. Substrate manifests transitioned from a flat top-level JSON object to a nested layout with four sections:

  • install: deployment-specific facts populated at install time.
  • code: code-derived facts refreshed by cjm-ctl regenerate-manifest.
  • drift_tracking: a config_schema hash that records the witness shape so live-vs-stored comparisons can detect drift.
  • overrides: an operator-supplied overlay placeholder.
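As a rough illustration of the four-section layout, the sketch below builds a manifest-shaped dict and renders it as JSON. The key names follow the section dataclasses documented on this page; every value is a made-up placeholder, and whether unset optional fields are omitted or written as null is an assumption not confirmed here (this sketch simply omits them).

```python
import json

# Illustrative only: all values are placeholders, not real substrate output.
example_manifest = {
    "format_version": "2.0",
    "install": {
        "python_path": "<path to env python>",
        "conda_env": "<env name>",
        "db_path": "<path to db>",
        "env_vars": {},
        "installed_at": "<timestamp>",
        "installer_version": "<version>",
        "package_source": "<source>",
    },
    "code": {
        "name": "example-plugin",
        "version": "<version>",
        "description": "",
        "module": "example_plugin.plugin",
        "class": "ExamplePlugin",  # JSON key is "class"; the dataclass field is class_name
        "interface": "<interface>",
        "config_schema": {"type": "object", "properties": {}},
    },
    "drift_tracking": {"config_schema_hash": "sha256:<hexdigest>"},
    "overrides": {},
}

# indent=2 matches the indentation write_manifest is documented to use.
rendered = json.dumps(example_manifest, indent=2)
print(rendered)
```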

Current format version

The substrate emits format_version: "2.0" on every freshly written manifest. The reader accepts both "2.0" manifests (nested layout) and legacy manifests (flat layout, no format_version key). Any other format_version value raises ValueError, so the substrate fails loudly rather than silently degrading.

Section dataclasses

Four dataclasses mirror the JSON layout one-to-one. CodeSection.class_name is the Python-side rename for the reserved-word class JSON key; the dict serializers below handle the rename at the boundary.
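The rename at the serialization boundary can be pictured with a trimmed-down stand-in. CodeSectionSketch, code_to_dict, and code_from_dict are hypothetical names invented for this example; the module's real serializers may be structured differently.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class CodeSectionSketch:
    """Hypothetical two-field stand-in for CodeSection."""
    name: str = ""
    class_name: str = ""  # "class" is a Python keyword, so the field is renamed


def code_to_dict(section: CodeSectionSketch) -> Dict[str, Any]:
    """Serialize, renaming class_name to the JSON key "class"."""
    data = asdict(section)
    data["class"] = data.pop("class_name")
    return data


def code_from_dict(data: Dict[str, Any]) -> CodeSectionSketch:
    """Parse, mapping the JSON key "class" back to class_name."""
    return CodeSectionSketch(
        name=data.get("name", ""),
        class_name=data.get("class", ""),
    )


section = CodeSectionSketch(name="example-plugin", class_name="ExamplePlugin")
as_json_dict = code_to_dict(section)
round_tripped = code_from_dict(as_json_dict)
```

Keeping the rename inside the two boundary functions means the rest of the code only ever sees class_name, while the JSON on disk only ever contains class.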


InstallSection


def InstallSection(
    python_path:str='', conda_env:str='', db_path:str='', env_vars:Dict=<factory>, installed_at:str='',
    installer_version:str='', package_source:str=''
)->None:

Deployment-specific facts populated at install time.

These fields are written by install_all (paths, conda env, env vars) plus _generate_manifest’s post-introspection step (installed_at, installer_version, package_source). regenerate-manifest preserves the install section across regeneration so paths survive code-side refreshes.


CodeSection


def CodeSection(
    name:str='', version:str='', description:str='', module:str='', class_name:str='', interface:str='',
    taxonomy:Optional=None, resources:Optional=None, config_schema:Optional=None, regenerated_at:Optional=None,
    worker_env:Optional=None
)->None:

Code-derived facts refreshed by cjm-ctl regenerate-manifest.

Everything in this section comes from running the introspection script inside the plugin’s conda env: metadata + interface + config_schema + derived taxonomy + binary platform/hardware hard-facts. Drift detection hashes this section’s config_schema field as its witness shape.

class_name serializes as the JSON key "class" (Python reserved-word workaround).


DriftTracking


def DriftTracking(
    config_schema_hash:Optional=None
)->None:

Witness hashes for drift detection.

config_schema_hash is computed at write time (regenerate-manifest / install_all) from a canonical JSON encoding of the code section’s config_schema. The PluginManager’s drift check fetches the live /config_schema from the worker, hashes it the same way, and compares the two digests; on a mismatch it sets PluginMeta.config_schema_drift = True and logs a warning.


ManifestV2


def ManifestV2(
    install:InstallSection=<factory>, code:CodeSection=<factory>, drift_tracking:DriftTracking=<factory>,
    overrides:Dict=<factory>, format_version:str='2.0'
)->None:

Top-level v2.0 manifest with four named sections plus format_version.

Loaded from a v2.0 nested JSON file as-is, or from a v1.0 flat file via _from_v1_flat_dict (REMOVE-AFTER-OVERHAUL shim). When the v1.0 shim fires, format_version is set to "1.0" so substrate code can distinguish legacy loads from fresh writes; otherwise format_version is always CURRENT_FORMAT_VERSION.

Config-schema hashing

compute_config_schema_hash canonicalizes the schema (sorted keys, no whitespace) before hashing so the digest is stable across Python versions and dict-insertion orders. It reuses cjm_plugin_system.utils.hashing.hash_bytes for the algorithm-tagged "sha256:hex" return shape that the rest of the ecosystem already uses (graph plugin, future bundle library).


compute_config_schema_hash


def compute_config_schema_hash(
    schema:Optional, # JSON Schema or None
)->str: # "sha256:hexdigest"

Hash a JSON Schema with stable canonicalization.

None is treated as {} — the hash records “no schema declared” rather than refusing. This way a plugin that lost its config_schema between install and load still gets a drift warning rather than a crash.
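The canonicalize-then-hash idea can be sketched with the standard library alone. sketch_config_schema_hash is a hypothetical stand-in: the real function delegates to hash_bytes, whose exact byte encoding is not shown here, so digests from this sketch should not be assumed to match the substrate's.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def sketch_config_schema_hash(schema: Optional[Dict[str, Any]]) -> str:
    """Hash a schema after canonical JSON encoding; None is treated as {}."""
    effective = {} if schema is None else schema
    # sort_keys removes dict-insertion-order effects; the compact separators
    # remove all optional whitespace from the encoding.
    canonical = json.dumps(effective, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


# Same content in a different key order yields the same digest.
first = sketch_config_schema_hash({"type": "object", "properties": {"a": {}}})
second = sketch_config_schema_hash({"properties": {"a": {}}, "type": "object"})
```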

Read path

load_manifest(path) is the public entry point. It detects the on-disk format from the top-level format_version key:

  • "2.0" → parse the nested sections directly (_from_v2_dict).
  • missing → legacy flat manifest; pass through _from_v1_flat_dict (REMOVE-AFTER-OVERHAUL tagged).
  • anything else → ValueError (fail loud on unrecognized future formats).

Both parsers return a ManifestV2. The downstream code path never sees a flat dict again — only typed sections.
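The three-way dispatch on format_version can be sketched as below. detect_layout and its return labels are hypothetical; the real load_manifest goes on to call _from_v2_dict or _from_v1_flat_dict rather than returning a label.

```python
from typing import Any, Dict


def detect_layout(raw: Dict[str, Any]) -> str:
    """Classify a parsed manifest dict by its top-level format_version key."""
    if "format_version" not in raw:
        # No key at all: legacy flat manifest, handled by the v1.0 shim.
        return "legacy-flat"
    version = raw["format_version"]
    if version == "2.0":
        return "nested-v2"
    # Unknown value: refuse rather than guess at the layout.
    raise ValueError(f"Unrecognized manifest format_version: {version!r}")
```

Checking for the key's absence first keeps the legacy case distinct from an explicit but unknown version, which is the case that must raise.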


load_manifest


def load_manifest(
    path:Union, # Path to manifest JSON file on disk
)->ManifestV2: # Parsed manifest in v2.0 typed shape

Load a manifest file and return a typed ManifestV2.

Format detection uses the top-level format_version key:

  • "2.0" → nested layout, parse directly.
  • missing → legacy flat layout, pass through the v1.0 shim.
  • anything else → ValueError.

Write path

write_manifest(path, m) always emits v2.0 layout regardless of how m was loaded — loading a legacy flat manifest and re-writing it transparently upgrades the file. cascade_manifests.py uses this property to bulk-migrate manifests via a read-then-write loop.

manifest_to_dict(m) is the underlying serializer; exposed separately so callers that need the dict (cjm-ctl validate, tests) can pull it without going through disk.
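A read-then-write migration loop of the kind described for cascade_manifests.py might look like the sketch below. migrate_manifests is a hypothetical helper, not the script's actual code; it takes the load and write functions as parameters so the sketch stays self-contained.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, List


def migrate_manifests(
    paths: Iterable[Path],
    load: Callable[[Path], Any],
    write: Callable[[Path, Any], None],
) -> List[Path]:
    """Rewrite each manifest in place by loading it and writing it back."""
    migrated: List[Path] = []
    for path in paths:
        manifest = load(path)   # legacy or v2.0 input, typed result
        write(path, manifest)   # documented to always emit the v2.0 layout
        migrated.append(path)
    return migrated
```

With the API documented on this page, the call would presumably be migrate_manifests(paths, load_manifest, write_manifest), since write_manifest takes the path first and the manifest second.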


manifest_to_dict


def manifest_to_dict(
    m:ManifestV2, # Manifest to serialize
)->Dict: # v2.0 nested dict ready for `json.dumps`

Serialize a ManifestV2 to a v2.0 dict.

Always emits format_version == CURRENT_FORMAT_VERSION — even if the manifest was loaded from a legacy v1.0 file. This is the upgrade seam: load-then-write transparently rewrites flat manifests as nested.


write_manifest


def write_manifest(
    path:Union, # Output JSON file path
    manifest:ManifestV2, # Manifest to serialize
)->None:

Serialize a ManifestV2 to disk in v2.0 nested layout (indent=2).

Tests