# GPU model release


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

`release_model` is the single source of truth for the move-to-CPU / del
/ gc / `empty_cache` / `synchronize` sequence that every torch GPU
capability reimplements (Whisper, Voxtral-HF, Qwen3-FA, Demucs, LavaSR).
Capabilities call it from `_release_<trigger>` (CR-4 reconfigure),
`on_disable` (CR-2), and `cleanup`.

It is **best-effort**: per-attribute `.to('cpu')` failures and
CUDA-cleanup failures are logged and never raised, so teardown can’t
mask the original code path. Objects without a `.to` method (processors,
tokenizers) are dropped without the CPU move.

------------------------------------------------------------------------

### release_model

``` python

def release_model(
    obj:Any, # The capability instance holding the model attribute(s)
    model_attr_names:List, # Names of the attributes to release, in release order
    device:str='cuda', # Device the model is on; gates the CUDA-specific cleanup
    logger:Logger, # Logger for best-effort failure reporting
)->None:

```

*Release one or more model objects: move to CPU, drop references, gc,
free CUDA cache.*

For each name in `model_attr_names`, if `obj` has a non-None
attribute: 1. when on CUDA, best-effort `.to('cpu')` (frees GPU tensors;
skipped for objects without a `.to` method, e.g. processors/tokenizers),
2. `setattr(obj, name, None)` and drop the local reference. Then a
single `gc.collect()` and — on CUDA — `empty_cache()` + `synchronize()`.

Best-effort throughout: failures are logged and swallowed. Missing or
already-None attributes are skipped, so the call is idempotent.

Tests.

``` python
import logging as _logging
_log = _logging.getLogger("cjm_substrate_torch_utils.test")


class _FakeModule:
    def __init__(self): self.moved_to = None
    def to(self, d): self.moved_to = d; return self


class _FakeProcessor:  # no .to() -- like an AutoProcessor / tokenizer
    pass


class _Holder:
    pass


# device="cuda": .to('cpu') attempted on objects that have it; attrs cleared.
_h = _Holder(); _h.model = _FakeModule(); _h.processor = _FakeProcessor()
_m = _h.model
release_model(_h, ["model", "processor"], device="cuda", logger=_log)
assert _h.model is None and _h.processor is None
assert _m.moved_to == "cpu"

# Missing / already-None attrs are skipped; idempotent (no error on re-call).
release_model(_h, ["model", "processor", "does_not_exist"], device="cuda", logger=_log)
assert _h.model is None

# device="cpu": no .to('cpu') attempted.
_h2 = _Holder(); _h2.model = _FakeModule(); _m2 = _h2.model
release_model(_h2, ["model"], device="cpu", logger=_log)
assert _h2.model is None
assert _m2.moved_to is None
```
