Domain schemas for content structure (Document, Segment)
Document
Represents a logical container for content such as a book chapter, podcast episode, lecture, or transcript. Documents group related segments into a traversable unit.
Segment
Represents an atomic unit of text within a document, typically a sentence or paragraph. Segments are linked sequentially via NEXT edges to form a traversable “narrative spine”.
For audio/video content, optional timing fields (start_time, end_time) enable alignment with the source media.
# Create a Segment node
segment = Segment(
    text="The art of war is of vital importance to the state.",
    index=2,
    start_time=5.2,
    end_time=8.7,
    role="content"
)
print(f"Segment [{segment.index}]: {segment.text}")
print(f"Timing: {segment.start_time}s - {segment.end_time}s")
Segment [2]: The art of war is of vital importance to the state.
Timing: 5.2s - 8.7s
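The timing fields make it possible to map a playback position back to the segment being spoken. The following is a minimal sketch of such a lookup, assuming segments are sorted by start_time and do not overlap. `TimedSegment` and `segment_at` are hypothetical stand-ins written for illustration, not part of the library, and the timings of the first segment are made up.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedSegment:
    """Hypothetical stand-in for a Segment that carries timing fields."""
    text: str
    index: int
    start_time: float  # seconds from the start of the media
    end_time: float    # seconds from the start of the media


def segment_at(segments: List[TimedSegment], t: float) -> Optional[TimedSegment]:
    """Return the segment whose [start_time, end_time) span contains t, if any.

    Assumes segments are sorted by start_time and do not overlap.
    """
    starts = [s.start_time for s in segments]
    pos = bisect_right(starts, t) - 1  # last segment starting at or before t
    if pos < 0:
        return None  # t falls before the first segment
    candidate = segments[pos]
    # t may land in a silent gap after the candidate has ended
    return candidate if t < candidate.end_time else None


# Illustrative timeline; only the 5.2-8.7 span comes from the example above
timeline = [
    TimedSegment("Sun Tzu said,", 1, 3.9, 5.0),
    TimedSegment("The art of war is of vital importance to the state.", 2, 5.2, 8.7),
]
hit = segment_at(timeline, 6.0)
print(hit.index if hit else None)
```

Treating the span as half-open means a timestamp that equals end_time is not matched, and a timestamp in the gap between two segments returns None rather than the nearest neighbour.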
# Convert to GraphNode (name auto-populated from text, truncated to 50 chars)
graph_node = segment.to_graph_node()
print(f"GraphNode name: '{graph_node.properties['name']}'")
assert len(graph_node.properties['name']) <= 50
GraphNode name: 'The art of war is of vital importance to the state'
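The 51-character sentence loses its trailing period because only the first 50 characters are kept. Plain slicing reproduces this behaviour. This is a sketch of one way such truncation might work, not the library's actual implementation; `make_display_name` and its `max_len` parameter are hypothetical.

```python
def make_display_name(text: str, max_len: int = 50) -> str:
    """Hypothetical helper: keep at most max_len characters of text."""
    return text[:max_len]


name = make_display_name("The art of war is of vital importance to the state.")
print(repr(name))
assert len(name) <= 50
```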
This example demonstrates building a traversable graph structure from transcript content.
from cjm_graph_plugin_system.core import SourceRef

# Source reference to the transcription job
source = SourceRef(
    plugin_name="cjm-transcription-plugin-voxtral-hf",
    table_name="transcriptions",
    row_id="b0ceddd3-05a0-40e6-ac99-1903dd3e7170"
)

# Create the document
doc = Document(title="1. Laying Plans", media_type="audio")

# Create segments from transcript sentences
sentences = [
    "Laying Plans",
    "Sun Tzu said,",
    "The art of war is of vital importance to the state.",
    "It is a matter of life and death, a road either to safety or to ruin.",
    "Hence it is a subject of inquiry which can on no account be neglected."
]

segments = [
    Segment(text=text, index=i, role="title" if i == 0 else "content")
    for i, text in enumerate(sentences)
]

# Convert to GraphNodes with provenance
doc_node = doc.to_graph_node(sources=[source])
segment_nodes = [s.to_graph_node(sources=[source]) for s in segments]

print(f"Document: {doc_node.properties['name']}")
print(f"Segments: {len(segment_nodes)}")
for node in segment_nodes:
    print(f" [{node.properties['index']}] {node.properties['name'][:40]}...")
Document: 1. Laying Plans
Segments: 5
[0] Laying Plans...
[1] Sun Tzu said,...
[2] The art of war is of vital importance to...
[3] It is a matter of life and death, a road...
[4] Hence it is a subject of inquiry which c...
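The example above produces the document and segment nodes but does not link them. A minimal sketch of how the NEXT edges of the narrative spine could be derived from segment order follows. It uses plain tuples and a dict rather than the library's edge type, whose API is not shown here; `Edge`, `build_next_edges`, and `walk_spine` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Hypothetical edge representation: (source index, relation, target index)
Edge = Tuple[int, str, int]


def build_next_edges(indices: List[int]) -> List[Edge]:
    """Link each segment index to its successor with a NEXT relation."""
    ordered = sorted(set(indices))  # de-duplicate so no edge points at itself
    return [(a, "NEXT", b) for a, b in zip(ordered, ordered[1:])]


def walk_spine(start: int, edges: List[Edge]) -> List[int]:
    """Follow NEXT edges from start until no successor remains."""
    successor: Dict[int, int] = {a: b for a, rel, b in edges if rel == "NEXT"}
    path = [start]
    while path[-1] in successor:
        path.append(successor[path[-1]])
    return path


# Indices of the five segments created in the example above
edges = build_next_edges([0, 1, 2, 3, 4])
print(edges)
print(walk_spine(0, edges))
```

Five segments yield four NEXT edges, and walking from index 0 visits every segment in reading order.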