Canonical Graph Builder¶

The semantiva.pipeline.graph_builder module normalizes a pipeline definition (from YAML, dictionaries or an existing Pipeline) into a canonical GraphV1 representation. Each node receives a deterministic node_uuid and the entire graph hashes to a PipelineId using SHA-256.

Graph and Identity¶

When Semantiva loads a YAML pipeline, it is converted into a canonical GraphV1. From this graph Semantiva derives stable identifiers:

node_uuid - a positional identifier for each node.
PipelineId - a deterministic identifier for the whole pipeline.

These identities are embedded in trace records and introspection outputs, ensuring that pipelines can be compared, cached, and reproduced across environments.

GraphV1 guarantees:

version: 1 in all canonical specifications
Deterministic node identities: UUIDv5 derived from canonical fields (role, FQCN, params, ports)
Linear chain edges for demo pipelines (ODO will extend topology later without breaking node identities)
Stable PipelineId: “plid-” + sha256(canonical_spec JSON)

Note

To avoid identity churn, derive blocks (e.g., derive.parameter_sweep) are not hashed into the node UUID. Only the resolved processor class and the effective parameter map (after preprocessing/merging) participate. Semantic changes introduced by preprocessors are tracked separately in trace metadata via pipeline_config_id.

Example:

from semantiva.pipeline import build_graph, compute_pipeline_id
graph = build_graph("tests/simple_pipeline.yaml")
pipeline_id = compute_pipeline_id(graph)
print(pipeline_id)

Canonical fields that feed UUIDv5: - role: Node’s semantic role in the pipeline - FQCN: Fully qualified class name of the processor - params: Shallow parameter mapping (sorted keys) - ports: Declared input/output port specifications

See Trace ↔ Graph Alignment for how this graph aligns with tracing.

Canonical Graph Builder¶

Graph and Identity¶

See also¶