Semantic Execution Record (SER) v1
Semantiva records pipeline execution using the Semantic Execution Record (SER) v1. A single SER is emitted for every node that runs and contains:
stable identifiers for the run, pipeline and node under identity
the node’s upstream dependencies via dependencies.upstream
processor details (processor.ref with parameters and their sources)
a minimal context delta describing reads and writes
structured assertions explaining why the node ran and why it was OK
timing information (wall and CPU)
explicit status and optional error details
optional summaries for input/output data and context snapshots
optional tags for downstream correlation
SERs are written to *.ser.jsonl files where each line is a JSON object. Tools
and the Studio viewer consume these records directly.
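As a minimal sketch of consuming these files, the snippet below reads a *.ser.jsonl file line by line using only the standard library. The helper name iter_ser_records and the sample file contents are illustrative assumptions, not part of Semantiva's API.

```python
import json
import tempfile
from pathlib import Path


def iter_ser_records(path):
    """Yield one dict per non-empty line of a *.ser.jsonl file (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # Keep only SER entries; other record types, if any, are skipped.
            if record.get("record_type") == "ser":
                yield record


# Demo with a tiny, made-up trace file.
sample_lines = [
    {"record_type": "ser", "schema_version": 1, "identity": {"node_id": "n-1"}, "status": "succeeded"},
    {"record_type": "ser", "schema_version": 1, "identity": {"node_id": "n-2"}, "status": "succeeded"},
]
with tempfile.TemporaryDirectory() as tmp:
    trace_path = Path(tmp) / "demo.ser.jsonl"
    trace_path.write_text("\n".join(json.dumps(r) for r in sample_lines) + "\n", encoding="utf-8")
    records = list(iter_ser_records(trace_path))

node_ids = [r["identity"]["node_id"] for r in records]
print(node_ids)
```

Because each line is an independent JSON object, a generator like this can stream large traces without loading the whole file into memory.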
Example SER
{
  "record_type": "ser",
  "schema_version": 1,
  "identity": {"run_id": "run-…", "pipeline_id": "plid-…", "node_id": "n-3"},
  "dependencies": {"upstream": ["n-2"]},
  "processor": {
    "ref": "semantiva.examples.test_utils.FloatBasicProbe",
    "parameters": {"context_key": "probed_data"},
    "parameter_sources": {"context_key": "node"}
  },
  "context_delta": {
    "read_keys": ["probed_data"],
    "created_keys": ["probed_data"],
    "updated_keys": [],
    "key_summaries": {
      "probed_data": {"dtype": "FloatDataType", "len": 1}
    }
  },
  "assertions": {
    "trigger": "dependency",
    "upstream_evidence": [{"node_id": "n-2", "state": "succeeded"}],
    "preconditions": [
      {
        "code": "required_keys_present",
        "result": "PASS",
        "details": {"expected": ["probed_data"], "missing": []}
      },
      {
        "code": "input_type_ok",
        "result": "PASS",
        "details": {"expected": "FloatDataType", "actual": "FloatDataType"}
      }
    ],
    "postconditions": [
      {
        "code": "output_type_ok",
        "result": "PASS",
        "details": {"expected": "FloatDataType", "actual": "FloatDataType"}
      },
      {
        "code": "context_writes_realized",
        "result": "PASS",
        "details": {"created_keys": ["probed_data"], "updated_keys": [], "missing_keys": []}
      }
    ],
    "invariants": [],
    "environment": {
      "python": "3.12.0",
      "platform": "Linux-…",
      "semantiva": "0.2.0.dev0",
      "numpy": null,
      "pandas": null
    },
    "redaction_policy": {},
    "args": {"run_space.index": 1, "run_space.combine": "combinatorial"}
  },
  "timing": {"started_at": "…", "finished_at": "…", "wall_ms": 5, "cpu_ms": 4},
  "status": "succeeded",
  "tags": {"node_ref": "semantiva.examples.test_utils.FloatBasicProbe"},
  "summaries": {
    "input_data": {"dtype": "FloatDataType", "sha256": "…"},
    "output_data": {"dtype": "FloatDataType", "sha256": "…"}
  }
}
The assertions block always contains structured evidence describing why the
node ran and why it was considered successful. Additional metadata (like
trigger and upstream_evidence) is included alongside the formal
preconditions/postconditions for convenient consumption.
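One way a consumer might use this block is to list every check that did not pass. The sketch below assumes the layout shown in the example record (preconditions, postconditions and invariants lists whose entries carry code and result); failed_checks is a hypothetical helper name and the record fragment is made up.

```python
def failed_checks(record):
    """Return (channel, code, result) for every assertion entry that is not PASS."""
    assertions = record.get("assertions", {})
    failures = []
    for channel in ("preconditions", "postconditions", "invariants"):
        for entry in assertions.get(channel, []):
            if entry.get("result") != "PASS":
                failures.append((channel, entry.get("code"), entry.get("result")))
    return failures


# Made-up record fragment following the shape of the example record.
record = {
    "assertions": {
        "trigger": "dependency",
        "preconditions": [
            {"code": "required_keys_present", "result": "PASS", "details": {}},
            {"code": "input_type_ok", "result": "FAIL", "details": {}},
        ],
        "postconditions": [{"code": "output_type_ok", "result": "PASS", "details": {}}],
        "invariants": [],
    }
}
problems = failed_checks(record)
print(problems)
```

Metadata keys such as trigger and upstream_evidence are left alone here because they are not lists of checks.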
Processor semantics
When preprocessors modify a processor before execution (for example,
derive.parameter_sweep), the processor object is enriched with optional
fields:
semantic_id: deterministic fingerprint for the preprocessor metadata.
preprocessing_provenance: normalized, versioned provenance detailing variables, expressions, mode, broadcast flag, collection output, and dependencies used to derive parameters.
These additions extend SER while keeping previously documented fields and shapes intact.
Inspection now exposes the same sanitized metadata in the canonical payload
(Inspection Payload & CLI). Raw expr values live only inside the optional
preprocessor_view helper, which is excluded from hashing. Runtime SER still
captures the normalized provenance via processor.preprocessing_provenance
while keeping original expressions for audit trails. See
Introspection & Validation for rendered examples.
Identity facets
Two complementary identifiers appear in trace metadata:
pipeline_id: structural identity derived from the canonical graph.
pipeline_config_id: semantic identity derived from sorted (node_uuid, semantic_id) pairs. Changes to sweep semantics alter this value even when the structural graph is unchanged.
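To illustrate why an identifier built from sorted (node_uuid, semantic_id) pairs is stable under reordering yet sensitive to a semantic change, here is a small sketch. The hashing scheme (SHA-256 over a JSON encoding), the function name config_id_sketch and the sample identifiers are all assumptions for illustration; Semantiva's actual derivation may differ.

```python
import hashlib
import json


def config_id_sketch(pairs):
    """Hash sorted (node_uuid, semantic_id) pairs. Illustrative only, not Semantiva's algorithm."""
    canonical = json.dumps(sorted(pairs), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up identifiers.
base = [("uuid-a", "sem-1"), ("uuid-b", "sem-2")]
reordered = list(reversed(base))
changed_sweep = [("uuid-a", "sem-1"), ("uuid-b", "sem-2-modified")]

same_when_reordered = config_id_sketch(base) == config_id_sketch(reordered)
differs_when_semantics_change = config_id_sketch(base) != config_id_sketch(changed_sweep)
print(same_when_reordered, differs_when_semantics_change)
```

Sorting before hashing removes any dependence on traversal order, so only a change to a node's semantic_id (or to the set of nodes) alters the result.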
Note
Expression signatures are conservative in v1. ExpressionSigV1 only treats
+ and * as commutative/associative; other algebraic rewrites remain
distinct.
Detail flags control which summary fields are emitted when using the JSONL driver:
hash (default) - include sha256 hashes only.
repr - additionally include repr for input/output data.
context - with repr, also include repr for pre/post context.
all - enable all of the above.
Versioning Policy
Note
SER Versioning Policy:
schema_version is a major integer for breaking changes only
v0 during pre-release development; v1 at first public release
Future breaking changes increment to v2, v3, etc.
Optional schema_tag field may be present but is not required by readers
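A reader following this policy could gate on the integer schema_version and treat schema_tag as optional. The sketch below is an assumption about how such a check might look; is_readable, SUPPORTED_MAJOR and the tag value are hypothetical.

```python
SUPPORTED_MAJOR = 1  # this hypothetical reader understands SER v1 only


def is_readable(record):
    """Accept records whose integer schema_version matches; schema_tag is not consulted."""
    version = record.get("schema_version")
    return isinstance(version, int) and version == SUPPORTED_MAJOR


# Made-up records; the schema_tag value is a placeholder.
with_tag = {"record_type": "ser", "schema_version": 1, "schema_tag": "example-tag"}
without_tag = {"record_type": "ser", "schema_version": 1}
future = {"record_type": "ser", "schema_version": 2}

print(is_readable(with_tag), is_readable(without_tag), is_readable(future))
```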
Schema
The canonical JSON Schema ships with the package and can be loaded via:
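import json
from importlib import resources
schema_path = resources.files("semantiva.trace.schema") / "semantic_execution_record_v1.schema.json"
schema = json.loads(schema_path.read_text(encoding="utf-8"))  # parse the packaged schema into a dict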
Context Delta
Each SER includes a context_delta describing how the node interacted with context:
read_keys: declared required keys (if provided by the processor)
created_keys: new keys written by the node
updated_keys: existing keys whose values changed
key_summaries (changed keys only): dtype, len, rows, and optional sha256 (hash flag) and repr (repr flag)
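The classification above can be sketched by comparing two context snapshots. This is an illustration under assumptions, not the runtime's implementation: contexts are modeled as plain dicts, dtype is taken from the Python type name, and context_delta_sketch is a hypothetical helper.

```python
def context_delta_sketch(before, after, read_keys=()):
    """Compare two context snapshots (plain dicts) and classify the changed keys."""
    created = sorted(k for k in after if k not in before)
    updated = sorted(k for k in after if k in before and after[k] != before[k])
    summaries = {}
    for key in created + updated:  # summaries cover changed keys only
        value = after[key]
        summary = {"dtype": type(value).__name__}
        if hasattr(value, "__len__"):
            summary["len"] = len(value)
        summaries[key] = summary
    return {
        "read_keys": list(read_keys),
        "created_keys": created,
        "updated_keys": updated,
        "key_summaries": summaries,
    }


# Made-up snapshots: one key is updated, one is created, one is untouched.
before = {"threshold": 0.5, "labels": ["a"]}
after = {"threshold": 0.7, "labels": ["a"], "probed_data": [1.0]}
delta = context_delta_sketch(before, after, read_keys=["threshold"])
print(delta)
```

The untouched labels key does not appear in key_summaries, which keeps the delta minimal.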
Assertions via SERHooks
The template-method orchestrator collects SER evidence centrally. The base
SemantivaOrchestrator builds the pre/post assertion lists, captures
context_delta snapshots, and pins the runtime environment exactly once per
node. Downstream policy engines can extend these hooks (for example via
_extra_pre_checks) but every SER produced by the runtime includes the
following assertions out of the box, even on error.
When a node fails, the exception entry is followed by the standard
output_type_ok and context_writes_realized checks so failure records
retain the same structure as successful ones.
Built-in assertions
The runtime emits the following assertion entries for every node:
| Code | Channel | Purpose | PASS | WARN / FAIL |
|---|---|---|---|---|
|  |  | Declared context keys are available before execution. | All required keys present. | Missing keys listed in |
|  |  | Input payload matches the processor’s |  | Type mismatch triggers |
|  |  | Node configuration contains no unrecognised parameters. |  |  |
|  |  | Output payload matches the processor’s |  | Type mismatch triggers |
|  |  | Context keys declared in | All declared keys materialised, |  |
Environment pins
assertions.environment captures a reproducibility snapshot: Python runtime,
implementation, platform string, Semantiva version, and optional third-party
versions (numpy/pandas when installed). Values are simple strings or
null and contain no host-specific secrets.
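A snapshot of this kind can be assembled from the standard library alone. The sketch below is an assumption about how such pins might be gathered, not Semantiva's code; the helper names and the implementation key name are hypothetical, while the other keys mirror the example record.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_snapshot():
    """Collect string-or-None pins similar to the environment block in the example."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),  # key name is an assumption
        "platform": platform.platform(),
        "semantiva": package_version("semantiva"),
        "numpy": package_version("numpy"),
        "pandas": package_version("pandas"),
    }


env = environment_snapshot()
print(env)
```

Missing optional packages resolve to None, which serializes to null in JSON.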
Timing
Each SER includes a timing object describing execution durations and
timestamps. Fields:
wall_ms (required): wall-clock duration in milliseconds (>= 0).
cpu_ms (optional): CPU time measured on the reporting host in milliseconds (>= 0). This field may be omitted when running on devices or in distributed executors where CPU attribution is unreliable (for example, GPU-backed processing or remote worker pools).
When present, started_at and finished_at should be ISO 8601 timestamps.
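For illustration, the fields above could be produced by wrapping a call with standard-library timers. This is a sketch under assumptions rather than the runtime's implementation: timed_call is a hypothetical helper, timestamps are taken in UTC, and durations are rounded to whole milliseconds as in the example record.

```python
import time
from datetime import datetime, timezone


def timed_call(func, *args, **kwargs):
    """Run func and return (result, timing dict) with wall and CPU durations."""
    started_at = datetime.now(timezone.utc)
    wall_start = time.perf_counter()   # monotonic wall-clock timer
    cpu_start = time.process_time()    # CPU time of the current process
    result = func(*args, **kwargs)
    cpu_ms = (time.process_time() - cpu_start) * 1000.0
    wall_ms = (time.perf_counter() - wall_start) * 1000.0
    finished_at = datetime.now(timezone.utc)
    timing = {
        "started_at": started_at.isoformat(),    # ISO 8601
        "finished_at": finished_at.isoformat(),  # ISO 8601
        "wall_ms": max(0, round(wall_ms)),
        "cpu_ms": max(0, round(cpu_ms)),
    }
    return result, timing


result, timing = timed_call(sum, range(100_000))
print(result, timing)
```

time.process_time only counts CPU used by the local process, which is consistent with cpu_ms being omitted when work runs on a GPU or a remote worker.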