Step Evidence Record (SER) v0 (draft)
Semantiva records pipeline execution using the Step Evidence Record (SER) v0 (draft). A single SER is emitted for every node that runs and contains:
stable identifiers for the run, pipeline and node
the node’s upstream dependencies
action information (operation reference and parameters)
a minimal input/output delta
check results explaining why the node ran and why it was considered OK
timing information (wall and CPU)
explicit status and optional error details
optional summaries for input/output data and context snapshots
SERs are written to *.ser.jsonl files where each line is a JSON object. Tools and the Studio viewer consume these records directly.
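Because each line is a standalone JSON object, a record file can be read with the standard library alone. The following is a minimal sketch, not part of Semantiva: the helper name `iter_sers` is hypothetical, and filtering on `"type": "ser"` assumes that any other line kinds carry a different `type` value.

```python
import json
from typing import Iterator


def iter_sers(path: str) -> Iterator[dict]:
    """Yield each SER object found in a *.ser.jsonl file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # Assumption: lines that are not SERs carry a different "type".
            if record.get("type") == "ser":
                yield record
```

Reading lazily, one line at a time, keeps memory use flat for long runs.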
Example SER
{
  "type": "ser",
  "schema_version": 0,
  "ids": {"run_id": "run-…", "pipeline_id": "plid-…", "node_id": "n-3"},
  "topology": {"upstream": ["n-2"]},
  "action": {
    "op_ref": "FloatBasicProbe",
    "params": {"context_keyword": "probed_data"},
    "param_source": {"context_keyword": "node"}
  },
  "io_delta": {
    "read": ["probed_data"],
    "created": ["probed_data"],
    "updated": [],
    "summaries": {
      "probed_data": {"dtype": "FloatDataType", "len": 1}
    }
  },
  "checks": {
    "why_run": {
      "trigger": "dependency",
      "upstream_evidence": [{"node_id": "n-2", "state": "completed"}],
      "pre": [
        {
          "code": "required_keys_present",
          "result": "PASS",
          "details": {"expected": ["probed_data"], "missing": []}
        },
        {
          "code": "input_type_ok",
          "result": "PASS",
          "details": {"expected": "FloatDataType", "actual": "FloatDataType"}
        }
      ],
      "policy": []
    },
    "why_ok": {
      "post": [
        {
          "code": "output_type_ok",
          "result": "PASS",
          "details": {"expected": "FloatDataType", "actual": "FloatDataType"}
        },
        {
          "code": "context_writes_realized",
          "result": "PASS",
          "details": {"created": ["probed_data"], "updated": [], "missing": []}
        }
      ],
      "invariants": [],
      "env": {
        "python": "3.11.2",
        "implementation": "cpython",
        "platform": "Linux-…",
        "semantiva": "0.1.0.dev0+dummy",
        "numpy": null,
        "pandas": null
      },
      "redaction": {}
    }
  },
  "timing": {"start": "…", "end": "…", "duration_ms": 5, "cpu_ms": 4},
  "status": "completed",
  "labels": {"node_fqn": "FloatBasicProbe"},
  "summaries": {
    "input_data": {"dtype": "FloatDataType", "sha256": "…"},
    "output_data": {"dtype": "FloatDataType", "sha256": "…"}
  }
}
The checks block now always contains:
why_run.pre: built-in validation executed before the node runs.
why_ok.post: output validations that ran after the node returned.
why_ok.env: minimal, non-sensitive environment pins for reproducibility.
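To make the two check channels concrete, the sketch below flattens `why_run.pre` and `why_ok.post` into one list of `(channel, code, result)` rows. The function name `flatten_checks` is hypothetical; the field names follow the example SER shown earlier.

```python
def flatten_checks(record: dict) -> list:
    """Return (channel, code, result) tuples for the pre and post checks."""
    checks = record.get("checks", {})
    rows = []
    # Each channel lives one level below "checks": why_run.pre and why_ok.post.
    for group, key in (("why_run", "pre"), ("why_ok", "post")):
        for entry in checks.get(group, {}).get(key, []):
            rows.append((f"{group}.{key}", entry["code"], entry["result"]))
    return rows
```

A caller could filter the rows on `result != "PASS"` to list every check that needs attention.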
Detail flags control which summary fields are emitted when using the JSONL driver:
hash (default): include sha256 hashes only.
repr: additionally include repr for input/output data.
context: together with repr, also include repr for pre/post context.
all: enable all of the above.
Versioning Policy
Note
SER Versioning Policy:
schema_version is a major integer, incremented for breaking changes only
v0 during pre-release development; v1 at first public release
Future breaking changes increment to v2, v3, etc.
An optional schema_tag field may be present but is not required by readers
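A reader can apply this policy by gating on the integer `schema_version` and ignoring `schema_tag`. This is a minimal sketch under that reading of the policy; `is_supported` and `SUPPORTED_MAJOR` are hypothetical names, not part of Semantiva.

```python
SUPPORTED_MAJOR = 0  # assumption: this reader targets the v0 draft


def is_supported(record: dict, supported_major: int = SUPPORTED_MAJOR) -> bool:
    """Accept a record only if its major schema version matches."""
    version = record.get("schema_version")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(version, int) or isinstance(version, bool):
        return False
    # schema_tag is optional and deliberately not consulted here.
    return version == supported_major
```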
Schema
The canonical JSON Schema ships with the package and can be loaded via:
import json
from importlib import resources
schema_path = resources.files("semantiva.trace.schema") / "ser_v0.schema.json"
schema = json.loads(schema_path.read_text(encoding="utf-8"))  # parse into a dict
IO Delta
Each SER now includes an io_delta describing how the node interacted with context:
read: declared required keys (if provided by the processor)
created: new keys written by the node
updated: existing keys whose values changed
summaries (changed keys only): dtype, len, rows, and optional sha256 (hash flag) and repr (repr flag)
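One use of `io_delta` is tracing which node wrote each context key over a run. The sketch below builds that map from a list of parsed SERs; `key_writers` is a hypothetical helper, and it relies only on the `ids.node_id`, `created`, and `updated` fields shown in the example record.

```python
from collections import defaultdict


def key_writers(records: list) -> dict:
    """Map each context key to the (node_id, kind) pairs that wrote it."""
    writers = defaultdict(list)
    for record in records:
        node_id = record["ids"]["node_id"]
        delta = record.get("io_delta", {})
        # "read" is skipped on purpose: it lists declared inputs, not writes.
        for kind in ("created", "updated"):
            for key in delta.get(kind, []):
                writers[key].append((node_id, kind))
    return dict(writers)
```

Records are processed in the order given, so each key's list reflects the write order in the input.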
Checks via SERHooks
The template-method orchestrator collects SER evidence centrally. The base SemantivaOrchestrator builds the pre/post check lists, captures io_delta snapshots, and pins the runtime environment exactly once per node. Downstream policy engines can extend these hooks (for example via _extra_pre_checks), but every SER produced by the runtime includes the following checks out of the box, even on error. When a node fails, the exception entry is followed by the standard output_type_ok and context_writes_realized checks so failure records retain the same structure as successful ones.
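The structural guarantee described above can be spot-checked from the consumer side. This sketch reports which of the two standard post checks are absent from a record, whatever its status; `missing_post_checks` is a hypothetical helper, and it makes no assumption about the code used for the exception entry.

```python
# The two post-check codes named in the text above.
STANDARD_POST_CODES = ("output_type_ok", "context_writes_realized")


def missing_post_checks(record: dict) -> list:
    """List standard post-check codes absent from checks.why_ok.post."""
    post = record.get("checks", {}).get("why_ok", {}).get("post", [])
    present = {entry.get("code") for entry in post}
    return [code for code in STANDARD_POST_CODES if code not in present]
```

An empty list means the record, failed or not, carries both standard entries.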
Built-in checks
The runtime emits the following check entries for every node:
| Code | Channel | Purpose | PASS | WARN / FAIL |
|---|---|---|---|---|
| required_keys_present | why_run.pre | Declared context keys are available before execution. | All required keys present. | Missing keys listed in |
| input_type_ok | why_run.pre | Input payload matches the processor’s | | Type mismatch triggers |
| | | Node configuration contains no unrecognised parameters. | | |
| output_type_ok | why_ok.post | Output payload matches the processor’s | | Type mismatch triggers |
| context_writes_realized | why_ok.post | Context keys declared in | All declared keys materialised, | |
Environment pins
checks.why_ok.env captures a reproducibility snapshot: Python runtime, implementation, platform string, Semantiva version, and optional third-party versions (numpy/pandas when installed). Values are simple strings or null and contain no host-specific secrets.
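Since the pins are plain strings or null, two records can be compared directly to spot environment drift between runs. The following is a sketch only; `env_drift` is a hypothetical helper that reads the `checks.why_ok.env` object shown in the example SER.

```python
def env_drift(record_a: dict, record_b: dict) -> dict:
    """Return {pin: (value_a, value_b)} for every pin that differs."""
    env_a = record_a.get("checks", {}).get("why_ok", {}).get("env", {})
    env_b = record_b.get("checks", {}).get("why_ok", {}).get("env", {})
    drift = {}
    # Compare the union of keys so a pin missing on one side still shows up.
    for pin in sorted(set(env_a) | set(env_b)):
        if env_a.get(pin) != env_b.get(pin):
            drift[pin] = (env_a.get(pin), env_b.get(pin))
    return drift
```

A pin that is null in one record and absent in the other compares as equal here, because `dict.get` returns `None` for both.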