Architects & system designers

This page is for people who design systems around Semantiva:

  • You care about how Semantiva fits into a larger architecture: orchestration, storage, observability, lineage.

  • You want a clear picture of execution, inspection and trace flows, not just individual pipelines or components.

  • You may own standards and conventions around pipelines, SER, run spaces and contracts for your organisation.

If you mainly run and tweak existing pipelines, see Pipeline users instead.

If you primarily write components without owning wider system design, see Framework developers & component authors.

New to Semantiva?

If you have never run a Semantiva pipeline before, start with Getting Started, then follow Pipeline users and Framework developers & component authors.

Come back to this page once you are comfortable with pipelines, components and basic inspection.

What you should know already

Before you use this page as your main guide, you should:

  • Have followed Getting Started, Pipeline users and Framework developers & component authors.

  • Be comfortable with:

    • Reading and reviewing pipeline YAMLs for non-trivial workloads.

    • Understanding the responsibilities of DataOperation, DataProbe and ContextProcessor components.

    • Running pytest and semantiva dev lint as part of a development workflow.

  • Have a rough picture that:

    • Semantiva executes graphs of nodes (pipelines) over payloads with data and context.

    • Execution is recorded as Semantic Execution Records (SER) and related trace artefacts.

  • Contracts and SVA rules (such as SVA250) encode architectural invariants in a machine-checkable way.

If any of that is unfamiliar, revisit Pipeline users, Framework developers & component authors and Basic Concepts first.

Your learning path (301+)

Once you are comfortable as a pipeline user and component author, this is the recommended path for architects and system designers.

Step 1 - Get the high-level execution & trace picture

Start with the inspection and trace side of the system.

At this stage, you should focus on:

  • How node-level execution becomes SER JSONL files.

  • How run spaces organise multiple runs and variations.

  • Where, in your architecture, SER and run spaces would be stored and consumed (file systems, object stores, trace viewers, etc.).
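
As a concrete sketch, a SER JSONL file can be consumed with a few lines of standard-library Python. The field names here ("node_id", "status", "duration_ms") are illustrative placeholders, not the real SER schema; only the one-record-per-line layout is the point.

```python
import io
import json

# Hypothetical SER lines: real field names are defined by Semantiva's
# SER schema; the keys below are illustrative placeholders only.
ser_jsonl = io.StringIO(
    '{"node_id": "n1", "status": "ok", "duration_ms": 12}\n'
    '{"node_id": "n2", "status": "ok", "duration_ms": 40}\n'
)

def iter_ser(stream):
    """Yield one record per non-empty line, as stored in a SER JSONL file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

records = list(iter_ser(ser_jsonl))
```

Because each record is a single line, SER files can be tailed, streamed, or shipped to object storage without parsing the whole file.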

Step 2 - Understand the core architecture slices

Next, read the core architecture docs with an “integration” mindset:

  • Pipeline Configuration Schema — how pipelines are represented as graphs; which parts are user-facing vs internal.

  • Context Processing Architecture — how context processing, observers and validators are wired (important for understanding invariants like those enforced by SVA250).

  • Registry System — where processors and other components come from, and how they are discovered.

As an architect, you should be able to answer:

  • Which artefacts are static configuration (pipeline schema, registry entries) vs runtime (SER, traces, run spaces).

  • Where extension points are: new processors, new transports, custom registries.

  • How Semantiva’s own invariants (as documented in contracts and architecture) align with your organisation’s design principles.

Configuration artefacts

In most organisations, YAML pipeline configurations are the governed configuration artefact for Semantiva.
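
As an illustrative sketch only (the keys and processor names below are placeholders; the authoritative structure is defined by the Pipeline Configuration Schema), such a governed artefact might look like:

```yaml
# Illustrative only: names and keys are placeholders; consult the
# Pipeline Configuration Schema for the authoritative structure.
pipeline:
  nodes:
    - processor: SomeDataOperation
      parameters:
        threshold: 0.5
    - processor: SomeDataProbe
```

Versioning this file alongside code gives you a reviewable, diffable record of what the system is configured to do.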

Semantiva also exposes a Python API for constructing pipelines (Pipelines in Python), which is extremely useful for internal testing, simulation and R&D workflows. These Python pipelines should be treated as internal tooling, not as the system-of-record configuration.

Step 3 - Tie SER, trace streams and aggregation together

Once you have the basic slices, go deeper into the trace pipeline.

You do not need to memorise every field, but you should understand:

  • How a single node run flows into SER → trace stream → aggregated view.

  • Which artefacts external tools (e.g. Semantiva Studio Viewer) consume.

  • What guarantees you get about identity and provenance across these stages.
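
The node-run → aggregated-view flow can be sketched in plain Python. The field names are placeholders, not the real SER schema, and the real aggregation is done by Semantiva's trace aggregator; this only shows the shape of the transformation.

```python
from collections import Counter

# Illustrative node-level records; field names are placeholders,
# not the real SER schema.
records = [
    {"run_id": "r1", "node_id": "n1", "status": "ok"},
    {"run_id": "r1", "node_id": "n2", "status": "error"},
]

def summarise_run(records):
    """Collapse node-level records into a run-level view, roughly the
    kind of artefact a trace aggregator derives from a SER stream."""
    statuses = Counter(r["status"] for r in records)
    return {"nodes": len(records), "statuses": dict(statuses)}

summary = summarise_run(records)
```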

Step 4 - Look at contracts as architecture spec

Contracts and SVA rules act as a compact architecture specification:

  • Semantiva Contracts — how contracts are defined and enforced; how semantiva dev lint operationalises them.

  • The embedded catalog (contracts_catalog) — the list of rules, including:

    • Rules around data and context typing.

    • Context key metadata (created/suppressed/injected keys).

    • Signature invariants such as SVA250 (no ContextType in _process_logic).

As an architect you should:

  • Treat these rules as part of the system’s architecture, not just implementation details.

  • Use them to derive organisational conventions (e.g. how components are allowed to see and modify context, how identity is attached to nodes).
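
The spirit of a signature rule like SVA250 can be illustrated with a toy checker. This is not Semantiva's implementation; it only mimics the idea of rejecting a parameter annotated with a forbidden context type, and the real rule is enforced by semantiva dev lint.

```python
import inspect

def violates_no_context_rule(func, forbidden="ContextType"):
    """Toy check in the spirit of SVA250: flag any parameter whose
    annotation names the forbidden context type. The real rule lives
    in Semantiva's contracts and is enforced by `semantiva dev lint`."""
    for param in inspect.signature(func).parameters.values():
        annotation = param.annotation
        name = getattr(annotation, "__name__", str(annotation))
        if forbidden in name:
            return True
    return False

def good_logic(data: list) -> list:
    # Business logic sees only data: passes the toy check.
    return data

def bad_logic(data: list, context: "ContextType") -> list:
    # Takes the context type directly: the toy check flags this.
    return data
```

Rules like this turn a design principle ("business logic must not see the context object") into something a CI gate can verify mechanically.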

Step 5 - Connect Semantiva to your wider system

Finally, map Semantiva onto your environment.

At this point you should be able to sketch:

  • Where Semantiva sits relative to schedulers, workflow engines, data stores and observability tools.

  • Which boundaries you need to protect (e.g. what is allowed to mutate context; where identity is assigned and propagated).

  • How you would onboard new teams into Semantiva across the three personas (pipeline users, component authors, architects).

Architecture overview for integrators

This section summarises the Semantiva architecture from an integration point of view. Use it as a mental map while you read the detailed docs.

Execution core

  • Pipelines are declared as graphs (see Pipeline Configuration Schema):

    • Nodes wrap processors (DataOperation, DataProbe, ContextProcessor).

    • Edges describe data flow and dependencies.

    • Pipelines are static artefacts, typically versioned alongside code.

  • Execution operates on a Payload:

    • Payload.data carries domain data.

    • Payload.context carries metadata and state.

    • Nodes never receive the context object directly in their processors’ business logic; context is mediated via observers and contracts.
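
The data/context split can be pictured with a minimal stand-in class. This is hypothetical; the real Payload is defined by the framework, and this sketch only illustrates the separation of concerns.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical stand-in for the real Payload class, which is defined
# by the framework; this only illustrates the data/context split.
@dataclass
class Payload:
    data: Any                                     # domain data
    context: dict = field(default_factory=dict)   # metadata and state

# Processors' business logic operates on data; context access is
# mediated via observers and contracts, not passed in directly.
p = Payload(data=[1.0, 2.0], context={"run_id": "r1"})
```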

Inspection & trace stack

  • For every node run, Semantiva emits a Semantic Execution Record (SER):

    • Contains identity (run, pipeline, node).

    • Records parameter sources, context deltas, data summaries, timing and status.

    • Stored as JSONL (one SER per line).

  • SERs are then:

    • Turned into trace streams (for streaming analysis).

    • Aggregated by the trace aggregator into higher-level artefacts (e.g. run graphs, summaries).

    • Consumed by tools like Semantiva Studio Viewer to support exploration, debugging and reporting.
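
As a sketch, one SER is one JSON object serialised to exactly one line. The fields below are placeholders; the actual identity, parameter-source, context-delta and timing fields are defined by the SER schema.

```python
import json

# Placeholder SER fields; the actual identity, parameter-source,
# context-delta and timing fields are defined by the SER schema.
ser = {
    "run_id": "r1",
    "pipeline_id": "p1",
    "node_id": "n1",
    "status": "ok",
    "duration_ms": 12,
}

line = json.dumps(ser)     # one SER serialises to exactly one line
restored = json.loads(line)
```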

Contracts & invariants

  • Semantiva’s contracts (SVA rules) express architectural invariants such as:

    • How types are declared on processors.

    • How context keys are introduced, suppressed or injected.

    • How processors may (and may not) relate to context.

  • These rules are enforced by semantiva dev lint and should be treated as:

    • A machine-checkable architecture spec.

    • A key input when you define internal guidelines for Semantiva usage.
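
As a sketch, a CI quality gate can simply run the two checks already named in the development workflow. The job structure below is hypothetical and depends on your CI system; only the two commands come from Semantiva's documented workflow.

```yaml
# Hypothetical CI job; the surrounding structure depends on your CI
# system, but the two commands are the standard development checks.
quality-gate:
  script:
    - pytest
    - semantiva dev lint
```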

Extension & integration points

As an architect, you will often look at extension points such as new processors, new transports and custom registries, and at how external tools consume SER, trace streams and run spaces.

Common tasks and where to look in the docs

This section is a quick router for common architectural tasks and where to start in the documentation.

  • Evaluate whether Semantiva fits a system design.

  • Define organisational conventions for pipelines.

  • Integrate SER & traces with observability tooling.

  • Design extension points and internal libraries.

  • Plan governance, CI and quality gates.