Architects & system designers

This page is for people who design systems around Semantiva:

  • You care about how Semantiva fits into a larger architecture: orchestration, storage, observability, lineage.

  • You want a clear picture of execution, inspection and trace flows, not just individual pipelines or components.

  • You may own standards and conventions around pipelines, SER, run spaces and contracts for your organisation.

If you mainly run and tweak existing pipelines, see Pipeline users instead.

If you primarily write components without owning wider system design, see Framework developers & component authors.

New to Semantiva?

If you have never run a Semantiva pipeline before, start with Getting Started, then follow Pipeline users and Framework developers & component authors.

Come back to this page once you are comfortable with pipelines, components and basic inspection.

What you should know already

Before you use this page as your main guide, you should:

  • Have followed Getting Started, Pipeline users and Framework developers & component authors.

  • Be comfortable with:

    • Reading and reviewing pipeline YAMLs for non-trivial workloads.

    • Understanding the responsibilities of DataOperation, DataProbe and ContextProcessor components.

    • Running pytest and semantiva dev lint as part of a development workflow.

  • Have a rough picture that:

    • Semantiva executes graphs of nodes (pipelines) over payloads with data and context.

    • Execution is recorded as Semantic Execution Records (SER) and related trace artefacts.

  • Contracts and SVA rules (such as SVA250) encode architectural invariants in a machine-checkable way.

If any of that is unfamiliar, revisit Pipeline users, Framework developers & component authors and Basic Concepts first.

Your learning path (301+)

Once you are comfortable as a pipeline user and component author, this is the recommended path for architects and system designers.

Step 1 - Get the high-level execution & trace picture

Start with the inspection and trace side of the system.

At this stage, you should focus on:

  • How node-level execution becomes SER JSONL files.

  • How run spaces organise multiple runs and variations.

  • Where, in your architecture, SER and run spaces would be stored and consumed (file systems, object stores, trace viewers, etc.).
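
As a concrete sketch, a SER JSONL file can be consumed with a few lines of standard-library Python. The field names here ("node_id", "status", "duration_ms") are illustrative placeholders, not the real SER schema; only the one-record-per-line layout is the point.

```python
import io
import json

# Hypothetical SER lines: real field names are defined by Semantiva's
# SER schema; the keys below are illustrative placeholders only.
ser_jsonl = io.StringIO(
    '{"node_id": "n1", "status": "ok", "duration_ms": 12}\n'
    '{"node_id": "n2", "status": "ok", "duration_ms": 40}\n'
)

def iter_ser(stream):
    """Yield one record per non-empty line, as stored in a SER JSONL file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

records = list(iter_ser(ser_jsonl))
```

Because each record is a single line, SER files can be tailed, streamed, or shipped to object storage without parsing the whole file.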

Step 2 - Understand the core architecture slices

Next, read the core architecture docs with an “integration” mindset:

  • Pipeline Configuration Schema — how pipelines are represented as graphs; which parts are user-facing vs internal.

  • Context Processing Architecture — how context processing, observers and validators are wired (important for understanding invariants like those enforced by SVA250).

  • Registry System — where processors and other components come from, and how they are discovered.

As an architect, you should be able to answer:

  • Which artefacts are static configuration (pipeline schema, registry entries) vs runtime (SER, traces, run spaces).

  • Where extension points are: new processors, new transports, custom registries.

  • How Semantiva’s own invariants (as documented in contracts and architecture) align with your organisation’s design principles.

Configuration artefacts

In most organisations, YAML pipeline configurations are the governed configuration artefact for Semantiva.
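
As an illustrative sketch only (the keys and processor names below are placeholders; the authoritative structure is defined by the Pipeline Configuration Schema), such a governed artefact might look like:

```yaml
# Illustrative only: names and keys are placeholders; consult the
# Pipeline Configuration Schema for the authoritative structure.
pipeline:
  nodes:
    - processor: SomeDataOperation
      parameters:
        threshold: 0.5
    - processor: SomeDataProbe
```

Versioning this file alongside code gives you a reviewable, diffable record of what the system is configured to do.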

Semantiva also exposes a Python API for constructing pipelines (Pipelines in Python), which is extremely useful for internal testing, simulation and R&D workflows. These Python pipelines should be treated as internal tooling, not as the system-of-record configuration.

Step 3 - Tie SER, trace streams and aggregation together

Once you have the basic slices, go deeper into the trace pipeline.

You do not need to memorise every field, but you should understand:

  • How a single node run flows into SER → trace stream → aggregated view.

  • Which artefacts external tools (e.g. Semantiva Studio Viewer) consume.

  • What guarantees you get about identity and provenance across these stages.
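
The node-run → aggregated-view flow can be sketched in plain Python. The field names are placeholders, not the real SER schema, and the real aggregation is done by Semantiva's trace aggregator; this only shows the shape of the transformation.

```python
from collections import Counter

# Illustrative node-level records; field names are placeholders,
# not the real SER schema.
records = [
    {"run_id": "r1", "node_id": "n1", "status": "ok"},
    {"run_id": "r1", "node_id": "n2", "status": "error"},
]

def summarise_run(records):
    """Collapse node-level records into a run-level view, roughly the
    kind of artefact a trace aggregator derives from a SER stream."""
    statuses = Counter(r["status"] for r in records)
    return {"nodes": len(records), "statuses": dict(statuses)}

summary = summarise_run(records)
```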

Step 4 - Look at contracts as architecture spec

Contracts and SVA rules act as a compact architecture specification:

  • Semantiva Contracts — how contracts are defined and enforced; how semantiva dev lint operationalises them.

  • The embedded catalog (contracts_catalog) — the list of rules, including:

    • Rules around data and context typing.

    • Context key metadata (created/suppressed/injected keys).

    • Signature invariants such as SVA250 (no ContextType in _process_logic).

As an architect you should:

  • Treat these rules as part of the system’s architecture, not just implementation details.

  • Use them to derive organisational conventions (e.g. how components are allowed to see and modify context, how identity is attached to nodes).
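
The spirit of a signature rule like SVA250 can be illustrated with a toy checker. This is not Semantiva's implementation; it only mimics the idea of rejecting a parameter annotated with a forbidden context type, and the real rule is enforced by semantiva dev lint.

```python
import inspect

def violates_no_context_rule(func, forbidden="ContextType"):
    """Toy check in the spirit of SVA250: flag any parameter whose
    annotation names the forbidden context type. The real rule lives
    in Semantiva's contracts and is enforced by `semantiva dev lint`."""
    for param in inspect.signature(func).parameters.values():
        annotation = param.annotation
        name = getattr(annotation, "__name__", str(annotation))
        if forbidden in name:
            return True
    return False

def good_logic(data: list) -> list:
    # Business logic sees only data: passes the toy check.
    return data

def bad_logic(data: list, context: "ContextType") -> list:
    # Takes the context type directly: the toy check flags this.
    return data
```

Rules like this turn a design principle ("business logic must not see the context object") into something a CI gate can verify mechanically.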

Step 5 - Connect Semantiva to your wider system

Finally, map Semantiva onto your environment.

At this point you should be able to sketch:

  • Where Semantiva sits relative to schedulers, workflow engines, data stores and observability tools.

  • Which boundaries you need to protect (e.g. what is allowed to mutate context; where identity is assigned and propagated).

  • How you would onboard new teams into Semantiva across the three personas (pipeline users, component authors, architects).

Architecture overview for integrators

This section summarises the Semantiva architecture from an integration point of view. Use it as a mental map while you read the detailed docs.

Execution core

  • Pipelines are declared as graphs (see Pipeline Configuration Schema):

    • Nodes wrap processors (DataOperation, DataProbe, ContextProcessor).

    • Edges describe data flow and dependencies.

    • Pipelines are static artefacts, typically versioned alongside code.

  • Execution operates on a Payload:

    • Payload.data carries domain data.

    • Payload.context carries metadata and state.

    • Nodes never receive the context object directly in their processors’ business logic; context is mediated via observers and contracts.
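
The data/context split can be pictured with a minimal stand-in class. This is hypothetical; the real Payload is defined by the framework, and this sketch only illustrates the separation of concerns.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical stand-in for the real Payload class, which is defined
# by the framework; this only illustrates the data/context split.
@dataclass
class Payload:
    data: Any                                     # domain data
    context: dict = field(default_factory=dict)   # metadata and state

# Processors' business logic operates on data; context access is
# mediated via observers and contracts, not passed in directly.
p = Payload(data=[1.0, 2.0], context={"run_id": "r1"})
```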

Inspection & trace stack

  • For every node run, Semantiva emits a Semantic Execution Record (SER):

    • Contains identity (run, pipeline, node).

    • Records parameter sources, context deltas, data summaries, timing and status.

    • Stored as JSONL (one SER per line).

  • SERs are then:

    • Turned into trace streams (for streaming analysis).

    • Aggregated by the trace aggregator into higher-level artefacts (e.g. run graphs, summaries).

    • Consumed by tools like Semantiva Studio Viewer to support exploration, debugging and reporting.
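
As a sketch, one SER is one JSON object serialised to exactly one line. The fields below are placeholders; the actual identity, parameter-source, context-delta and timing fields are defined by the SER schema.

```python
import json

# Placeholder SER fields; the actual identity, parameter-source,
# context-delta and timing fields are defined by the SER schema.
ser = {
    "run_id": "r1",
    "pipeline_id": "p1",
    "node_id": "n1",
    "status": "ok",
    "duration_ms": 12,
}

line = json.dumps(ser)     # one SER serialises to exactly one line
restored = json.loads(line)
```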

Contracts & invariants

  • Semantiva’s contracts (SVA rules) express architectural invariants such as:

    • How types are declared on processors.

    • How context keys are introduced, suppressed or injected.

    • How processors may (and may not) relate to context.

  • These rules are enforced by semantiva dev lint and should be treated as:

    • A machine-checkable architecture spec.

    • A key input when you define internal guidelines for Semantiva usage.
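
As a sketch, a CI quality gate can simply run the two checks already named in the development workflow. The job structure below is hypothetical and depends on your CI system; only the two commands come from Semantiva's documented workflow.

```yaml
# Hypothetical CI job; the surrounding structure depends on your CI
# system, but the two commands are the standard development checks.
quality-gate:
  script:
    - pytest
    - semantiva dev lint
```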

Extension & integration points

As an architect, you will often look at extension points such as new processors, new transports and custom registries, and at how external tools consume SER, trace streams and run spaces.

Common tasks and where to look in the docs

This section is a quick router for common architectural tasks and where to start in the documentation.

  • Evaluate whether Semantiva fits a system design.

  • Define organisational conventions for pipelines.

  • Integrate SER & traces with observability tooling.

  • Design extension points and internal libraries.

  • Plan governance, CI and quality gates.