Collection-based modifiers¶
Semantiva provides two families of helpers for working with data collections:
slicers, which apply a processor element-wise to an existing collection, and
derive-based parameter sweeps, which build collections by repeatedly invoking a processor with different parameter values.
These helpers are optional. Processors can always be implemented directly for collection types, but slicers and sweeps remove boilerplate loops and make the structure of collection processing visible in inspection and trace records.
For an overview of collection types themselves, see Data collections.
Slicers¶
Slicers are generated by the factory in
semantiva.data_processors.data_slicer_factory. Given a data processor
that works on a single element, the factory
creates a new processor class that operates on a
DataCollectionType by iterating over its elements.
The factory supports both data operations and probes:
DataOperation-based slicers
DataProbe-based slicers
DataOperation slicers¶
For data operations, the slicer:
consumes a collection of elements,
applies the wrapped operation to each element in sequence, and
returns a new collection of the same collection type with the processed elements.
The underlying processor must have matching input and output element types; the
slicer preserves the collection type and ordering. Under the hood this is
implemented by dynamically subclassing the original operation and overriding
input_data_type / output_data_type to point to the collection type.
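The dynamic-subclassing idea described above can be sketched in plain Python. This is only an illustration: `FloatCollection`, `AddOne`, and `make_slicer` are hypothetical stand-ins, not the names or signatures used by semantiva.data_processors.data_slicer_factory.

```python
class FloatCollection(list):
    """Hypothetical stand-in for a collection type holding floats."""


class AddOne:
    """Hypothetical element-wise operation: float in, float out."""

    input_data_type = float
    output_data_type = float

    def process(self, data):
        return data + 1.0


def make_slicer(operation_cls, collection_cls):
    """Build a subclass of operation_cls that maps process() over a collection."""

    def process(self, data):
        # Call the original element-wise process() for each element, in order,
        # and rebuild a collection of the same type.
        element_process = super(slicer_cls, self).process
        return collection_cls(element_process(item) for item in data)

    slicer_cls = type(
        f"{operation_cls.__name__}Slicer",
        (operation_cls,),
        {
            "process": process,
            # Both type attributes now point at the collection type.
            "input_data_type": collection_cls,
            "output_data_type": collection_cls,
        },
    )
    return slicer_cls


AddOneSlicer = make_slicer(AddOne, FloatCollection)
result = AddOneSlicer().process(FloatCollection([1.0, 2.0]))
```

In this sketch `result` is again a `FloatCollection`, with elements in the original order.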
DataProbe slicers¶
For probes, the slicer:
consumes a collection of elements,
runs the wrapped probe on each element, and
returns a list of probe results.
The collection itself flows through as the data channel (for downstream
processors), while probe results are usually written into context via
context_key when the probe is used in a node.
In both cases, the slicer keeps the element-wise pattern explicit. Inspection and trace records reflect that the collection was processed by a single slicer processor rather than by hand-written loops.
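The probe behaviour can be sketched in the same spirit. The names below (`MeanProbe`, `run_sliced_probe`) and the plain dict used as a context are assumptions made for illustration; they are not Semantiva's API.

```python
class MeanProbe:
    """Hypothetical probe: inspects one element and returns a summary value."""

    def process(self, data):
        return sum(data) / len(data)


def run_sliced_probe(probe, collection, context, context_key):
    """Run the probe on each element and store the list of results in context."""
    results = [probe.process(item) for item in collection]
    context[context_key] = results
    # The collection itself is returned unchanged for downstream processors.
    return collection, context


collection = [[1.0, 3.0], [2.0, 4.0]]
data_out, context = run_sliced_probe(MeanProbe(), collection, {}, "means")
```

Here the data channel still carries the untouched collection, while the per-element results sit in the context under the chosen key.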
Derive-based parameter sweeps¶
Parameter sweeps are derive preprocessors that compute processor parameters from variables and, when variables enumerate more than one value, execute the processor multiple times to build a collection or a list of results.
They are configured under the reserved derive key on a node using the
parameter_sweep tool.
Objective¶
Derive-based parameter sweeps:
compute call-time parameters from variable specifications,
optionally expand a node into a collection-producing processor when variables take multiple values, and
publish the materialised variable values into context.
Basic shape¶
Under a node, the reserved preprocessor boundary derive hosts named
preprocessors. The parameter_sweep preprocessor computes parameters from
variables and, for data sources and operations, declares the collection type
produced:
pipeline:
  nodes:
    - processor: FloatValueDataSource
      derive:
        parameter_sweep:
          parameters:
            value: 2.0 * t
          variables:
            t: { lo: -1.0, hi: 2.0, steps: 3 }
          mode: combinatorial
          broadcast: false
          collection: FloatDataCollection
What it does¶
Computes the parameter value from an expression using variable t.
Expands into a collection typed by collection (DataSource/DataOperation).
Publishes t_values in the context.
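To make the three effects concrete, here is a minimal sketch of what such a sweep might compute for the configuration above. It assumes that a range with `steps: 3` yields evenly spaced values including both endpoints, and it uses Python's `eval` purely as a stand-in for whatever expression evaluator Semantiva actually uses. `linear_range` and `sweep_source` are hypothetical helpers.

```python
def linear_range(lo, hi, steps):
    """Evenly spaced values from lo to hi, endpoints included (an assumption)."""
    if steps == 1:
        return [lo]
    step = (hi - lo) / (steps - 1)
    return [lo + i * step for i in range(steps)]


def sweep_source(expression, var_name, values):
    """Evaluate the parameter expression once per variable value."""
    results = []
    for v in values:
        # eval with empty builtins is for illustration only; a real evaluator
        # would parse and validate the expression.
        results.append(eval(expression, {"__builtins__": {}}, {var_name: v}))
    return results


t_values = linear_range(-1.0, 2.0, 3)
# A plain list stands in for FloatDataCollection here.
collection = sweep_source("2.0 * t", "t", t_values)
# The materialised variable values are published alongside the data.
context = {"t_values": t_values}
```

Under those assumptions, `t` takes the values -1.0, 0.5, and 2.0, so the source would be invoked with `value` equal to -2.0, 1.0, and 4.0.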
Supported kinds¶
Sweeps can wrap three kinds of processors:
DataSource → generates a collection via repeated get_data(...).
DataOperation → augmentation-style expansion via repeated process(data, ...) on the same input.
DataProbe → returns a list of probe results; probe nodes persist via a node-level context_key and pass through their input data.
For DataSource and DataOperation sweeps a collection output type is
required. For DataProbe sweeps collection is forbidden; probes always
return a list.
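The rule on the collection key can be expressed as a small check. The function below is a hypothetical sketch of that rule only; the kind strings and error messages are assumptions, not Semantiva's actual validation code.

```python
def check_collection(kind, collection):
    """Enforce the collection rule for a sweep wrapping the given processor kind."""
    if kind in ("DataSource", "DataOperation"):
        if collection is None:
            raise ValueError(f"{kind} sweeps require a 'collection' type")
    elif kind == "DataProbe":
        if collection is not None:
            raise ValueError("DataProbe sweeps must not declare 'collection'")
    else:
        raise ValueError(f"unsupported processor kind: {kind!r}")


check_collection("DataSource", "FloatDataCollection")  # accepted
check_collection("DataProbe", None)                    # accepted
```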
Configuration reference¶
Inside derive.parameter_sweep the following keys are recognised:
parameters (mapping; required): expressions that compute call-time arguments. Keys must match the wrapped processor’s parameter names.
variables (mapping; required): variable definitions used by the expressions:
    Range: { lo: <float>, hi: <float>, steps: <int> [, scale: linear|log] }
    Sequence: [v1, v2, ...]
    FromContext: { from_context: <key> } (must yield a non-empty sequence)
collection (string; required for DataSource/DataOperation, forbidden for DataProbe): collection type name.
mode: combinatorial (default) or by_position.
broadcast: boolean (default false).
Modes and validation¶
combinatorial: Cartesian product across variables.
by_position: zip-style alignment; an error is raised if variable sequences have different lengths.
DataProbe sweeps must not declare collection.
Unknown parameter names in parameters produce a clear error describing the wrapped processor’s signature.
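The difference between the two modes can be illustrated with the standard library. `expand` is a hypothetical helper; the order in which a real sweep enumerates combinations is not specified here and is an assumption of this sketch.

```python
from itertools import product


def expand(variables, mode="combinatorial"):
    """Turn {name: [values...]} into a list of {name: value} bindings."""
    names = list(variables)
    columns = [variables[name] for name in names]
    if mode == "combinatorial":
        # Cartesian product: every value of each variable with every other.
        rows = product(*columns)
    elif mode == "by_position":
        # Zip-style alignment: all sequences must have the same length.
        if len({len(column) for column in columns}) > 1:
            raise ValueError("by_position requires equal-length variables")
        rows = zip(*columns)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [dict(zip(names, row)) for row in rows]


grid = expand({"a": [1, 2], "b": [10, 20]})                     # 4 bindings
paired = expand({"a": [1, 2], "b": [10, 20]}, "by_position")    # 2 bindings
```

With two variables of two values each, the combinatorial mode yields four bindings and by_position yields two.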
Examples¶
DataSource sweep¶
- processor: FloatValueDataSource
  derive:
    parameter_sweep:
      parameters:
        value: 2.0 * t
      variables:
        t: { lo: -1.0, hi: 2.0, steps: 3 }
      collection: FloatDataCollection
DataOperation sweep (augmentation)¶
- processor: FloatMultiplyOperation
  derive:
    parameter_sweep:
      parameters:
        factor: f
      variables:
        f: { lo: 1.0, hi: 3.0, steps: 3 }
      mode: by_position
      collection: FloatDataCollection
DataProbe sweep¶
- processor: FloatCollectValueProbe
  derive:
    parameter_sweep:
      parameters: {}
      variables:
        n: { lo: 1, hi: 3, steps: 3 }
  context_key: probe_values
FromContext variables¶
The FromContext variable specification enables sweeps over sequences that
are discovered or computed earlier in the pipeline. This is useful when sweep
values depend on runtime conditions or previous processing results.
- processor: FloatValueDataSource
  derive:
    parameter_sweep:
      parameters:
        value: float(input_value)
      variables:
        input_value: { from_context: discovered_values }
      collection: FloatDataCollection
Requirements:
The context key must exist at runtime and contain a non-empty, non-string sequence.
The sweep processor exposes the context key via get_context_requirements() for inspection.
A {var}_values context entry is created (for example input_value_values) containing the materialised sequence for downstream use.
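The requirements above amount to a small resolution step, sketched below with a plain dict standing in for the context. `resolve_from_context` and its error types are hypothetical; the actual checks and messages in Semantiva may differ.

```python
from collections.abc import Sequence


def resolve_from_context(context, var_name, key):
    """Read a sweep variable's values from context and publish {var}_values."""
    if key not in context:
        raise KeyError(f"context key {key!r} is required by the sweep")
    values = context[key]
    # Strings are sequences in Python, so they are rejected explicitly.
    if isinstance(values, (str, bytes)) or not isinstance(values, Sequence):
        raise TypeError(f"context key {key!r} must hold a non-string sequence")
    if len(values) == 0:
        raise ValueError(f"context key {key!r} must not be empty")
    materialised = list(values)
    context[f"{var_name}_values"] = materialised
    return materialised


context = {"discovered_values": [0.5, 1.5]}
values = resolve_from_context(context, "input_value", "discovered_values")
```

After the call, the context in this sketch also holds `input_value_values`, mirroring the entry described above.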
Inspection and provenance¶
Inspection surfaces which parameters were computed, provided,
defaulted, or remain required_external_parameters. For nodes using
derive.parameter_sweep, inspection also includes derived_summary and
preprocessor_metadata attributes. See Introspection & Validation for
complete inspection details.
In the Semantic Execution Record (SER) trace format, parameter sweeps expose both the concrete parameter values and their origin (node config, context, or processor defaults). See Semantic Execution Record (SER) v1 and SER v1 JSON Schema for the full schema.
See also¶
Data collections for the underlying collection types.
Data Operations and Data Probes for processor contracts.
Run Space (v1): blocks that expand context for run-space expansion.