Sweeps in YAML
Objective
This page documents the Parametric Sweep feature. The objective is to show how to declare and run parameter sweeps that produce typed collections by iterating over independent variables, how to write safe parametric expressions that compute parameters for the element DataSource, and how to use context values as inputs to a sweep.
Overview
A sweep is a DataSource that constructs a collection (for example, an image stack or a list of numeric values) by repeatedly invoking an underlying element DataSource with different parameters. Sweeps help you generate systematic variations (grids, series, or element-wise combinations) without writing bespoke code.
Key concepts and parameters
- vars (required): A mapping of independent variable names to one of the variable specification types documented below. Each variable becomes an input to parametric expressions and (optionally) is forwarded to the element DataSource as an independent parameter.
- parametric_expressions (optional but common): A mapping from element parameter names to small expressions (strings). Each expression is evaluated for every sweep step and its result is passed to the element DataSource under the parameter name. Expressions use a safe AST-based evaluator (see "Notes on expressions" below). Example:
    parametric_expressions:
      value: "50 + 20 * t"
  This evaluates 50 + 20 * t for each value of t and sets the element parameter value accordingly.
- static_params (optional): A mapping of parameters with fixed values passed unchanged to every element invocation.
- include_independent (optional boolean): When true, the independent variables from vars are forwarded to the element DataSource as parameters (in addition to any parametric expression outputs).
- mode: product (default) computes the Cartesian product of the variable sequences; zip pairs values element-wise (sequences must have equal lengths unless broadcast is true).
- broadcast (boolean, only for zip mode): If true, shorter sequences are repeated to match the longest sequence length.
Variable specification types
- RangeSpec (YAML shorthand): Numeric range generation. YAML form:
    t: { lo: 0.0, hi: 1.0, steps: 5 }
  Produces a numeric sequence of steps values between lo and hi. Options: scale (linear or log); endpoint (includes the upper bound if true).
- SequenceSpec (YAML shorthand): Explicit sequence. YAML form:
    file: ["a.csv", "b.csv"]
  Use when you have an explicit list of values.
- FromContext (YAML shorthand): Read a sequence from the pipeline context. YAML form:
    sha: { from_context: commit_shas }
  Use when an earlier processor placed a sequence into the pipeline context. The factory validates at runtime that the context entry is a non-empty, non-string sequence, and exposes the required context key to the inspection system.
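The runtime check described for FromContext can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the factory's actual code: the helper name read_sequence_from_context and the use of a plain dict as the pipeline context are hypothetical.

```python
from collections.abc import Sequence


def read_sequence_from_context(context, key):
    """Hypothetical helper: fetch a sweep sequence from a dict-like context."""
    if key not in context:
        raise KeyError(f"required context key {key!r} is missing")
    value = context[key]
    # Strings are sequences too, but sweeping over their characters is
    # almost never intended, so they are rejected explicitly.
    if isinstance(value, (str, bytes)) or not isinstance(value, Sequence):
        raise TypeError(f"context key {key!r} must hold a non-string sequence")
    if len(value) == 0:
        raise ValueError(f"context key {key!r} holds an empty sequence")
    return list(value)


context = {"commit_shas": ["abc123", "def456"]}
print(read_sequence_from_context(context, "commit_shas"))  # ['abc123', 'def456']
```

Rejecting strings up front avoids a silent sweep over individual characters when a single value was stored by mistake.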
Modes
- product: Cartesian product of sequences. If you have t=[a,b] and p=[1,2] you’ll get four combinations: (a,1), (a,2), (b,1), (b,2).
- zip: Element-wise pairing. With zip and broadcast=false the sequences must have equal length, and each step uses the corresponding elements. With broadcast=true shorter sequences are repeated.
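The two modes can be sketched in plain Python. The function sweep_steps below is a hypothetical helper written for illustration, not part of the library; it assumes that "repeated" means cycling a shorter sequence from its start until it reaches the longest length.

```python
from itertools import product


def sweep_steps(variables, mode="product", broadcast=False):
    """Hypothetical helper: return one {name: value} dict per sweep step."""
    names = list(variables)
    sequences = [list(variables[name]) for name in names]
    if not sequences or any(len(seq) == 0 for seq in sequences):
        raise ValueError("every variable needs a non-empty sequence")
    if mode == "product":
        # Cartesian product: the first variable varies slowest.
        combos = product(*sequences)
    elif mode == "zip":
        lengths = {len(seq) for seq in sequences}
        if len(lengths) > 1 and not broadcast:
            raise ValueError("zip mode needs equal lengths unless broadcast is true")
        longest = max(lengths)
        # Cycle shorter sequences so that every sequence reaches `longest`.
        combos = zip(*[[seq[i % len(seq)] for i in range(longest)]
                       for seq in sequences])
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [dict(zip(names, combo)) for combo in combos]


for step in sweep_steps({"t": ["a", "b"], "p": [1, 2]}):
    print(step)
# {'t': 'a', 'p': 1}
# {'t': 'a', 'p': 2}
# {'t': 'b', 'p': 1}
# {'t': 'b', 'p': 2}
```

Calling sweep_steps with mode="zip" and broadcast=True on sequences of length 2 and 3 yields three steps, with the shorter sequence wrapping around on the third.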
Basic example
pipeline:
  nodes:
    - processor: "sweep:FloatValueDataSource:FloatDataCollection"
      parameters:
        vars:
          t: {lo: 0.0, hi: 1.0, steps: 5}
        parametric_expressions:
          value: "t * 2"
        include_independent: true
Explanation
With steps: 5 the range t becomes the sequence [0.0, 0.25, 0.5, 0.75, 1.0] (linear spacing). The expression "t * 2" is evaluated for each t, producing [0.0, 0.5, 1.0, 1.5, 2.0], which are passed as the value parameter to the element DataSource. Because include_independent is true here, the element also receives t as a parameter along with value.
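The range expansion in this example can be reproduced with a small Python sketch. The function expand_range is a hypothetical stand-in for whatever RangeSpec does internally; in particular, the endpoint=False behaviour (upper bound excluded, as in numpy.linspace) and the log scale (even spacing in log10 between positive bounds) are assumptions, not documented behaviour.

```python
import math


def expand_range(lo, hi, steps, scale="linear", endpoint=True):
    """Hypothetical helper: return `steps` values running from lo towards hi."""
    if steps < 1:
        raise ValueError("steps must be >= 1")
    if steps == 1:
        return [float(lo)]
    # With endpoint=True the last value equals hi; otherwise hi is excluded.
    divisor = steps - 1 if endpoint else steps
    if scale == "linear":
        return [lo + (hi - lo) * i / divisor for i in range(steps)]
    if scale == "log":
        if lo <= 0 or hi <= 0:
            raise ValueError("log scale requires positive bounds")
        log_lo, log_hi = math.log10(lo), math.log10(hi)
        return [10 ** (log_lo + (log_hi - log_lo) * i / divisor)
                for i in range(steps)]
    raise ValueError(f"unknown scale: {scale!r}")


t_values = expand_range(0.0, 1.0, 5)
print(t_values)                    # [0.0, 0.25, 0.5, 0.75, 1.0]
print([t * 2 for t in t_values])   # [0.0, 0.5, 1.0, 1.5, 2.0]
```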
From context & zip mode
pipeline:
  nodes:
    - processor: "sweep:AnalysisDataSource:AnalysisCollection"
      parameters:
        vars:
          file: {from_context: discovered_files}
          p: {from_context: parameter_sets}
        parametric_expressions:
          name: "f'{file}_{p}'"
        mode: "zip"
        broadcast: true
Explanation
Suppose the pipeline context contains discovered_files: ["a.tif", "b.tif"] and parameter_sets: ["A", "B", "C"]. With mode: zip and broadcast: true the shorter sequence (discovered_files) is repeated to match the longer sequence, producing the steps:
step 0: file="a.tif", p="A" -> name="a.tif_A"
step 1: file="b.tif", p="B" -> name="b.tif_B"
step 2: file="a.tif", p="C" -> name="a.tif_C"
Each name is produced by evaluating the parametric expression f'{file}_{p}' for that step and is passed to the element DataSource.
Inspection
- Sweep sources require any FromContext keys they reference; these required keys appear in semantiva inspect.
- Sweep sources create {var}_values context keys for downstream processors.
API reference (short)
The factory exposes three small helper types you may use in YAML or the programmatic API:
- RangeSpec(lo, hi, steps, scale='linear', endpoint=True) - produce a numeric range.
- SequenceSpec([...]) - provide an explicit sequence of values.
- FromContext('key') - read a sequence from the pipeline context.
When using the programmatic API, call:
ParametricSweepFactory.create(
    element=MyElementDataSource,
    collection_output=MyCollectionType,
    vars={'t': RangeSpec(0, 1, steps=5), 'file': SequenceSpec([...])},
    parametric_expressions={'x': '50 + 20 * t', 'name': "'img_' + str(t)"},
    mode='product',            # or 'zip'
    include_independent=True,  # or False
)
Notes on expressions
- Expressions are parsed using a safe AST-based evaluator. This means:
  - No arbitrary eval or execution of imports.
  - Only simple function calls are allowed (abs, min, max, round, and the type conversions float/int/str/bool).
  - Tuples are supported, which makes multi-valued parameters possible; e.g. "(50 + 20 * t, 20)" returns a 2-tuple for a multi-valued parameter.
  - Unknown variables or disallowed syntax raise clear errors at compile time.
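To make the idea of a safe AST-based evaluator concrete, here is a minimal sketch built on Python's standard ast module. It is not the library's implementation: compile_expression and the helper names are hypothetical, the operator whitelist is an assumption based on the examples on this page, and f-strings are deliberately left out to keep the sketch short.

```python
import ast
import operator

# Whitelists of operators and functions (assumed from this page's examples).
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv,
            ast.Pow: operator.pow}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCTIONS = {f.__name__: f
              for f in (abs, min, max, round, float, int, str, bool)}


def _validate(node, names):
    """Reject unknown variables and any syntax outside the whitelist."""
    if isinstance(node, ast.Constant):
        return
    if isinstance(node, ast.Name):
        if node.id not in names:
            raise NameError(f"unknown variable: {node.id!r}")
        return
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        children = [node.left, node.right]
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        children = [node.operand]
    elif isinstance(node, ast.Tuple):
        children = node.elts
    elif (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
          and node.func.id in _FUNCTIONS and not node.keywords):
        children = node.args
    else:
        raise ValueError(f"disallowed syntax: {type(node).__name__}")
    for child in children:
        _validate(child, names)


def _evaluate(node, variables):
    """Walk an already validated tree and compute its value."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.BinOp):
        return _BIN_OPS[type(node.op)](_evaluate(node.left, variables),
                                       _evaluate(node.right, variables))
    if isinstance(node, ast.UnaryOp):
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand, variables))
    if isinstance(node, ast.Tuple):
        return tuple(_evaluate(item, variables) for item in node.elts)
    # Only whitelisted Call nodes can reach this point after validation.
    return _FUNCTIONS[node.func.id](*[_evaluate(arg, variables)
                                      for arg in node.args])


def compile_expression(source, allowed_names):
    """Validate once, up front; return a callable taking variables as kwargs."""
    tree = ast.parse(source, mode="eval")
    _validate(tree.body, set(allowed_names))
    return lambda **variables: _evaluate(tree.body, variables)


std_dev = compile_expression("(50 + 20 * t, 20)", ["t"])
print(std_dev(t=-1))  # (30, 20)
```

Validating the whole tree before any evaluation is what lets errors surface when the expression is compiled rather than partway through a sweep. An expression such as "__import__('os')" is rejected because __import__ is not in the function whitelist.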
Good practices
- Prefer explicit SequenceSpec or RangeSpec in YAML for readability.
- Use FromContext when sweep values are produced earlier in the same pipeline. The inspection output will list these required context keys.
- Use mode: zip when you want element-wise pairing. If sequences have differing lengths and you still want element-wise operation, set broadcast: true to repeat shorter sequences.
Examples
Tuple output (multi-valued parameter):
pipeline:
  nodes:
    - processor: "sweep:TwoDGaussianSingleChannelImageGenerator:SingleChannelImageStack"
      parameters:
        vars:
          t: {lo: -1, hi: 2, steps: 3}
        parametric_expressions:
          x_0: "50 + 5 * t"
          y_0: "50 + 5 * t + 5 * t ** 2"
          std_dev: "(50 + 20 * t, 20)"  # tuple -> (std_dev_x, std_dev_y)
          amplitude: "100"
          angle: "60 + 5 * t"
Explanation
The expression "(50 + 20 * t, 20)" evaluates to a tuple for each t. For example, if t takes the values [-1, 0, 1] then std_dev expands to the tuples [(30,20), (50,20), (70,20)]. (The range declared above, lo: -1, hi: 2, steps: 3, gives t = -1, 0.5, 2 under the default linear spacing with the endpoint included, and therefore the tuples (30,20), (60,20), (90,20).) The factory forwards each evaluated std_dev value under that parameter name, so the element DataSource must accept whatever parameter names you use. If your element understands separate std_dev_x and std_dev_y, you can either emit those as separate expressions or unpack the tuple in the element implementation.
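The tuple arithmetic and the unpacking option mentioned above can be checked by hand in a few lines of Python. The helper unpack_std_dev is hypothetical and only illustrates what an element implementation might do with the forwarded tuple.

```python
# Evaluate the tuple expression by hand for the illustrative values of t.
t_values = [-1, 0, 1]
std_devs = [(50 + 20 * t, 20) for t in t_values]
print(std_devs)  # [(30, 20), (50, 20), (70, 20)]


def unpack_std_dev(std_dev):
    """Hypothetical element-side helper: split the tuple into named parts."""
    std_dev_x, std_dev_y = std_dev
    return {"std_dev_x": std_dev_x, "std_dev_y": std_dev_y}


print(unpack_std_dev(std_devs[0]))  # {'std_dev_x': 30, 'std_dev_y': 20}
```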
FromContext example with type conversion in expressions:
pipeline:
  nodes:
    - processor: "sweep:FloatValueDataSource:FloatDataCollection"
      parameters:
        vars:
          input_value: { from_context: discovered_values }
        parametric_expressions:
          value: "float(input_value)"
Explanation
If the context key discovered_values contains strings like ["1.5", "2.75"], the expression "float(input_value)" converts each string to a floating point value, resulting in [1.5, 2.75], which are then passed to the element DataSource as the value parameter.
Expressions
- Expressions use a safe evaluator (no eval).
- Allowed: declared variable names, operators (+, -, *, /, **), tuples, and functions: abs, min, max, round, float, int, str, bool.
- Clear errors for unknown variables or disallowed syntax.
- Tuple expressions like "(x + 1, y * 2)" are supported for multi-value parameters.
- Type conversion functions like "float(input_value)" are supported for data type conversion.