Testing Strategies and Best Practices
This document outlines recommended testing approaches for Semantiva components, with particular emphasis on avoiding internal API usage and following framework patterns.
Overview
Semantiva provides multiple approaches for testing components. The choice of testing strategy depends on the component type and testing goals:
Pipeline-based testing (Recommended) - Tests components through pipeline execution
Node-based testing (Limited use) - Direct processor testing through the framework's node infrastructure
Unit testing (Limited scope) - Testing isolated utility functions
Warning
Avoid directly instantiating internal APIs such as _ContextObserver or _ValidatingContextObserver in tests. These are framework internals, and their usage patterns may change.
Pipeline-Based Testing (Recommended)
Pipeline-based testing is the preferred approach for most scenarios as it:
Tests real usage patterns that users will experience
Exercises the full component lifecycle including validation and parameter resolution
Provides clear expectations about results in the final payload context
Remains stable as internal implementation details change
Example: Testing Context Processors
def test_processor_through_pipeline():
    """Test processor behavior through pipeline execution."""
    # test_data and expected_result are placeholders supplied by your test
    config = {
        "extensions": ["your_extension"],
        "pipeline": {
            "nodes": [
                {
                    "processor": "YourProcessor",
                    "parameters": {
                        "param1": "value1",
                        "param2": "value2"
                    }
                }
            ]
        }
    }
    pipeline = Pipeline.from_dict(config)
    initial_payload = Payload(data=test_data, context=ContextType())
    result_payload = pipeline.process(initial_payload)

    # Check results in the final context
    actual_value = result_payload.context.get_value("your.key")
    assert actual_value == expected_result
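When several tests exercise the same processor with different parameters, the configuration dictionary shown above can be produced by a small helper. The following is a minimal sketch: `single_node_config` is a hypothetical test helper, not part of Semantiva, and it assumes only the dictionary layout used in the example above.

```python
from typing import Any, Dict, List, Optional


def single_node_config(
    processor: str,
    parameters: Optional[Dict[str, Any]] = None,
    extensions: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a one-node pipeline config (hypothetical helper, not a Semantiva API)."""
    return {
        "extensions": list(extensions or []),
        "pipeline": {
            "nodes": [
                {
                    "processor": processor,
                    # Shallow copy so the caller's dict is not shared between tests
                    "parameters": dict(parameters or {}),
                }
            ]
        },
    }


config = single_node_config(
    "YourProcessor",
    parameters={"param1": "value1", "param2": "value2"},
    extensions=["your_extension"],
)
```

The resulting dictionary can be passed to Pipeline.from_dict(config) exactly as in the example above, which keeps each test focused on the parameters that differ.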
Example: YAML-Based Configuration Testing
def test_yaml_configuration():
    """Test component using YAML configuration (closest to real usage)."""
    import yaml

    # PyYAML only recognizes exponent notation as a float when the number
    # contains a decimal point, so write 1.0e-6 (1e-6 would load as a string).
    yaml_config = '''
    extensions: ["your_extension"]
    pipeline:
      nodes:
        - processor: YourProcessor
          parameters:
            strategy: "local"
            options:
              max_iterations: 100
              tolerance: 1.0e-6
    '''
    config = yaml.safe_load(yaml_config)
    pipeline = Pipeline.from_dict(config)
    result = pipeline.process(Payload(data=None, context=ContextType()))

    # Verify results
    result_value = result.context.get_value("processor.result")
    assert result_value is not None
Testing Complex Processors
For processors that perform complex operations, such as optimization or model fitting, pipeline-based tests provide the most realistic validation:
def test_complex_processor_integration():
    """Test complex processor with full parameter resolution."""
    config = {
        "extensions": ["semantiva_optimize"],
        "pipeline": {
            "nodes": [
                {
                    "processor": "OptimizerContextProcessor",
                    "parameters": {
                        "strategy": "local",
                        "x0": [0.5],
                        "bounds": [[-10, 10]],
                        "model": YourModel(),
                        "termination": {
                            "max_evals": 200,
                            "ftol_abs": 1e-12
                        }
                    }
                }
            ]
        }
    }
    pipeline = Pipeline.from_dict(config)
    result = pipeline.process(Payload(data=None, context=ContextType()))

    # Check optimization results
    best = result.context.get_value("optimizer.best_candidate")
    assert "x" in best and "value" in best
    history = result.context.get_value("optimizer.history")
    assert len(history) > 0
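The assertions above only confirm that the expected keys exist. A slightly stricter check can be factored into a helper, sketched below. It assumes, beyond what the example shows, that `best["x"]` is a sequence of numbers with one entry per bound and that `best["value"]` is a finite number; `check_optimizer_result` is a hypothetical name, not part of semantiva_optimize.

```python
import math
from typing import Any, Mapping, Sequence


def check_optimizer_result(
    best: Mapping[str, Any],
    history: Sequence[Any],
    bounds: Sequence[Sequence[float]],
) -> None:
    """Assert basic sanity of an optimizer result (hypothetical helper)."""
    assert "x" in best and "value" in best
    assert len(history) > 0
    # Assumption: one coordinate per (lower, upper) pair
    assert len(best["x"]) == len(bounds)
    for xi, (lower, upper) in zip(best["x"], bounds):
        assert lower <= xi <= upper, f"{xi} outside [{lower}, {upper}]"
    assert math.isfinite(best["value"])


# Stand-in values used here in place of real context entries
check_optimizer_result(
    best={"x": [0.25], "value": 1.5},
    history=[{"x": [0.5], "value": 2.0}, {"x": [0.25], "value": 1.5}],
    bounds=[[-10, 10]],
)
```

In a real test, `best` and `history` would come from `result.context.get_value(...)` as shown above, and `bounds` would be the same list passed in the node parameters.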
Node-Based Testing (Limited Use)
Node-based testing involves direct processor instantiation and should be used only when:
Pipeline testing is not feasible because of complex setup requirements
A very specific processor behavior must be tested in isolation
Performance is being measured and pipeline overhead would distort the results
Warning
When using node-based testing, never directly instantiate _ContextObserver or other internal framework APIs. Use the proper node execution infrastructure.
Correct Node-Based Testing Pattern
def test_processor_node_correctly():
    """Correct approach for node-based testing."""
    from semantiva.pipeline.node import Node
    from semantiva.pipeline.payload import Payload
    from semantiva.context_processors.context_types import ContextType

    # Create node properly
    node = Node(
        processor_name="YourProcessor",
        parameters={
            "param1": "value1",
            "param2": "value2"
        }
    )

    # Process with proper infrastructure
    initial_payload = Payload(data=test_data, context=ContextType())
    result_payload = node.process(initial_payload)

    # Check results in context
    result = result_payload.context.get_value("your.result.key")
    assert result == expected_value
Antipatterns to Avoid
The following patterns should be avoided in tests:
Antipattern 1: Direct Observer Instantiation
# ❌ DON'T DO THIS
def test_processor_wrong():
    processor = YourProcessor()
    context = ContextType()
    observer = _ContextObserver(context)  # Internal API!
    processor.operate_context_observer(observer, parameters)

    # This creates confusion about where to check results
    result = observer.observer_context.get_value("key")  # ❌
    # vs
    result = context.get_value("key")  # ❌ Which one?
Antipattern 2: Testing Against Internal State
# ❌ DON'T DO THIS
def test_processor_internal_state():
    processor = YourProcessor()
    processor._internal_method()  # Testing private methods
    assert processor._internal_state == expected  # Testing private state
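A sketch of the alternative: assert on what the processor publishes rather than on its private attributes. `ToyProcessor` and the plain dict used as a context below are stand-ins invented for illustration, not Semantiva classes. With Semantiva, the same idea means reading results from the final payload context, as in the pipeline examples above.

```python
class ToyProcessor:
    """Stand-in processor for illustration, not a Semantiva class."""

    def __init__(self, scale: float) -> None:
        self._scale = scale  # private detail, free to change

    def run(self, context: dict) -> dict:
        """Return a new context with the published result added."""
        result = dict(context)
        result["toy.result"] = context["toy.input"] * self._scale
        return result


def test_toy_processor_public_behavior() -> None:
    final_context = ToyProcessor(scale=2.0).run({"toy.input": 21})
    # Only the published key is checked; _scale is never inspected
    assert final_context["toy.result"] == 42.0


test_toy_processor_public_behavior()
```

Because the test depends only on the published key, the private attribute can be renamed or removed without breaking it.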
Antipattern 3: Mocking Framework Infrastructure
# ❌ DON'T DO THIS
from unittest.mock import patch

def test_with_excessive_mocking():
    with patch('semantiva.pipeline.node.Node'):
        with patch('semantiva.context_processors.observer._ContextObserver'):
            # Over-mocking breaks the framework contract
            test_logic()
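When a test double is genuinely needed, a common alternative is to patch only the external boundary the code depends on and let everything else run unmodified. The sketch below uses only unittest.mock from the standard library; `RemoteClient` and `compute_with_remote` are hypothetical stand-ins for a slow external dependency and the code under test, not Semantiva APIs.

```python
from unittest.mock import patch


class RemoteClient:
    """Hypothetical external dependency (network, disk, hardware)."""

    def fetch(self) -> int:
        raise RuntimeError("not available in tests")


def compute_with_remote(client: RemoteClient) -> int:
    """Hypothetical code under test that depends on the external call."""
    return client.fetch() + 1


def test_patch_only_the_boundary() -> None:
    # Replace just the external call; the code under test runs for real
    with patch.object(RemoteClient, "fetch", return_value=41) as fake_fetch:
        assert compute_with_remote(RemoteClient()) == 42
    fake_fetch.assert_called_once_with()


test_patch_only_the_boundary()
```

The patch is scoped to the `with` block, so the real method is restored as soon as the test body finishes.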