Introspection & Validation
==========================

- Build inspections, summarize, export JSON, validate.

You can perform pre-execution checks from the terminal; see :doc:`cli`
(subcommand **inspect**).

Examples
--------

.. code-block:: python

   from semantiva.inspection.reporter import json_report

   # Assume p is a Pipeline instance
   report = json_report(p)
   print(report)

CLI Inspection
--------------

Use the CLI to parse and validate a pipeline before executing it.

.. code-block:: bash

   # Basic inspection: parse YAML, build canonical GraphV1, run basic checks
   semantiva inspect my_pipeline.yaml

   # Extended inspection: include node identities, ports, inferred types (where available)
   semantiva inspect my_pipeline.yaml --extended

What to expect
~~~~~~~~~~~~~~

* A summary of the pipeline (number of nodes, topological order).
* The canonical :term:`PipelineId` (deterministic across formatting changes).
* For ``--extended``:

  * Each node’s processor (FQCN or registered name).
  * The positional :term:`node_uuid` from :term:`GraphV1`.
  * Declared **ports** (if any) and parameter snapshot.
  * Any inferred or declared **input/output** types the validator can derive.

If a configuration problem is found, inspection exits non-zero and prints a
structured error (see **Common Validation Errors** below).

Python APIs for Introspection
-----------------------------

For programmatic use, generate a JSON summary from a
:class:`~semantiva.pipeline.pipeline.Pipeline`.

.. code-block:: python

   from semantiva.inspection.reporter import json_report

   # assume p is a Pipeline instance (built from YAML or programmatically)
   summary = json_report(p)  # -> python dict or JSON-serializable structure
   print(summary)

Typical fields present in the JSON summary:

* ``pipeline_id`` - the deterministic :term:`PipelineId` (see :doc:`graph`).
* ``nodes`` - list of nodes with:

  * ``node_uuid`` - positional identity from :term:`GraphV1`
  * ``processor`` - fully qualified class name
  * ``parameters`` - normalized parameter map
  * ``ports`` - declared input/output ports (if present)

* ``issues`` - list of warnings/errors detected by the validator

JSON Outline Example
--------------------

Below is a truncated outline of the JSON structure produced by
:func:`semantiva.inspection.reporter.json_report`. Field names shown here are
stable; values are illustrative.

.. code-block:: json

   {
     "pipeline_id": "plid-033c3704...300e",
     "nodes": [
       {
         "index": 0,
         "node_uuid": "2bd52eb9-9556-5663-b633-b69c9418f3ab",
         "processor": "FloatMockDataSource",
         "parameters": {},
         "ports": {}
       },
       {
         "index": 1,
         "node_uuid": "eb3e87c0-97b7-5097-8214-b53b4ba0fd6e",
         "processor": "FloatMultiplyOperation",
         "parameters": {"factor": 2.0},
         "ports": {}
       }
     ],
     "issues": []
   }

*Note:* If you need to reference these identities elsewhere (e.g., in trace
logs), see :doc:`trace_graph_alignment`.

.. code-block:: python

   from semantiva.pipeline import Pipeline, load_pipeline_from_yaml
   from semantiva.inspection.reporter import json_report

   p = Pipeline(load_pipeline_from_yaml("tests/hello_pipeline.yaml"))
   report = json_report(p)
   assert "pipeline_id" in report and "nodes" in report

Linking Reports to GraphV1 Identities
-------------------------------------

Inspection always works over the canonical :term:`GraphV1` representation.
That means the :term:`PipelineId` and every node’s ``node_uuid`` shown in
inspection output match the values in:

* the canonical spec (see :doc:`graph`), and
* runtime trace records (see :doc:`trace_graph_alignment`).

This identity contract lets you compare results across machines, builds, and
formats.

Common Validation Errors
------------------------

Here are representative issues the validator can flag:

* **Missing parameter** - a required parameter is absent in the YAML.
* **Unknown processor** - the specified processor class cannot be resolved/imported.
* **Topology/ports mismatch** - the declared ports do not match available outputs/inputs.
* **Type incompatibility** - an upstream node’s output type is incompatible with
  the next node’s expected input type.

Example output (truncated):

.. code-block:: text

   ERROR: PipelineConfigurationError
   details:
     node_index: 2
     node_uuid: "eb3e87c0-97b7-5097-8214-b53b4ba0fd6e"
     processor: "TransformData"
     reason: "Incompatible types: expected ImageType, received TextType from previous node"
     hint: "Check the output of node 1 or insert a converter operation."

Unknown / Unused Parameters
---------------------------

Semantiva validates node configuration keys against the processor signature.
Parameters present in YAML ``parameters`` that are not accepted by the
processor are reported during inspection under ``invalid_parameters``:

.. code-block:: text

   invalid_parameters:
     - name: facotr
       reason: unknown_parameter

The GUI and CLI can highlight these entries. When using ``inspect --strict``,
the command exits non-zero if any node contains invalid parameters. During
execution, invalid configuration causes a validation error before running.

Tips
----

* Use ``--extended`` to include identities and port/type summaries for faster debugging.
* If resolving classes from your own packages, ensure they are importable
  (installed in the environment).
* Increase verbosity with ``-v`` to see more details during inspection (see :doc:`logger`).

Component Documentation in Introspection
-----------------------------------------

Semantiva components (subclasses of
:class:`~semantiva.core.semantiva_component._SemantivaComponent`)
automatically include their class docstrings in introspection metadata.
Docstrings are extracted using ``inspect.getdoc()`` and appear in:

* **Component metadata** - accessible via ``get_metadata()["docstring"]``
* **Semantic identity** - formatted in ``semantic_id()`` output for debugging and LLM queries
* **Pipeline inspection** - included in both summary and extended reports
* **Tracing and orchestration** - used for channel identification in transport systems

**Best Practice**: Keep component docstrings lean and concise (one-liner
preferred) since they become part of the semantic identity used throughout the
pipeline introspection system. Detailed documentation should be placed in
dedicated RST files.

.. _spec-phase-vs-runtime-vs-execution:

Validation Phases
-----------------

Semantiva applies checks at three moments:

* **Spec-phase (pre-run)** - when parsing YAML and building :term:`GraphV1`:

  * missing parameters, unknown processors, invalid ports/topology.
  * surfaced via ``semantiva inspect`` and during pipeline load.

* **Runtime (initialization)** - after classes are resolved and nodes are realized:

  * parameter coercion/normalization errors, environment/import issues.
  * surfaced when constructing the :class:`~semantiva.pipeline.pipeline.Pipeline`.

* **Execution (process)** - during node operation on a
  :class:`~semantiva.pipeline.payload.Payload`:

  * actual input/output type contracts, context key requirements, invariant checks.
  * surfaced with node identity (:term:`node_uuid`) to support precise debugging and tracing.

Troubleshooting Checklist
~~~~~~~~~~~~~~~~~~~~~~~~~

* Re-run with ``semantiva inspect --extended`` to confirm identities and topology.
* Increase log verbosity (``-v``/``-vv``) and capture the first error.
* Check :doc:`exceptions` for common error classes and meanings.
* If the failure mentions a custom processor, ensure your package is installed and importable.

Autodoc
-------

.. automodule:: semantiva.inspection.builder
   :members:
   :undoc-members:

.. automodule:: semantiva.inspection.reporter
   :members:
   :undoc-members:

.. automodule:: semantiva.inspection.validator
   :members:
   :undoc-members:
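
The JSON summary described under *Python APIs for Introspection* can be
post-processed with plain Python. The following is a minimal sketch, assuming
the summary is a dict carrying the ``pipeline_id``, ``nodes``, and ``issues``
fields listed in this document. The ``summarize`` and ``has_issues`` helpers
and the sample values are hypothetical illustrations, not part of Semantiva;
in practice the dict would come from ``json_report(p)``.

.. code-block:: python

   import json

   # Hypothetical sample shaped like the JSON outline in this document;
   # identifiers are placeholders, not real Semantiva output.
   summary = {
       "pipeline_id": "plid-example",
       "nodes": [
           {"index": 0, "node_uuid": "uuid-0",
            "processor": "FloatMockDataSource", "parameters": {}, "ports": {}},
           {"index": 1, "node_uuid": "uuid-1",
            "processor": "FloatMultiplyOperation",
            "parameters": {"factor": 2.0}, "ports": {}},
       ],
       "issues": [],
   }


   def summarize(report):
       """Return one readable line per node (hypothetical helper)."""
       lines = []
       for node in report.get("nodes", []):
           # Sort keys so the rendered parameters are stable across runs.
           params = json.dumps(node.get("parameters", {}), sort_keys=True)
           lines.append(
               f"{node.get('index')}: {node.get('processor')} "
               f"[{node.get('node_uuid')}] params={params}"
           )
       return lines


   def has_issues(report):
       """True when the validator recorded at least one issue."""
       return bool(report.get("issues"))


   for line in summarize(summary):
       print(line)
   print("issues found:", has_issues(summary))

A check like ``has_issues`` could gate a CI step before execution, although
``semantiva inspect`` already signals configuration problems through its exit
code.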