Diagnose and Profile a Pipeline

Field	Value
Difficulty	Intermediate
Estimated Read Time	<10 minutes
Labels	`diagnostics`, `debugging`, `observability`

Concept

Run three checks (graph.validate(), one metrics-enabled run.run(), and run.stats() with run.report()) to determine whether a pipeline is wired correctly and how it is performing. This is the triage baseline before deep debugging.

The three checks answer three questions:

Is the Graph contract/build valid? (validate())
Does one run succeed with metrics enabled? (build(..., RunOptions(enable_metrics=True)) + run.run())
What do runtime diagnostics report? (run.stats(), run.report(), run.diagnostics_summary())

This routine is especially useful when onboarding new models or environments: it is repeatable and fast, and it catches most misconfiguration before it becomes a multi-hour debugging session.

APIs introduced

graph.validate() — contract-level check, returns a report with error_code.
pyneat.RunOptions() with enable_metrics=True and output_memory=OutputMemory.Owned.
run.stats() — inputs_enqueued, outputs_pulled, avg/min/max_latency_ms.
run.report(), run.diagnostics_summary() — structured runtime diagnostics.

Prerequisites: Chapter 002 or 003 (Graph/Run basics).

References

Graph

Learning Process

Validate Graph contract and backend parse path (validate()).
Run one deterministic frame with metrics enabled.
Inspect runtime stats/report/diagnostic summary outputs.
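
As a rough illustration of the third step, the counters can be turned into a one-word verdict. This is a minimal sketch, not part of the tutorial source: StatsSnapshot is a hypothetical stand-in for whatever run.stats() returns (only the field names come from this chapter), triage_stats and the latency budget are invented for the example, and the comparison assumes a pipeline that produces one output per input.

```cpp
#include <cstdint>
#include <string>

// Hypothetical mirror of the counters named in this chapter. The real type
// returned by run.stats() and its field types may differ.
struct StatsSnapshot {
  std::uint64_t inputs_enqueued = 0;
  std::uint64_t outputs_pulled = 0;
  double avg_latency_ms = 0.0;
  double max_latency_ms = 0.0;
};

// Maps a snapshot to a coarse verdict. Assumes one output per input.
std::string triage_stats(const StatsSnapshot& s, double latency_budget_ms) {
  if (s.inputs_enqueued == 0) return "no-input";           // nothing was pushed
  if (s.outputs_pulled == 0) return "no-output";           // flow stopped somewhere
  if (s.outputs_pulled < s.inputs_enqueued) return "outputs-lagging";
  if (s.max_latency_ms > latency_budget_ms) return "latency-over-budget";
  return "ok";
}
```

In a real program the snapshot would be filled from run.stats() after the metrics-enabled run; a verdict other than "ok" is a cue to look at run.report() next.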

Run

Python:

python3 share/sima-neat/tutorials/011_diagnose_a_pipeline/diagnose_a_pipeline.py

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_011_diagnose_a_pipeline

C++ (build from source):

./build.sh --target tutorial_011_diagnose_a_pipeline
./build/tutorials-standalone/tutorial_011_diagnose_a_pipeline

To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.

In Practice

This section covers structured diagnostics, the error taxonomy, debug knobs, and the plugin-failure workflow: the tools you reach for when validate(), stats(), or report() point at a problem.

GraphReport

GraphReport captures structured diagnostics:

pipeline string (for reproduction)
canonical error_code (machine triage)
repro_note (human summary + hint)
node reports and owned element names
bus messages and error details
optional flow/timing counters

When an error occurs, NeatError carries a GraphReport you can log or serialize.

Error taxonomy

Framework errors use stable code families:

Error code	Meaning	Typical fix
`misconfig.pipeline_shape`	Node order/shape contract violation	Ensure `Input()` first for push pipelines and `Output()` last for pull pipelines
`misconfig.caps`	Caps negotiation/override mismatch	Align `caps_override`, format, and downstream caps
`misconfig.input_shape`	Input tensor/frame/sample shape/layout mismatch	Validate width/height/depth, layout, dtype, storage
`build.parse_launch`	`gst_parse_launch` failed	Validate fragment syntax and plugin availability
`runtime.pull`	Runtime pull/timeout/closed-output failure	Check sink output production, queue pressure, and upstream errors
`io.parse`	Saved-graph JSON parse/schema failure	Validate JSON and required node fields
`io.open`	Graph save/load file open/read/write failure	Check path existence, permissions, and storage health

PullError.code uses the same taxonomy, so the codes apply to non-throwing pull failures as well as to exception paths.
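
Because every code in the table has the form family.detail, a first triage pass can key on the family alone. The sketch below assumes only that dotted form; error_family, triage_for, and the Triage buckets are hypothetical names for illustration and are not part of the NEAT API.

```cpp
#include <string>

// Returns the family prefix of a dotted error code, e.g. "misconfig" for
// "misconfig.caps". A code without a dot is returned unchanged.
std::string error_family(const std::string& code) {
  const std::string::size_type dot = code.find('.');
  return dot == std::string::npos ? code : code.substr(0, dot);
}

// Hypothetical triage buckets; the mapping is an illustration only.
enum class Triage {
  FixGraphDefinition,   // misconfig.*
  FixBuildEnvironment,  // build.*
  InspectRuntime,       // runtime.*
  FixSavedGraphIo,      // io.*
  Unknown
};

Triage triage_for(const std::string& code) {
  const std::string family = error_family(code);
  if (family == "misconfig") return Triage::FixGraphDefinition;
  if (family == "build") return Triage::FixBuildEnvironment;
  if (family == "runtime") return Triage::InspectRuntime;
  if (family == "io") return Triage::FixSavedGraphIo;
  return Triage::Unknown;
}
```

The same function can be fed either GraphReport.error_code or PullError.code, since both draw from the one taxonomy.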

Programmatic handling

#include "pipeline/ErrorCodes.h"
#include "pipeline/NeatError.h"

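// `graph` and `input` are assumed to be defined earlier.
try {
  auto run = graph.build(input);
  simaai::neat::Sample out;
  simaai::neat::PullError perr;
  // Non-throwing path: pull() reports failure through its status and PullError.
  const auto st = run.pull(500, out, &perr);
  if (st == simaai::neat::PullStatus::Error &&
      perr.code == simaai::neat::error_codes::kRuntimePull) {
    // runtime pull triage path
  }
} catch (const simaai::neat::NeatError& e) {
  // Throwing path: the NeatError carries a GraphReport with the same taxonomy.
  if (e.report().error_code == simaai::neat::error_codes::kParseLaunch) {
    // build/parse-launch triage path
  }
}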

Debug knobs (environment)

Key environment variables (see Architecture for detail):

SIMA_GST_DOT_DIR: write DOT graphs for failures
SIMA_GST_BOUNDARY_PROBES: boundary flow counters
SIMA_GST_ELEMENT_TIMINGS: per-element timings
SIMA_GST_FLOW_DEBUG: per-element flow counters
SIMA_GST_ENFORCE_NAMES: enforce naming contract
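
These variables are normally exported in the shell, but a test harness may prefer to set them in-process. The sketch below uses the POSIX setenv call; apply_debug_knobs is a hypothetical helper, and the values a caller passes (for example "1" for a flag or a directory path for SIMA_GST_DOT_DIR) are placeholders, since this chapter does not state which values the framework accepts. It also assumes the variables are read after they are set, so call it before constructing the Graph.

```cpp
#include <stdlib.h>  // setenv (POSIX)

#include <string>
#include <utility>
#include <vector>

// Sets each name/value pair in the current process environment and returns
// how many were applied. Existing values are overwritten.
int apply_debug_knobs(
    const std::vector<std::pair<std::string, std::string>>& knobs) {
  int applied = 0;
  for (const auto& knob : knobs) {
    if (::setenv(knob.first.c_str(), knob.second.c_str(), /*overwrite=*/1) == 0) {
      ++applied;
    }
  }
  return applied;
}
```

A typical call would pass pairs such as {"SIMA_GST_FLOW_DEBUG", "1"}; record whatever was set so it can be listed alongside the other failure details.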

Debug workflow

Capture GraphReport.error_code and bucket the failure by taxonomy first.
Capture GraphReport.repro_note for concrete context and the built-in hint.
Capture the pipeline text: Graph::describe_backend() or last_pipeline().
Capture structured diagnostics: Run::report() or NeatError::report().
Inspect GraphReport.bus for the first terminal ERROR source and its detail.
If the run stalls or times out, enable boundary/element probes to localize where flow stops.

Recommended support bundle:

error_code
repro_note
full pipeline_string
first 3-5 terminal bus errors (GraphReport.bus)
environment overrides used in run/validate
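
One way to keep bundles uniform is to collect the five items into a small struct and print them in a fixed order. This is a sketch under stated assumptions: SupportBundle and format_bundle are hypothetical names, the fields are plain strings copied out of the report by the caller, and the cap of five bus errors follows the "first 3-5" guidance above.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the support-bundle items listed above.
struct SupportBundle {
  std::string error_code;
  std::string repro_note;
  std::string pipeline_string;
  std::vector<std::string> bus_errors;     // terminal bus errors, in order
  std::vector<std::string> env_overrides;  // e.g. "NAME=value" strings
};

// Renders the bundle as plain text, keeping at most max_bus_errors entries.
std::string format_bundle(const SupportBundle& b,
                          std::size_t max_bus_errors = 5) {
  std::ostringstream out;
  out << "error_code: " << b.error_code << "\n"
      << "repro_note: " << b.repro_note << "\n"
      << "pipeline_string: " << b.pipeline_string << "\n";
  const std::size_t n = std::min(max_bus_errors, b.bus_errors.size());
  for (std::size_t i = 0; i < n; ++i) {
    out << "bus_error[" << i << "]: " << b.bus_errors[i] << "\n";
  }
  for (const auto& e : b.env_overrides) {
    out << "env: " << e << "\n";
  }
  return out.str();
}
```

Truncating the bus list keeps the first terminal errors, which are usually closest to the root cause, and drops the follow-on noise.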

Common failures → fixes

Symptom	Likely cause	Fix
`missing ... plugin`	GStreamer plugin not found	Check `GST_PLUGIN_PATH`, run `gst-inspect-1.0 <plugin>`
`appsink 'mysink' not found`	Missing terminal `Output()`	Ensure `Output` is the last node in run/build pipelines
`caps_override is set; renegotiation disabled`	caps pinned	Remove `caps_override` or keep input caps fixed
`tensor caps change not supported`	Tensor shape/dtype change at runtime	Keep tensor shape/dtype stable (no renegotiation)

Debugging plugin failures

When a plugin fails, NEAT raises a NeatError whose message contains the GStreamer error and a structured debug string. Use the fields to locate the root cause quickly.

1. Read the structured fields. Look for the debug key/value fields in the error text:
   - node: the failing element name in the pipeline
   - config_path: JSON config file (if applicable)
   - model_path: model/pack path (if applicable)
   - hint: actionable fix guidance
   - detail: extra context such as missing keys or allocator state
   See the Error Format Reference for the full list.
2. Confirm the pipeline context. Use the pipeline string from Graph::last_pipeline() or from the error report:
   - Verify the node name appears in the pipeline.
   - Confirm the config_path exists and is readable.
   - For caps errors, check upstream elements that negotiate into the failing node.
3. Apply common fixes.
   - Config errors: verify JSON syntax, required keys, and any model paths.
   - Caps errors: add or fix parser elements (e.g., h264parse), ensure caps include required fields like parsed=true, stream-format=byte-stream, alignment=au.
   - Allocator errors: ensure upstream elements use the required allocator type (system vs. simaai memory/segment).
4. Capture more diagnostics with the debug knobs above (SIMA_GST_DOT_DIR, SIMA_GST_FLOW_DEBUG, SIMA_GST_ELEMENT_TIMINGS).

Code

tutorials/011_diagnose_a_pipeline/diagnose_a_pipeline.cpp
// Three diagnostic commands: Graph::validate, Run::stats, Run::report / diagnostics_summary.
//
// Usage:
//   tutorial_011_diagnose_a_pipeline

#include "neat.h"

#include <opencv2/core.hpp>

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
  try {
    cv::Mat rgb(96, 128, CV_8UC3, cv::Scalar(22, 44, 66));
    if (!rgb.isContinuous())
      rgb = rgb.clone();

    simaai::neat::Graph graph;
    simaai::neat::InputOptions in;
    in.format = "RGB";
    in.width = rgb.cols;
    in.height = rgb.rows;
    in.depth = rgb.channels();
    graph.add(simaai::neat::nodes::Input(in));
    graph.add(simaai::neat::nodes::Output());

    // CORE LOGIC
    // 1) validate() checks the Graph before build() and prints any caps problems.
    auto report = graph.validate();
    std::cout << "validate.error_code=" << report.error_code << "\n";

    // 2) Run the Graph with metrics enabled so stats() has data.
    simaai::neat::RunOptions run_opt;
    run_opt.enable_metrics = true;
    run_opt.output_memory = simaai::neat::OutputMemory::Owned;
    auto run = graph.build(std::vector<cv::Mat>{rgb}, simaai::neat::RunMode::Sync, run_opt);
    simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
    if (out.empty())
      throw std::runtime_error("missing output tensor");

    // 3) Post-run diagnostics: counters, per-element report, and a summary string.
    auto stats = run.stats();
    std::cout << "stats.inputs_enqueued=" << stats.inputs_enqueued
              << " outputs_pulled=" << stats.outputs_pulled << "\n";
    std::cout << "report.size=" << run.report().size() << "\n";
    std::cout << "diagnostics_summary=" << run.diagnostics_summary() << "\n";

    std::cout << "[OK] 011_diagnose_a_pipeline\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}
