Diagnose and Profile a Pipeline

| Field | Value |
| --- | --- |
| Difficulty | Intermediate |
| Estimated Read Time | <10 minutes |
| Labels | diagnostics, debugging, observability |

Concept

Run three checks — graph.validate(), one metrics-enabled run.run(), and run.stats() + run.report() — to answer whether a pipeline is wired correctly and how it is performing. This is the triage baseline before deep debugging.

The three checks answer three questions:

  1. Is the Graph contract/build valid? (validate())
  2. Does one run succeed with metrics enabled? (build(..., RunOptions(enable_metrics=True)) + run.run())
  3. What do runtime diagnostics report? (run.stats(), run.report(), run.diagnostics_summary())

This routine is especially useful when onboarding new models or environments: it is repeatable, fast, and catches most misconfiguration before it becomes a multi-hour debugging session.

APIs introduced

  • graph.validate(): contract-level check; returns a report with error_code.
  • pyneat.RunOptions() with enable_metrics=True and output_memory=OutputMemory.Owned.
  • run.stats(): counters inputs_enqueued and outputs_pulled, plus avg/min/max_latency_ms.
  • run.report(), run.diagnostics_summary(): structured runtime diagnostics.
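
To make the counters concrete, here is a minimal sketch of how they might be turned into triage signals. It does not import pyneat: StatsSnapshot is a hypothetical stand-in that only mirrors the field names listed above (with avg/min/max_latency_ms expanded to three fields), and summarize() is an illustrative helper, not part of the library. It also assumes a pipeline that produces one output per input.

```python
from dataclasses import dataclass


@dataclass
class StatsSnapshot:
    """Hypothetical stand-in for the object returned by run.stats()."""
    inputs_enqueued: int
    outputs_pulled: int
    avg_latency_ms: float
    min_latency_ms: float
    max_latency_ms: float


def summarize(stats: StatsSnapshot) -> dict:
    """Derive simple triage signals from the raw counters."""
    # ASSUMPTION: one output is expected per enqueued input.
    pending = stats.inputs_enqueued - stats.outputs_pulled
    return {
        "pending_outputs": pending,
        "all_outputs_received": pending == 0,
        "latency_spread_ms": stats.max_latency_ms - stats.min_latency_ms,
    }


# One deterministic frame in, one tensor out.
snapshot = StatsSnapshot(inputs_enqueued=1, outputs_pulled=1,
                         avg_latency_ms=4.0, min_latency_ms=4.0,
                         max_latency_ms=4.0)
print(summarize(snapshot))
```

A non-zero pending_outputs after a synchronous run is a cue to look at run.report() before anything else.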

Prerequisites: Chapter 002 or 003 (Graph/Run basics).

Learning Process

  1. Validate Graph contract and backend parse path (validate()).
  2. Run one deterministic frame with metrics enabled.
  3. Inspect runtime stats/report/diagnostic summary outputs.

Run

Python:

python3 share/sima-neat/tutorials/011_diagnose_a_pipeline/diagnose_a_pipeline.py

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_011_diagnose_a_pipeline

C++ (build from source):

./build.sh --target tutorial_011_diagnose_a_pipeline
./build/tutorials-standalone/tutorial_011_diagnose_a_pipeline

To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.

In Practice

This section covers the structured diagnostics, the error taxonomy, the debug knobs, and the plugin-failure workflow that you reach for when validate() / stats() / report() point at a problem.

GraphReport

GraphReport captures structured diagnostics:

  • pipeline string (for reproduction)
  • canonical error_code (machine triage)
  • repro_note (human summary + hint)
  • node reports and owned element names
  • bus messages and error details
  • optional flow/timing counters

When an error occurs, NeatError carries a GraphReport you can log or serialize.

Error taxonomy

Framework errors use stable code families:

| Error code | Meaning | Typical fix |
| --- | --- | --- |
| misconfig.pipeline_shape | Node order/shape contract violation | Ensure Input() first for push pipelines and Output() last for pull pipelines |
| misconfig.caps | Caps negotiation/override mismatch | Align caps_override, format, and downstream caps |
| misconfig.input_shape | Input tensor/frame/sample shape/layout mismatch | Validate width/height/depth, layout, dtype, storage |
| build.parse_launch | gst_parse_launch failed | Validate fragment syntax and plugin availability |
| runtime.pull | Runtime pull/timeout/closed-output failure | Check sink output production, queue pressure, and upstream errors |
| io.parse | Saved-graph JSON parse/schema failure | Validate JSON and required node fields |
| io.open | Graph save/load file open/read/write failure | Check path existence, permissions, and storage health |

PullError.code uses the same taxonomy (not only exception paths).
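
Because the codes are stable strings with a family prefix, they lend themselves to table-driven triage. The sketch below is one possible way to do that in plain Python: the codes and fixes are copied from the taxonomy table, while error_family(), triage(), and the fallback hint are illustrative names introduced here, not library functions.

```python
from typing import Tuple

# Codes and typical fixes copied from the taxonomy table above.
TYPICAL_FIXES = {
    "misconfig.pipeline_shape": "Ensure Input() first for push pipelines and Output() last for pull pipelines",
    "misconfig.caps": "Align caps_override, format, and downstream caps",
    "misconfig.input_shape": "Validate width/height/depth, layout, dtype, storage",
    "build.parse_launch": "Validate fragment syntax and plugin availability",
    "runtime.pull": "Check sink output production, queue pressure, and upstream errors",
    "io.parse": "Validate JSON and required node fields",
    "io.open": "Check path existence, permissions, and storage health",
}


def error_family(error_code: str) -> str:
    """Return the family prefix, e.g. 'misconfig' for 'misconfig.caps'."""
    return error_code.split(".", 1)[0]


def triage(error_code: str) -> Tuple[str, str]:
    """Map an error code to (family, typical fix)."""
    # Fallback text is this sketch's own wording for codes not in the table.
    fix = TYPICAL_FIXES.get(
        error_code, "No table entry; inspect repro_note and bus messages")
    return error_family(error_code), fix


print(triage("build.parse_launch"))
```

Bucketing by family first keeps the handling code short: a misconfig.* code points at the graph definition, while runtime.* and io.* point at the running pipeline and the filesystem respectively.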

Programmatic handling

#include "pipeline/ErrorCodes.h"
#include "pipeline/NeatError.h"

try {
  auto run = graph.build(input);
  simaai::neat::Sample out;
  simaai::neat::PullError perr;
  const auto st = run.pull(500, out, &perr);
  if (st == simaai::neat::PullStatus::Error &&
      perr.code == simaai::neat::error_codes::kRuntimePull) {
    // runtime pull triage path
  }
} catch (const simaai::neat::NeatError& e) {
  if (e.report().error_code == simaai::neat::error_codes::kParseLaunch) {
    // build/parse-launch triage path
  }
}

Debug knobs (environment)

Key environment variables (see Architecture for detail):

  • SIMA_GST_DOT_DIR: write DOT graphs for failures
  • SIMA_GST_BOUNDARY_PROBES: boundary flow counters
  • SIMA_GST_ELEMENT_TIMINGS: per-element timings
  • SIMA_GST_FLOW_DEBUG: per-element flow counters
  • SIMA_GST_ENFORCE_NAMES: enforce naming contract
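
One way to apply these knobs without touching your shell profile is to build a dedicated environment for a single diagnostic run. The sketch below is a hedged illustration: the variable names come from the list above, but the value "1" for the flag-style knobs and the /tmp/neat-dot directory are assumptions made for the example, so check the Architecture page for the accepted values. debug_env() is a hypothetical helper.

```python
import os

# Knob names come from the list above; the values are ASSUMPTIONS.
FLAG_KNOBS = (
    "SIMA_GST_BOUNDARY_PROBES",
    "SIMA_GST_ELEMENT_TIMINGS",
    "SIMA_GST_FLOW_DEBUG",
)


def debug_env(dot_dir, base=None):
    """Return a copy of the environment with the diagnostic knobs set."""
    env = dict(os.environ if base is None else base)
    env["SIMA_GST_DOT_DIR"] = dot_dir  # directory that receives DOT graphs
    for name in FLAG_KNOBS:
        env[name] = "1"  # ASSUMPTION: "1" enables the knob
    return env


env = debug_env("/tmp/neat-dot", base={"PATH": "/usr/bin"})
print(sorted(name for name in env if name.startswith("SIMA_GST_")))
```

The returned dictionary can be passed as the env argument of subprocess.run() when launching the tutorial script, which keeps the overrides scoped to that one process and makes them easy to record afterwards.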

Debug workflow

  1. Capture GraphReport.error_code and bucket the failure by taxonomy first.
  2. Capture GraphReport.repro_note for concrete context and built-in hint.
  3. Capture pipeline text: Graph::describe_backend() or last_pipeline().
  4. Capture structured diagnostics: Run::report() or NeatError::report().
  5. Inspect GraphReport.bus for first terminal ERROR source + detail.
  6. If runtime stalls/timeouts, enable boundary/element probes to localize flow stop.

Recommended support bundle:

  • error_code
  • repro_note
  • full pipeline_string
  • first 3-5 terminal bus errors (GraphReport.bus)
  • environment overrides used in run/validate
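
The bundle above can be assembled as a plain dictionary and serialized to JSON for a ticket or log. This is a minimal sketch under stated assumptions: build_support_bundle() is a hypothetical helper, the inputs are passed in as already-extracted strings rather than read from a real GraphReport, and "environment overrides" is interpreted as every variable whose name starts with SIMA_GST_.

```python
import json


def build_support_bundle(error_code, repro_note, pipeline_string,
                         bus_errors, environ, max_bus_errors=5):
    """Collect the recommended support-bundle fields into one dict."""
    # ASSUMPTION: overrides of interest are the SIMA_GST_* variables.
    overrides = {k: v for k, v in environ.items() if k.startswith("SIMA_GST_")}
    return {
        "error_code": error_code,
        "repro_note": repro_note,
        "pipeline_string": pipeline_string,
        # Keep only the first few terminal bus errors, as recommended above.
        "bus_errors": list(bus_errors)[:max_bus_errors],
        "environment_overrides": overrides,
    }


# Placeholder values; real ones come from the report of the failing run.
bundle = build_support_bundle(
    error_code="runtime.pull",
    repro_note="<repro_note from the report>",
    pipeline_string="<full pipeline string>",
    bus_errors=[f"<bus error {i}>" for i in range(8)],
    environ={"SIMA_GST_FLOW_DEBUG": "1", "HOME": "/root"},
)
print(json.dumps(bundle, indent=2))
```

Truncating the bus errors keeps the bundle readable; the first terminal error is usually the one that matters, and later ones are often consequences of it.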

Common failures → fixes

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| missing ... plugin | GStreamer plugin not found | Check GST_PLUGIN_PATH, run gst-inspect-1.0 <plugin> |
| appsink 'mysink' not found | Missing terminal Output() | Ensure Output is the last node in run/build pipelines |
| caps_override is set; renegotiation disabled | caps pinned | Remove caps_override or keep input caps fixed |
| tensor caps change not supported | Tensor shape/dtype change at runtime | Keep tensor shape/dtype stable (no renegotiation) |
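
The symptom table can double as a first-pass log matcher. The sketch below encodes the four rows as patterns and returns the likely cause and fix for a message. It assumes the symptom text appears verbatim inside the error message and treats the "..." in the first row as a wildcard; suggest_fix() and the sample message are illustrative, not library output.

```python
import re

# (pattern, likely cause, fix) rows taken from the table above.
COMMON_FAILURES = [
    (re.compile(r"missing .+ plugin"),
     "GStreamer plugin not found",
     "Check GST_PLUGIN_PATH, run gst-inspect-1.0 <plugin>"),
    (re.compile(re.escape("appsink 'mysink' not found")),
     "Missing terminal Output()",
     "Ensure Output is the last node in run/build pipelines"),
    (re.compile(re.escape("caps_override is set; renegotiation disabled")),
     "caps pinned",
     "Remove caps_override or keep input caps fixed"),
    (re.compile(re.escape("tensor caps change not supported")),
     "Tensor shape/dtype change at runtime",
     "Keep tensor shape/dtype stable (no renegotiation)"),
]


def suggest_fix(message):
    """Return (likely cause, fix) for the first matching symptom, else None."""
    for pattern, cause, fix in COMMON_FAILURES:
        if pattern.search(message):
            return cause, fix
    return None


# Hypothetical message text, used only to exercise the matcher.
print(suggest_fix("missing h264parse plugin"))
```

A None result means the message is not one of the common cases and the structured fields of the error are the next place to look.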

Debugging plugin failures

When a plugin fails, NEAT raises a NeatError whose message contains the GStreamer error and a structured debug string. Use the fields to locate the root cause quickly.

  1. Read the structured fields. Look for the debug key/value fields in the error text:

    • node: the failing element name in the pipeline
    • config_path: JSON config file (if applicable)
    • model_path: model/pack path (if applicable)
    • hint: actionable fix guidance
    • detail: extra context such as missing keys or allocator state

    See the Error Format Reference for the full list.

  2. Confirm the pipeline context. Use the pipeline string from Graph::last_pipeline() or from the error report:

    • Verify the node name appears in the pipeline.
    • Confirm the config_path exists and is readable.
    • For caps errors, check upstream elements that negotiate into the failing node.
  3. Apply common fixes.

    • Config errors: verify JSON syntax, required keys, and any model paths.
    • Caps errors: add or fix parser elements (e.g., h264parse), ensure caps include required fields like parsed=true, stream-format=byte-stream, alignment=au.
    • Allocator errors: ensure upstream elements use the required allocator type (system vs. simaai memory/segment).
  4. Capture more diagnostics with the debug knobs above (SIMA_GST_DOT_DIR, SIMA_GST_FLOW_DEBUG, SIMA_GST_ELEMENT_TIMINGS).

Code

tutorials/011_diagnose_a_pipeline/diagnose_a_pipeline.cpp
// Three diagnostic commands: Graph::validate, Run::stats, Run::report / diagnostics_summary.
//
// Usage:
//   tutorial_011_diagnose_a_pipeline

#include "neat.h"

#include <opencv2/core.hpp>

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
  try {
    // Build a small solid-color RGB frame as a deterministic input.
    cv::Mat rgb(96, 128, CV_8UC3, cv::Scalar(22, 44, 66));
    if (!rgb.isContinuous())
      rgb = rgb.clone();

    simaai::neat::Graph graph;
    simaai::neat::InputOptions in;
    in.format = "RGB";
    in.width = rgb.cols;
    in.height = rgb.rows;
    in.depth = rgb.channels();
    graph.add(simaai::neat::nodes::Input(in));
    graph.add(simaai::neat::nodes::Output());

    // CORE LOGIC
    // 1) validate() checks the Graph before build() and prints any caps problems.
    auto report = graph.validate();
    std::cout << "validate.error_code=" << report.error_code << "\n";

    // 2) Run the Graph with metrics enabled so stats() has data.
    simaai::neat::RunOptions run_opt;
    run_opt.enable_metrics = true;
    run_opt.output_memory = simaai::neat::OutputMemory::Owned;
    auto run = graph.build(std::vector<cv::Mat>{rgb}, simaai::neat::RunMode::Sync, run_opt);
    simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
    if (out.empty())
      throw std::runtime_error("missing output tensor");

    // 3) Post-run diagnostics: counters, per-element report, and a summary string.
    auto stats = run.stats();
    std::cout << "stats.inputs_enqueued=" << stats.inputs_enqueued
              << " outputs_pulled=" << stats.outputs_pulled << "\n";
    std::cout << "report.size=" << run.report().size() << "\n";
    std::cout << "diagnostics_summary=" << run.diagnostics_summary() << "\n";

    std::cout << "[OK] 011_diagnose_a_pipeline\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}
