Diagnose and Profile a Pipeline
| Field | Value |
|---|---|
| Difficulty | Intermediate |
| Estimated Read Time | <10 minutes |
| Labels | diagnostics, debugging, observability |
Concept
Run three checks — graph.validate(), one metrics-enabled run.run(), and run.stats() + run.report() — to answer whether a pipeline is wired correctly and how it is performing. This is the triage baseline before deep debugging.
The three checks answer three questions:
- Is the Graph contract/build valid? (validate())
- Does one run succeed with metrics enabled? (build(..., RunOptions(enable_metrics=True)) + run.run())
- What do runtime diagnostics report? (run.stats(), run.report(), run.diagnostics_summary())
This routine is especially useful when onboarding new models or environments: it is repeatable, fast, and catches most misconfiguration before it becomes a multi-hour debugging session.
APIs introduced
- graph.validate(): contract-level check, returns a report with error_code.
- pyneat.RunOptions() with enable_metrics=True and output_memory=OutputMemory.Owned.
- run.stats(): inputs_enqueued, outputs_pulled, avg/min/max_latency_ms.
- run.report(), run.diagnostics_summary(): structured runtime diagnostics.
Prerequisites
Chapter 002 or 003 (Graph/Run basics).
Learning Process
- Validate Graph contract and backend parse path (validate()).
- Run one deterministic frame with metrics enabled.
- Inspect runtime stats/report/diagnostic summary outputs.
Run
Python:
python3 share/sima-neat/tutorials/011_diagnose_a_pipeline/diagnose_a_pipeline.py
C++ (prebuilt):
./lib/sima-neat/tutorials/tutorial_011_diagnose_a_pipeline
C++ (build from source):
./build.sh --target tutorial_011_diagnose_a_pipeline
./build/tutorials-standalone/tutorial_011_diagnose_a_pipeline
To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.
In Practice
This section covers structured diagnostics, the error taxonomy, debug knobs, and the plugin-failure workflow you reach for when validate() / stats() / report() point at a problem.
GraphReport
GraphReport captures structured diagnostics:
- pipeline string (for reproduction)
- canonical error_code (machine triage)
- repro_note (human summary + hint)
- node reports and owned element names
- bus messages and error details
- optional flow/timing counters
When an error occurs, NeatError carries a GraphReport you can log or serialize.
Error taxonomy
Framework errors use stable code families:
| Error code | Meaning | Typical fix |
|---|---|---|
| misconfig.pipeline_shape | Node order/shape contract violation | Ensure Input() first for push pipelines and Output() last for pull pipelines |
| misconfig.caps | Caps negotiation/override mismatch | Align caps_override, format, and downstream caps |
| misconfig.input_shape | Input tensor/frame/sample shape/layout mismatch | Validate width/height/depth, layout, dtype, storage |
| build.parse_launch | gst_parse_launch failed | Validate fragment syntax and plugin availability |
| runtime.pull | Runtime pull/timeout/closed-output failure | Check sink output production, queue pressure, and upstream errors |
| io.parse | Saved-graph JSON parse/schema failure | Validate JSON and required node fields |
| io.open | Graph save/load file open/read/write failure | Check path existence, permissions, and storage health |
PullError.code uses the same taxonomy (not only exception paths).
Programmatic handling
#include "pipeline/ErrorCodes.h"
#include "pipeline/NeatError.h"

try {
  auto run = graph.build(input);
  simaai::neat::Sample out;
  simaai::neat::PullError perr;
  const auto st = run.pull(500, out, &perr);
  if (st == simaai::neat::PullStatus::Error &&
      perr.code == simaai::neat::error_codes::kRuntimePull) {
    // runtime pull triage path
  }
} catch (const simaai::neat::NeatError& e) {
  if (e.report().error_code == simaai::neat::error_codes::kParseLaunch) {
    // build/parse-launch triage path
  }
}
Debug knobs (environment)
Key environment variables (see Architecture for detail):
- SIMA_GST_DOT_DIR: write DOT graphs for failures
- SIMA_GST_BOUNDARY_PROBES: boundary flow counters
- SIMA_GST_ELEMENT_TIMINGS: per-element timings
- SIMA_GST_FLOW_DEBUG: per-element flow counters
- SIMA_GST_ENFORCE_NAMES: enforce naming contract
Debug workflow
- Capture GraphReport.error_code and bucket the failure by taxonomy first.
- Capture GraphReport.repro_note for concrete context and built-in hint.
- Capture pipeline text: Graph::describe_backend() or last_pipeline().
- Capture structured diagnostics: Run::report() or NeatError::report().
- Inspect GraphReport.bus for first terminal ERROR source + detail.
- If runtime stalls/timeouts, enable boundary/element probes to localize flow stop.
Recommended support bundle:
- error_code
- repro_note
- full pipeline_string
- first 3-5 terminal bus errors (GraphReport.bus)
- environment overrides used in run/validate
Common failures → fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| missing ... plugin | GStreamer plugin not found | Check GST_PLUGIN_PATH, run gst-inspect-1.0 <plugin> |
| appsink 'mysink' not found | Missing terminal Output() | Ensure Output is the last node in run/build pipelines |
| caps_override is set; renegotiation disabled | caps pinned | Remove caps_override or keep input caps fixed |
| tensor caps change not supported | Tensor shape/dtype change at runtime | Keep tensor shape/dtype stable (no renegotiation) |
Debugging plugin failures
When a plugin fails, NEAT raises a NeatError whose message contains the GStreamer error and a structured debug string. Use the fields to locate the root cause quickly.
- Read the structured fields. Look for the debug key/value fields in the error text:
  - node: the failing element name in the pipeline
  - config_path: JSON config file (if applicable)
  - model_path: model/pack path (if applicable)
  - hint: actionable fix guidance
  - detail: extra context such as missing keys or allocator state

  See the Error Format Reference for the full list.
- Confirm the pipeline context. Use the pipeline string from Graph::last_pipeline() or from the error report:
  - Verify the node name appears in the pipeline.
  - Confirm the config_path exists and is readable.
  - For caps errors, check upstream elements that negotiate into the failing node.
- Apply common fixes.
  - Config errors: verify JSON syntax, required keys, and any model paths.
  - Caps errors: add or fix parser elements (e.g., h264parse), ensure caps include required fields like parsed=true, stream-format=byte-stream, alignment=au.
  - Allocator errors: ensure upstream elements use the required allocator type (system vs. simaai memory/segment).
- Capture more diagnostics with the debug knobs above (SIMA_GST_DOT_DIR, SIMA_GST_FLOW_DEBUG, SIMA_GST_ELEMENT_TIMINGS).
Code
// Three diagnostic commands: Graph::validate, Run::stats, Run::report / diagnostics_summary.
//
// Usage:
//   tutorial_011_diagnose_a_pipeline

#include "neat.h"

#include <opencv2/core.hpp>

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
  try {
    // Solid-color 128x96 RGB frame used as a deterministic input.
    cv::Mat rgb(96, 128, CV_8UC3, cv::Scalar(22, 44, 66));
    if (!rgb.isContinuous())
      rgb = rgb.clone();

    simaai::neat::Graph graph;
    simaai::neat::InputOptions in;
    in.format = "RGB";
    in.width = rgb.cols;
    in.height = rgb.rows;
    in.depth = rgb.channels();
    graph.add(simaai::neat::nodes::Input(in));
    graph.add(simaai::neat::nodes::Output());

    // CORE LOGIC
    // 1) validate() checks the Graph before build() and prints any caps problems.
    auto report = graph.validate();
    std::cout << "validate.error_code=" << report.error_code << "\n";

    // 2) Run the Graph with metrics enabled so stats() has data.
    simaai::neat::RunOptions run_opt;
    run_opt.enable_metrics = true;
    run_opt.output_memory = simaai::neat::OutputMemory::Owned;
    auto run = graph.build(std::vector<cv::Mat>{rgb}, simaai::neat::RunMode::Sync, run_opt);
    simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
    if (out.empty())
      throw std::runtime_error("missing output tensor");

    // 3) Post-run diagnostics: counters, per-element report, and a summary string.
    auto stats = run.stats();
    std::cout << "stats.inputs_enqueued=" << stats.inputs_enqueued
              << " outputs_pulled=" << stats.outputs_pulled << "\n";
    std::cout << "report.size=" << run.report().size() << "\n";
    std::cout << "diagnostics_summary=" << run.diagnostics_summary() << "\n";

    std::cout << "[OK] 011_diagnose_a_pipeline\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}