Triton Server Error with TAO FasterRCNN model: Validation failed: libNamespace == nullptr

• Hardware: Ubuntu 22.04 RTX 4090
• Network Type: FasterRCNN TAO model
• TAO version: 5.5.0

Training spec:

# Copyright (c) 2017-2020, NVIDIA CORPORATION.  All rights reserved.
random_seed: 42

verbose: True
model_config {
  input_image_config {
    image_type: RGB
    image_channel_order: 'bgr'
    size_height_width {
      height: 640
      width: 640
    }
    image_channel_mean {
      key: 'b'
      value: 103.939
    }
    image_channel_mean {
      key: 'g'
      value: 116.779
    }
    image_channel_mean {
      key: 'r'
      value: 123.68
    }
    image_scaling_factor: 1.0
    max_objects_num_per_image: 100
  }
  arch: "resnet:18"
  anchor_box_config {
    scale: 64.0
    scale: 128.0
    scale: 256.0
    ratio: 1.0
    ratio: 0.5
    ratio: 2.0
  }
  freeze_bn: True
  freeze_blocks: 0
  freeze_blocks: 1
  roi_mini_batch: 256
  rpn_stride: 16
  use_bias: False
  roi_pooling_config {
    pool_size: 7
    pool_size_2x: False
  }
  all_projections: True
  use_pooling: False
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/faster_rcnn/tfrecords/new_trainval/new_trainval*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: 'png'
  target_class_mapping {
    key: 'item'
    value: 'item'
  }
  target_class_mapping {
    key: 'person'
    value: 'person'
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 640
    output_image_height: 640
    output_image_channel: 3
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    enable_auto_resize: True
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 0
    translate_max_y: 0
  }
  color_augmentation {
    hue_rotation_max: 0.0
    saturation_shift_max: 0.0
    contrast_scale_max: 0.0
    contrast_center: 0.5
  }
}
training_config {
  visualizer {
    enabled: False
    num_images: 3
  }
  enable_augmentation: True
  enable_qat: False
  batch_size_per_gpu: 8
  num_epochs: 12
  rpn_min_overlap: 0.3
  rpn_max_overlap: 0.7
  classifier_min_overlap: 0.0
  classifier_max_overlap: 0.5
  gt_as_roi: False
  std_scaling: 1.0
  classifier_regr_std {
    key: 'x'
    value: 10.0
  }
  classifier_regr_std {
    key: 'y'
    value: 10.0
  }
  classifier_regr_std {
    key: 'w'
    value: 5.0
  }
  classifier_regr_std {
    key: 'h'
    value: 5.0
  }

  rpn_mini_batch: 256
  rpn_pre_nms_top_N: 12000
  rpn_nms_max_boxes: 2000
  rpn_nms_overlap_threshold: 0.7

  regularizer {
    type: L2
    weight: 1e-4
  }

  optimizer {
    sgd {
      lr: 0.02
      momentum: 0.9
      decay: 0.0
      nesterov: False
    }
  }

  learning_rate {
    soft_start {
      base_lr: 0.02
      start_lr: 0.002
      soft_start: 0.1
      annealing_points: 0.8
      annealing_points: 0.9
      annealing_divider: 10.0
    }
  }

  lambda_rpn_regr: 1.0
  lambda_rpn_class: 1.0
  lambda_cls_regr: 1.0
  lambda_cls_class: 1.0
}
inference_config {
  images_dir: '/workspace/tao-experiments/data/test_samples'
  batch_size: 1
  detection_image_output_dir: '/workspace/tao-experiments/faster_rcnn/inference_results_imgs_retrain'
  labels_dump_dir: '/workspace/tao-experiments/faster_rcnn/inference_dump_labels_retrain'
  rpn_pre_nms_top_N: 6000
  rpn_nms_max_boxes: 300
  rpn_nms_overlap_threshold: 0.7
  object_confidence_thres: 0.0001
  bbox_visualize_threshold: 0.6
  classifier_nms_max_boxes: 100
  classifier_nms_overlap_threshold: 0.3
  nms_score_bits: 8
}
evaluation_config {
  batch_size: 1
  validation_period_during_training: 1
  rpn_pre_nms_top_N: 6000
  rpn_nms_max_boxes: 300
  rpn_nms_overlap_threshold: 0.7
  classifier_nms_max_boxes: 100
  classifier_nms_overlap_threshold: 0.3
  object_confidence_thres: 0.0001
  use_voc07_11point_metric: False
  gt_matching_iou_threshold: 0.5
}

Hello, I am having trouble with a TAO FasterRCNN model trained via transfer learning, specifically when serving it with Triton Inference Server after exporting it as a TensorRT engine. I trained the model following this notebook:

github.com/NVIDIA/tao_tutorials (branch main): notebooks/tao_launcher_starter_kit/faster_rcnn/faster_rcnn.ipynb

Training succeeded and the inference results looked normal. However, running inference produced this error:

[02/10/2025-16:19:07] [TRT] [F] Validation failed: libNamespace == nullptr
/workspace/trt_oss_src/TensorRT/plugin/proposalPlugin/proposalPlugin.cpp:528
[02/10/2025-16:19:07] [TRT] [E] std::exception

Note: I get the same error with the tutorial data alone, without any custom data, so you can reproduce it with the tutorial data, or I can send the tutorial model.

The error did not affect inference through the TAO CLI. But when I tried to launch a Triton Server instance with this model to measure inference times, the server crashed because of it. Is there a way to make the server ignore this validation failure, or to fix the error in the model?

Note that the 5.3.0 release notes list this as a known limitation of TAO Toolkit 5.2.0:
https://docs.nvidia.com/tao/archive/5.3.0/text/release_notes.html

I used Triton Server 24.04 because it is the last release that ships TensorRT 8; as far as I can tell, TAO Toolkit does not support TensorRT 10 yet. Here is the command I used to launch the Triton server:

docker run --gpus=1 --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /home/ubuntu-testing/model_repository:/models nvcr.io/nvidia/tritonserver:24.04-py3 tritonserver --model-repository=/models
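
Once a container started with the command above is running, one way to check whether the model loaded is to poll the readiness endpoint that Triton serves on its HTTP port (8000 here). The sketch below uses only the Python standard library. The helper names `model_ready_url` and `is_ready` are made up for this example, the host assumes the port mapping from the command, and the model name is the one from the config further down.

```python
import urllib.error
import urllib.request


def model_ready_url(host: str, model_name: str) -> str:
    """Build the KServe v2 model-readiness URL that Triton serves over HTTP."""
    return f"http://{host}/v2/models/{model_name}/ready"


def is_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


# Assumed values: port 8000 is mapped by the docker command, and the model
# name comes from the config.pbtxt in this post.
url = model_ready_url("localhost:8000", "FRCNN-resnet50")
```

Calling `is_ready(url)` returns `False` while the server is down or the model failed to load, which is the state described in this post.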

Here is my Triton config file for the model. I am not certain the output shapes are correct, but as far as I can tell the server stops before it even gets to the config file.

name: "FRCNN-resnet50"
platform: "tensorrt_plan"
max_batch_size: 0
input [
  {
    name: "input_image"
    data_type: TYPE_FP16
    dims: [ 3, 640, 640 ]
    reshape { shape: [ 1, 3, 640, 640 ] }
  }
]
output [
  {
    name: "nms_out"
    data_type: TYPE_FP32
    dims: [ 1, 1, 100, 7 ]
    reshape { shape: [ 1, 1, 100, 7 ] }
  },
  {
    name: "nms_out_1"
    data_type: TYPE_FP32
    dims: [ 1, 1, 1, 1 ]
    reshape { shape: [ 1, 1, 1, 1 ] }
  }
]
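
Since the output shapes are uncertain, it can help to write down the element counts and byte sizes that the config implies, so they can be compared with what the engine actually reports. The sketch below only does that arithmetic for the shapes copied from the config above; it does not inspect the engine, and the helper name `tensor_bytes` is made up for this example.

```python
from math import prod

# Bytes per element for the two Triton data types used in the config above.
BYTES_PER_ELEMENT = {"TYPE_FP16": 2, "TYPE_FP32": 4}


def tensor_bytes(data_type: str, dims: list[int]) -> int:
    """Return the byte size of one tensor whose dims are all fixed."""
    if any(d < 0 for d in dims):
        raise ValueError("variable-size dims (-1) have no fixed byte size")
    return BYTES_PER_ELEMENT[data_type] * prod(dims)


# Tensor names, data types, and reshape targets copied from the config above.
tensors = {
    "input_image": ("TYPE_FP16", [1, 3, 640, 640]),
    "nms_out": ("TYPE_FP32", [1, 1, 100, 7]),
    "nms_out_1": ("TYPE_FP32", [1, 1, 1, 1]),
}

for name, (data_type, dims) in tensors.items():
    print(f"{name}: {prod(dims)} elements, {tensor_bytes(data_type, dims)} bytes")
```

If the sizes the engine reports for its bindings differ from these, the `dims` or `data_type` entries in the config are the first thing to revisit.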

And here is the Triton server output when it fails to launch:

NVIDIA Release 24.04 (build 90085237)
Triton Server Version 2.45.0

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

I0210 23:05:46.603013 1 pinned_memory_manager.cc:275] Pinned memory pool is created at '0x7cb6d6000000' with size 268435456
I0210 23:05:46.604848 1 cuda_memory_manager.cc:107] CUDA memory pool is created on device 0 with size 67108864
I0210 23:05:46.608765 1 model_lifecycle.cc:469] loading: FRCNN-resnet50:1
I0210 23:05:46.634964 1 tensorrt.cc:65] TRITONBACKEND_Initialize: tensorrt
I0210 23:05:46.634975 1 tensorrt.cc:75] Triton TRITONBACKEND API version: 1.19
I0210 23:05:46.634977 1 tensorrt.cc:81] 'tensorrt' TRITONBACKEND API version: 1.19
I0210 23:05:46.634979 1 tensorrt.cc:105] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0210 23:05:46.636848 1 tensorrt.cc:231] TRITONBACKEND_ModelInitialize: FRCNN-resnet50 (version 1)
I0210 23:05:46.691943 1 logging.cc:46] Loaded engine size: 84 MiB
E0210 23:05:46.707516 1 logging.cc:40] Validation failed: libNamespace == nullptr
plugin/proposalPlugin/proposalPlugin.cpp:528
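
To pull the failing entry out of a longer log like this one, a small parser for the glog-style prefix (severity letter, date, time, thread id, source file and line) is enough. This is a minimal sketch: the regular expression is written against the lines shown above, and the function name `errors_in` is made up for this example. Lines without a prefix are treated as continuations of the previous entry.

```python
import re

# glog-style prefix: severity letter, MMDD, time, thread id, file:line, then "] message".
LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\s\]]+:\d+)\] (?P<message>.*)$"
)


def errors_in(log_text: str) -> list[dict]:
    """Return parsed E/F-level entries; unprefixed lines extend the previous message."""
    entries = []
    current = None
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            current = match.groupdict()
            entries.append(current)
        elif current is not None and line.strip():
            current["message"] += "\n" + line.strip()
    return [entry for entry in entries if entry["severity"] in "EF"]


# Last lines of the server output above.
sample = """I0210 23:05:46.691943 1 logging.cc:46] Loaded engine size: 84 MiB
E0210 23:05:46.707516 1 logging.cc:40] Validation failed: libNamespace == nullptr
plugin/proposalPlugin/proposalPlugin.cpp:528"""

for entry in errors_in(sample):
    print(entry["source"], entry["message"].replace("\n", " | "))
```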

Thanks for your help!
