Deploy TAO Classification_pyt FAN for Jetson Nano

Hi,

We have successfully used TAO with yolov4-tiny and classification_tf1 on the Jetson Nano. In that process we used the supporting libraries from NVIDIA's sites and, for external YOLO models (including models trained in the Darknet framework), DeepStream-Yolo.

Last week we tried classification_pyt and got good training results, but we cannot get the model running on the Jetson Nano.
We tried to follow the DeepStream configuration guide at https://docs.nvidia.com/tao/tao-toolkit/text/ds_tao/classification_ds.html#deepstream-configuration-file (a sketch of our nvinfer config follows the error log below), but without success, because we get this error:

ERROR: Deserialize engine failed because file path: /home/jetson/tao/export/epoch_26.onnx.engine open error
0:00:02.711928044 13461   0x5589ace0f0 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<nvinfer1> NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1889> [UID = 2]: deserialize engine from file :/home/jetson/tao/export/epoch_26.onnx.engine failed
0:00:02.712051328 13461   0x5589ace0f0 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<nvinfer1> NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1996> [UID = 2]: deserialize backend context from engine from file :/home/jetson/tao/export/epoch_26.onnx.engine failed, try rebuild
0:00:02.712086745 13461   0x5589ace0f0 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<nvinfer1> NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 2]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
ERROR: [TRT]: ModelImporter.cpp:720: While parsing node number 53 [Range -> "/backbone/pos_embed/Range_output_0"]:
ERROR: [TRT]: ModelImporter.cpp:721: --- Begin node ---
ERROR: [TRT]: ModelImporter.cpp:722: input: "/backbone/pos_embed/Constant_1_output_0"
input: "/backbone/pos_embed/Cast_output_0"
input: "/backbone/pos_embed/Constant_2_output_0"
output: "/backbone/pos_embed/Range_output_0"
name: "/backbone/pos_embed/Range"
op_type: "Range"

ERROR: [TRT]: ModelImporter.cpp:723: --- End node ---
ERROR: [TRT]: ModelImporter.cpp:726: ERROR: builtin_op_importers.cpp:3172 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"
ERROR: Failed to parse onnx file
ERROR: failed to build network since parsing model errors.
Caught SIGSEGV
#0  0x0000007f92f0ed5c in __waitpid (pid=<optimized out>, stat_loc=0x7fd4f8ab54, options=<optimized out>) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x0000007f92f4a2e0 in g_on_error_stack_trace ()
#2  0x0000005556afcc3c in  ()
#3  0x0000005589fcc620 in  ()
Spinning.  Please run 'gdb gst-launch-1.0 13461' to continue debugging, Ctrl-C to quit, or Ctrl-\ to dump core.
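
For completeness, here is a sketch of the nvinfer configuration that produces the log above, modeled on the classification example in the linked documentation. The ONNX and engine paths are from our setup; the preprocessing values, infer-dims, and the labels file name are placeholders that would need to match the actual training/export configuration:

[property]
gpu-id=0
# ImageNet-style normalization; adjust net-scale-factor/offsets to the export
net-scale-factor=0.017507
offsets=123.675;116.28;103.53
model-color-format=0
onnx-file=/home/jetson/tao/export/epoch_26.onnx
model-engine-file=/home/jetson/tao/export/epoch_26.onnx.engine
labelfile-path=labels.txt
infer-dims=3;224;224
batch-size=1
network-mode=0
# network-type=1 marks the model as a classifier
network-type=1
# running as a secondary classifier (UID 2 in the log above)
process-mode=2
gie-unique-id=2
classifier-threshold=0.5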

This error comes from DeepStream; TensorRT cannot parse some of the model's layers. The engine build fails the same way when we run trtexec directly:

/usr/src/tensorrt/bin/trtexec --onnx=classification_model_export.onnx --maxShapes="input_1":16x3x224x224 --minShapes="input_1":1x3x224x224 --optShapes="input_1":8x3x224x224 --saveEngine=model_fan.engine

[04/04/2024-09:01:56] [I] === Model Options ===
[04/04/2024-09:01:56] [I] Format: ONNX
[04/04/2024-09:01:56] [I] Model: classification_model_export.onnx
[04/04/2024-09:01:56] [I] Output:
[04/04/2024-09:01:56] [I] === Build Options ===
[04/04/2024-09:01:56] [I] Max batch: explicit
[04/04/2024-09:01:56] [I] Workspace: 16 MiB
[04/04/2024-09:01:56] [I] minTiming: 1
[04/04/2024-09:01:56] [I] avgTiming: 8
[04/04/2024-09:01:56] [I] Precision: FP32
[04/04/2024-09:01:56] [I] Calibration:
[04/04/2024-09:01:56] [I] Refit: Disabled
[04/04/2024-09:01:56] [I] Sparsity: Disabled
[04/04/2024-09:01:56] [I] Safe mode: Disabled
[04/04/2024-09:01:56] [I] Restricted mode: Disabled
[04/04/2024-09:01:56] [I] Save engine: model_fan.engine
[04/04/2024-09:01:56] [I] Load engine:
[04/04/2024-09:01:56] [I] NVTX verbosity: 0
[04/04/2024-09:01:56] [I] Tactic sources: Using default tactic sources
[04/04/2024-09:01:56] [I] timingCacheMode: local
[04/04/2024-09:01:56] [I] timingCacheFile:
[04/04/2024-09:01:56] [I] Input(s)s format: fp32:CHW
[04/04/2024-09:01:56] [I] Output(s)s format: fp32:CHW
[04/04/2024-09:01:56] [I] Input build shape: input_1=1x3x224x224+8x3x224x224+16x3x224x224
[04/04/2024-09:01:56] [I] Input calibration shapes: model
[04/04/2024-09:01:56] [I] === System Options ===
[04/04/2024-09:01:56] [I] Device: 0
[04/04/2024-09:01:56] [I] DLACore:
[04/04/2024-09:01:56] [I] Plugins:
[04/04/2024-09:01:56] [I] === Inference Options ===
[04/04/2024-09:01:56] [I] Batch: Explicit
[04/04/2024-09:01:56] [I] Input inference shape: input_1=8x3x224x224
[04/04/2024-09:01:56] [I] Iterations: 10
[04/04/2024-09:01:56] [I] Duration: 3s (+ 200ms warm up)
[04/04/2024-09:01:56] [I] Sleep time: 0ms
[04/04/2024-09:01:56] [I] Streams: 1
[04/04/2024-09:01:56] [I] ExposeDMA: Disabled
[04/04/2024-09:01:56] [I] Data transfers: Enabled
[04/04/2024-09:01:56] [I] Spin-wait: Disabled
[04/04/2024-09:01:56] [I] Multithreading: Disabled
[04/04/2024-09:01:56] [I] CUDA Graph: Disabled
[04/04/2024-09:01:56] [I] Separate profiling: Disabled
[04/04/2024-09:01:56] [I] Time Deserialize: Disabled
[04/04/2024-09:01:56] [I] Time Refit: Disabled
[04/04/2024-09:01:56] [I] Skip inference: Disabled
[04/04/2024-09:01:56] [I] Inputs:
[04/04/2024-09:01:56] [I] === Reporting Options ===
[04/04/2024-09:01:56] [I] Verbose: Disabled
[04/04/2024-09:01:56] [I] Averages: 10 inferences
[04/04/2024-09:01:56] [I] Percentile: 99
[04/04/2024-09:01:56] [I] Dump refittable layers:Disabled
[04/04/2024-09:01:56] [I] Dump output: Disabled
[04/04/2024-09:01:56] [I] Profile: Disabled
[04/04/2024-09:01:56] [I] Export timing to JSON file:
[04/04/2024-09:01:56] [I] Export output to JSON file:
[04/04/2024-09:01:56] [I] Export profile to JSON file:
[04/04/2024-09:01:56] [I]
[04/04/2024-09:01:56] [I] === Device Information ===
[04/04/2024-09:01:56] [I] Selected Device: NVIDIA Tegra X1
[04/04/2024-09:01:56] [I] Compute Capability: 5.3
[04/04/2024-09:01:56] [I] SMs: 1
[04/04/2024-09:01:56] [I] Compute Clock Rate: 0.9216 GHz
[04/04/2024-09:01:56] [I] Device Global Memory: 3956 MiB
[04/04/2024-09:01:56] [I] Shared Memory per SM: 64 KiB
[04/04/2024-09:01:56] [I] Memory Bus Width: 64 bits (ECC disabled)
[04/04/2024-09:01:56] [I] Memory Clock Rate: 0.01275 GHz
[04/04/2024-09:01:56] [I]
[04/04/2024-09:01:56] [I] TensorRT version: 8001
[04/04/2024-09:01:58] [I] [TRT] [MemUsageChange] Init CUDA: CPU +203, GPU +0, now: CPU 221, GPU 2465 (MiB)
[04/04/2024-09:01:58] [I] Start parsing network model
[04/04/2024-09:01:58] [I] [TRT] ----------------------------------------------------------------
[04/04/2024-09:01:58] [I] [TRT] Input filename:   classification_model_export.onnx
[04/04/2024-09:01:58] [I] [TRT] ONNX IR version:  0.0.7
[04/04/2024-09:01:58] [I] [TRT] Opset version:    12
[04/04/2024-09:01:58] [I] [TRT] Producer name:    pytorch
[04/04/2024-09:01:58] [I] [TRT] Producer version: 2.2.0
[04/04/2024-09:01:58] [I] [TRT] Domain:
[04/04/2024-09:01:58] [I] [TRT] Model version:    0
[04/04/2024-09:01:58] [I] [TRT] Doc string:
[04/04/2024-09:01:58] [I] [TRT] ----------------------------------------------------------------
[04/04/2024-09:01:58] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/04/2024-09:01:58] [E] [TRT] ModelImporter.cpp:720: While parsing node number 53 [Range -> "/backbone/pos_embed/Range_output_0"]:
[04/04/2024-09:01:58] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[04/04/2024-09:01:58] [E] [TRT] ModelImporter.cpp:722: input: "/backbone/pos_embed/Constant_1_output_0"
input: "/backbone/pos_embed/Cast_output_0"
input: "/backbone/pos_embed/Constant_2_output_0"
output: "/backbone/pos_embed/Range_output_0"
name: "/backbone/pos_embed/Range"
op_type: "Range"

[04/04/2024-09:01:58] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[04/04/2024-09:01:58] [E] [TRT] ModelImporter.cpp:726: ERROR: builtin_op_importers.cpp:3172 In function importRange:
[8] Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"
[04/04/2024-09:01:58] [E] Failed to parse onnx file
[04/04/2024-09:01:58] [I] Finish parsing network model
[04/04/2024-09:01:58] [E] Parsing model failed
[04/04/2024-09:01:58] [E] Engine creation failed
[04/04/2024-09:01:58] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8001] # /usr/src/tensorrt/bin/trtexec --onnx=classification_model_export.onnx --maxShapes=input_1:16x3x224x224 --minShapes=input_1:1x3x224x224 --optShapes=input_1:8x3x224x224 --saveEngine=model_fan.engine
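
The failed assertion points at the root cause: this TensorRT version (8.0.1) only supports INT32 inputs for a Range node with dynamic inputs, while the PyTorch export feeds the positional-embedding Range with INT64 values. One workaround we are considering is patching the ONNX file so those INT64 constants and casts become INT32 before building the engine. A minimal, untested sketch using only the standard onnx package (the node names in the comments are taken from the error log, and the range check is only a rough guard):

import onnx
import numpy as np
from onnx import numpy_helper, TensorProto

model = onnx.load("classification_model_export.onnx")
graph = model.graph

# 1) Downcast INT64 initializers whose values fit in INT32.
for init in graph.initializer:
    if init.data_type == TensorProto.INT64:
        arr = numpy_helper.to_array(init)
        if arr.size == 0 or (arr.min() >= -2**31 and arr.max() < 2**31):
            init.CopyFrom(numpy_helper.from_array(arr.astype(np.int32), init.name))

# 2) Downcast INT64 Constant nodes (e.g. /backbone/pos_embed/Constant_1).
for node in graph.node:
    if node.op_type == "Constant":
        for attr in node.attribute:
            if attr.name == "value" and attr.t.data_type == TensorProto.INT64:
                arr = numpy_helper.to_array(attr.t)
                attr.t.CopyFrom(numpy_helper.from_array(arr.astype(np.int32)))

# 3) Retarget Cast nodes that cast to INT64 (e.g. /backbone/pos_embed/Cast)
#    so they produce INT32 instead.
for node in graph.node:
    if node.op_type == "Cast":
        for attr in node.attribute:
            if attr.name == "to" and attr.i == TensorProto.INT64:
                attr.i = TensorProto.INT32

onnx.save(model, "classification_model_export_int32.onnx")

If the patched model is still valid, the same trtexec command run against it should get past the parser. Constant folding (for example with polygraphy surgeon sanitize --fold-constants) might also eliminate the dynamic Range entirely, since only the batch dimension is dynamic here. But we are not sure this is the intended deployment path.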

The NVIDIA deployment documentation does not mention that any special libraries or patches are needed for the classification_pyt (Classification PyTorch) case.

Is it possible to deploy a classification_pyt model on a Jetson Nano with DeepStream 6.0? What can we do?

Thank you

Darek
