Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc): A5000 GPU
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc): RetinaNet
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): TAO 5.2
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
For our application, I want to run the TAO RetinaNet model in ONNX Runtime with TensorRT execution acceleration inside Triton Inference Server. I am using Triton tritonserver:23.06-py3. This is the Triton model config I am using:
config.txt (580 Bytes)
However, ONNX Runtime cannot load the model because of the custom TensorRT NMS op:
UNAVAILABLE: Internal: onnx runtime error 10: Load model from /models/retinanet_onnxruntime/1/model.onnx failed:This is an invalid model. In Node, ("NMS", NMSDynamic_TRT, "", -1) : ("anchor_data": tensor(float),"loc_data": tensor(float),"conf_data": tensor(float),) -> ("NMS": tensor(float),"NMS_1": tensor(float),) , Error No Op registered for NMSDynamic_TRT with domain_version of 15
Running the model directly in TensorRT works fine; however, for our application we prefer to use ONNX Runtime with TensorRT acceleration.
Is there a way to load the model in ONNX Runtime and execute it with TensorRT acceleration?
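For reference, the same load failure can be reproduced outside Triton with a minimal standalone ONNX Runtime session (a sketch; the model path and provider list are assumptions based on my Triton model repository):

```python
import onnxruntime as ort

# Try to load the TAO-exported RetinaNet ONNX model with the TensorRT
# execution provider enabled, falling back to CUDA/CPU. The model path
# below mirrors my Triton model repository; adjust as needed.
sess = ort.InferenceSession(
    "/models/retinanet_onnxruntime/1/model.onnx",
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
```

This fails with the same "No Op registered for NMSDynamic_TRT" error, since ONNX Runtime validates the graph against its registered op schemas at load time, before any nodes are assigned to the TensorRT execution provider.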