I use tao to train models and export them to TensorRT for use in our C++ applications.
As I needed to use some tao5 models, I had to install newer versions of CUDA and TensorRT on all computers.
This made the tao3 models unusable since they were built with the older versions, so we tried retraining one vgg16 unet model under tao5. To our surprise, the retrained model has major differences that cause our previous applications to fail completely.
The major difference we have found so far is the shape of the output feature vector.
In tao3 unet, the output is rows * columns * number of classes, and each element holds the probability of the pixel belonging to that class. For our 6 classes: outputDims: (1, 704, 1280, 6)
In tao5 unet, the output is rows * columns * 1: outputDims: (1, 704, 1280, 1)
I am assuming the single value is the predicted class index, although at the moment every value comes back as 0.
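For reference, this is roughly how we read the tao3 output buffer (a simplified sketch; the function name and the raw-pointer layout are just for illustration): we take the per-pixel argmax over the 6 class probabilities in the NHWC buffer.

```cpp
#include <cstdint>
#include <vector>

// Sketch: decode the tao3 unet output (NHWC float buffer of per-class
// probabilities, shape 1 x H x W x C) into a per-pixel class-index map
// by taking the argmax over the class dimension.
std::vector<uint8_t> decodeTao3Output(const float* output, int height, int width, int numClasses)
{
    std::vector<uint8_t> classMap(static_cast<size_t>(height) * width);
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            const float* probs = output + (static_cast<size_t>(y) * width + x) * numClasses;
            int best = 0;
            for (int c = 1; c < numClasses; ++c)
            {
                if (probs[c] > probs[best])
                    best = c;
            }
            classMap[static_cast<size_t>(y) * width + x] = static_cast<uint8_t>(best);
        }
    }
    return classMap;
}
```

With our dimensions that would be decodeTao3Output(outputHost, 704, 1280, 6), where outputHost is the output binding copied back to the host. With the tao5 (1, 704, 1280, 1) output this loop no longer applies, which is why I am asking what the single value represents.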
In tao3 unet, we did two things to the input buffer:
First, we normalized the image to roughly [-1, 1]:
cv::subtract(image, cv::Scalar(127.5f, 127.5f, 127.5f), image, cv::noArray(), -1); // image = image - 127.5
cv::divide(image, cv::Scalar(127.5f, 127.5f, 127.5f), image, 1, -1);               // image = image / 127.5
And second, we did an NHWC to NCHW conversion (a combined sketch of both steps is below).
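For completeness, here is a minimal sketch of both steps folded into one call using cv::dnn::blobFromImage, which subtracts the mean before applying the scale factor and emits an NCHW float blob (the function name prepareInput is only for illustration; our real code does the two steps separately as shown above):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/dnn.hpp>

// Sketch of the tao3 pre-processing: normalize to [-1, 1] and convert the
// HWC cv::Mat to an NCHW float blob for the TensorRT input binding.
// blobFromImage computes (pixel - mean) * scalefactor per channel.
cv::Mat prepareInput(const cv::Mat& frame)
{
    const double scale = 1.0 / 127.5;
    const cv::Scalar mean(127.5, 127.5, 127.5);
    // Result has shape (1, 3, rows, cols), CV_32F, ready to copy to the
    // input binding. swapRB=false keeps the channel order of the Mat.
    return cv::dnn::blobFromImage(frame, scale, frame.size(), mean,
                                  /*swapRB=*/false, /*crop=*/false, CV_32F);
}
```

For tao5 unet I do not know whether the same (x - 127.5) / 127.5 normalization and NCHW layout still apply, which is what my first question below is about.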
My specific questions for tao5 unet are:
- What pre-processing do we need to do to each video frame?
- What values are returned in the feature vector after inference?
- Is there model documentation on the input and output specifications?
Many thanks!
David