• Hardware (RTX3070)
• Network Type (Facenet)
• TAO Version (5.0.0-tf1.15.5, 5.2.0-deploy)
• Training spec files (getting_started_v5.1.0/notebooks/tao_launcher_starter_kit/facenet/specs)
Hi,
I’ve gone through all the steps in the Jupyter FaceNet notebook.
Here are the intermediate results:
Evaluate the trained model:
tao model detectnet_v2 evaluate -e $SPECS_DIR/facenet_train_resnet18_kitti.txt -k nvidia_tlt -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.hdf5
Validation cost: 0.002349
Mean average_precision (in %): 38.0197
Evaluate the retrained model:
tao model detectnet_v2 evaluate -e $SPECS_DIR/facenet_retrain_resnet18_kitti.txt -k nvidia_tlt -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.hdf5
Validation cost: 0.002336
Mean average_precision (in %): 39.2308
Deploy:
tao model detectnet_v2 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.hdf5 -k nvidia_tlt -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.onnx
Loaded model
The ONNX operator number change on the optimization: 138 -> 53
2024-02-26 21:07:14,342 [TAO Toolkit] [INFO] keras2onnx 347: The ONNX operator number change on the optimization: 138 -> 53
2024-02-26 21:07:14,343 [TAO Toolkit] [WARNING] onnxmltools 71: The maximum opset needed by this model is only 9.
Execution status: PASS
tao deploy detectnet_v2 gen_trt_engine -m $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.onnx -k nvidia_tlt --data_type int8 --batches 10 --batch_size 16 --max_batch_size 16 --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin -e $SPECS_DIR/facenet_retrain_resnet18_kitti.txt --results_dir $USER_EXPERIMENT_DIR/export --verbose
Export finished successfully.
2024-02-26 21:11:38,626 [TAO Toolkit] [INFO] root 174: Gen_trt_engine finished successfully.
Evaluate Deployed Model:
tao deploy detectnet_v2 evaluate -e $SPECS_DIR/facenet_retrain_resnet18_kitti.txt -m $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 -i $DATA_DOWNLOAD_DIR/training/images_val -l $DATA_DOWNLOAD_DIR/training/labels_val -r $USER_EXPERIMENT_DIR/experiment_dir_retrain/
2024-02-26 21:25:05,939 [INFO] root: *******************************
2024-02-26 21:25:05,940 [INFO] root: face AP 0.09091
2024-02-26 21:25:05,940 [INFO] root: mAP 0.091
2024-02-26 21:25:05,940 [INFO] root: *******************************
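As you can see, the precision dropped sharply after deployment: the retrained model reached a mean average precision of 39.2308 %, while the INT8 TensorRT engine reports an mAP of 0.091.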
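To make the comparison explicit, here is a minimal sketch that puts the reported numbers on one scale. The log lines are copied from the output above. The function names and the `is_fraction` flag are only illustrative, and the script assumes that `tao deploy` reports mAP as a fraction in [0, 1] while `tao model` reports it in percent, which I have not confirmed.

```python
import re

# Log lines copied verbatim from the runs above.
tao_model_logs = {
    "trained (hdf5)": "Mean average_precision (in %): 38.0197",
    "retrained (hdf5)": "Mean average_precision (in %): 39.2308",
}
deploy_log = "2024-02-26 21:25:05,940 [INFO] root: mAP 0.091"


def parse_percent(line):
    """Extract the value from a 'Mean average_precision (in %)' line."""
    return float(re.search(r"\(in %\):\s*([0-9.]+)", line).group(1))


def parse_deploy_map(line, is_fraction=True):
    """Extract mAP from a tao deploy log line.

    ASSUMPTION: the value is a fraction in [0, 1]; pass is_fraction=False
    if it is already expressed in percent.
    """
    value = float(re.search(r"mAP\s+([0-9.]+)", line).group(1))
    return value * 100.0 if is_fraction else value


# Collect every result in percent so they can be compared directly.
results = {name: parse_percent(line) for name, line in tao_model_logs.items()}
results["TensorRT INT8 engine"] = parse_deploy_map(deploy_log)

for name, value in results.items():
    print(f"{name:22s} {value:7.2f} %")

drop = results["retrained (hdf5)"] - results["TensorRT INT8 engine"]
print(f"Drop after deployment: {drop:.2f} percentage points")
```

Under that assumption the engine scores roughly 9.1 % against 39.2 % for the retrained model. If the deploy value were already in percent, the gap would be even larger.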
Could you please clarify why this happened?
Thank you.
Best regards,
Alexander Ivanov