Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc): NVIDIA GeForce RTX 4070
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc): Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
I need to train the DetectNet_v2 model on custom data. To start, I have the following initial questions:
- Object Detection – KITTI Format:
Data Annotation Format - NVIDIA Docs
For DetectNet_v2, the train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly. Online resizing is supported for other detection model architectures:
Q1. Is there a required resolution for the images? That is, do I need to resize the images to a specific size before annotating (labeling) them?
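To make Q1 concrete, here is a minimal sketch of the box scaling that the quoted docs describe, for the case where images are annotated at their original size and resized afterwards. The function name `scale_bbox` and the example sizes are hypothetical, not taken from the TAO docs. The image itself would still need to be resized to the same target size with an image library.

```python
def scale_bbox(bbox, orig_size, target_size):
    """Scale an (xmin, ymin, xmax, ymax) box from orig_size to target_size.

    Both sizes are (width, height) in pixels. The name and signature are
    hypothetical; this is not a TAO API.
    """
    orig_w, orig_h = orig_size
    target_w, target_h = target_size
    sx = target_w / orig_w  # horizontal scale factor
    sy = target_h / orig_h  # vertical scale factor
    xmin, ymin, xmax, ymax = bbox
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# Example with made-up sizes: halve a 1000x500 image to 500x250.
print(scale_bbox((100.0, 50.0, 200.0, 150.0), (1000, 500), (500, 250)))
```

If the scale factors differ between width and height, the aspect ratio of the objects changes, so it may be preferable to pick a target size with the same aspect ratio as the source images.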
- Label Files:
For detection the Toolkit only requires the class name and bbox coordinates fields to be populated. This is because the TAO training pipe supports training only for class and bbox coordinates. The remaining fields may be set to 0. Here is a sample file for a custom annotated dataset:
car 0.00 0 0.00 587.01 173.33 614.12 200.12 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Q2. Is the following interpretation of the bbox fields correct?
(X_min, Y_min) = (587.01, 173.33)
(X_max, Y_max) = (614.12, 200.12)
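To illustrate the reading in Q2, here is a small sketch that parses one label line. It assumes the standard 15-field KITTI layout, where fields 5 to 8 hold the box as left, top, right, bottom in pixels. The helper name `parse_kitti_line` is hypothetical, and the sketch assumes class names contain no spaces.

```python
KITTI_FIELD_COUNT = 15  # class name plus 14 numeric fields


def parse_kitti_line(line):
    """Return (class_name, (xmin, ymin, xmax, ymax)) from one KITTI label line."""
    parts = line.split()
    if len(parts) != KITTI_FIELD_COUNT:
        raise ValueError(f"expected {KITTI_FIELD_COUNT} fields, got {len(parts)}")
    class_name = parts[0]
    # Fields at index 4..7 are the 2D box: left, top, right, bottom.
    xmin, ymin, xmax, ymax = (float(v) for v in parts[4:8])
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("box has non-positive width or height")
    return class_name, (xmin, ymin, xmax, ymax)


sample = "car 0.00 0 0.00 587.01 173.33 614.12 200.12 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
print(parse_kitti_line(sample))
```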
Q3. Does the DetectNet_v2 train tool support training with rotated rectangle annotations?
[image]
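In case rotated rectangles are not supported (the quoted docs mention only class and bbox coordinates), one possible fallback is to convert each rotated rectangle to its enclosing axis-aligned box before writing the label file. The sketch below assumes the rotated rectangle is given as center, width, height, and angle in degrees. That representation and the name `rotated_rect_to_aabb` are assumptions, not something the docs specify.

```python
import math


def rotated_rect_to_aabb(cx, cy, w, h, angle_deg):
    """Return the axis-aligned (xmin, ymin, xmax, ymax) box enclosing a rotated rectangle.

    (cx, cy) is the center, w and h are the side lengths, and angle_deg is
    the rotation about the center. Hypothetical helper, not a TAO API.
    """
    theta = math.radians(angle_deg)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    # Half extents of the rotated rectangle projected onto the x and y axes.
    half_w = (w * cos_t + h * sin_t) / 2
    half_h = (w * sin_t + h * cos_t) / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# With no rotation, the result is the rectangle itself.
print(rotated_rect_to_aabb(100, 100, 40, 20, 0))
```

The enclosing box is looser than the rotated one, so it includes more background for strongly rotated, elongated objects.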
Thank you in advance for guiding me in this task!