Please provide the following information when requesting support.
• Hardware (RTX 4090D)
• Network Type (OCDNet)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
Configuration of the TAO Toolkit Instance
task_group:
    model:
        dockers:
            nvidia/tao/tao-toolkit:
                5.5.0-pyt:
                    docker_registry: nvcr.io
                    tasks:
                        1. action_recognition
                        2. centerpose
                        3. visual_changenet
                        4. deformable_detr
                        5. dino
                        6. grounding_dino
                        7. mask_grounding_dino
                        8. mask2former
                        9. mal
                        10. ml_recog
                        11. ocdnet
                        12. ocrnet
                        13. optical_inspection
                        14. pointpillars
                        15. pose_classification
                        16. re_identification
                        17. classification_pyt
                        18. segformer
                        19. bevfusion
                5.0.0-tf1.15.5:
                    docker_registry: nvcr.io
                    tasks:
                        1. bpnet
                        2. classification_tf1
                        3. converter
                        4. detectnet_v2
                        5. dssd
                        6. efficientdet_tf1
                        7. faster_rcnn
                        8. fpenet
                        9. lprnet
                        10. mask_rcnn
                        11. multitask_classification
                        12. retinanet
                        13. ssd
                        14. unet
                        15. yolo_v3
                        16. yolo_v4
                        17. yolo_v4_tiny
                5.5.0-tf2:
                    docker_registry: nvcr.io
                    tasks:
                        1. classification_tf2
                        2. efficientdet_tf2
    dataset:
        dockers:
            nvidia/tao/tao-toolkit:
                5.5.0-data-services:
                    docker_registry: nvcr.io
                    tasks:
                        1. augmentation
                        2. auto_label
                        3. annotations
                        4. analytics
    deploy:
        dockers:
            nvidia/tao/tao-toolkit:
                5.5.0-deploy:
                    docker_registry: nvcr.io
                    tasks: …
format_version: 3.0
toolkit_version: 5.5.0
published_date: 08/26/2024
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
I noticed that this issue occurred before in TAO 5.3, probably because of requests 2.32.0. However, I am sure that I am using requests 2.31.0, and I can log in to NVCR successfully.
Running “! tao info --verbose” also works, and its output is shown above.
Besides, I can use tao directly from the CLI, but without the notebook, training and evaluation become troublesome.
By the way, I repeated the whole installation on another machine running Ubuntu 22.04, and the issue occurred there as well.
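Since tao works from the CLI but not from the notebook, one thing worth ruling out is that the notebook kernel runs in a different Python environment than the shell. Below is a minimal sketch that can be run in a notebook cell and again in a terminal to compare the two. The package names `docker` and `nvidia-tao` are my assumptions about what the launcher depends on, and the helper names `installed_version` and `environment_report` are just illustrative.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("requests", "docker", "nvidia-tao")):
    """Collect the interpreter path and the versions of the given packages.

    The default package list is an assumption, not something confirmed
    by the TAO documentation.
    """
    report = {"python": sys.executable}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    # Run this both in a notebook cell and in the shell, then compare.
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If the interpreter path or the requests version differs between the notebook and the terminal, the two are not using the same environment.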