I'm trying to get TAO Toolkit up and running by following this tutorial, and I'm having issues with Docker.
!tao model yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt \
-o $DATA_DOWNLOAD_DIR/yolo_v4/tfrecords/train \
-r $USER_EXPERIMENT_DIR/
#--gpus 1 --debug/
2025-01-28 22:58:19,933 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2025-01-28 22:58:20,025 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2025-01-28 22:58:20,064 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
What's next:
Try Docker Debug for seamless, persistent debugging tools in any container or image → docker debug f228ac51bcb901c9206c4772e25830c541d1bd3329c1f33c18dc8c0e13acbb4d
Learn more at https://docs.docker.com/go/debug-cli/
Error response from daemon: No such container: f228ac51bcb901c9206c4772e25830c541d1bd3329c1f33c18dc8c0e13acbb4d
2025-01-28 22:58:22,421 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.
tao info
Configuration of the TAO Toolkit Instance
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.5.0
published_date: 08/26/2024
docker login nvcr.io
.......
Login Succeeded
Maybe the following gives a hint about the cause?
docker run --rm --gpus all ubuntu nvidia-smi
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Now with sudo:
sudo docker run --rm --gpus all ubuntu nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P4              10W /  55W |      8MiB /  8188MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
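Since the same command fails without sudo and works with it, the two invocations may be reaching different Docker daemons. Below is a minimal sketch for comparing them. It assumes that `docker info --format '{{json .Runtimes}}'` prints a JSON map of the runtimes registered with whichever daemon the CLI talks to; the helper names (`has_nvidia_runtime`, `daemon_runtimes`, `compare`) are made up for illustration and are not part of Docker or TAO.

```python
import json
import subprocess


def has_nvidia_runtime(runtimes_json: str) -> bool:
    """Return True if the JSON map of runtimes contains an 'nvidia' entry."""
    # `or {}` covers a daemon that reports no runtimes at all (JSON null).
    runtimes = json.loads(runtimes_json) or {}
    return "nvidia" in runtimes


def daemon_runtimes(prefix=()) -> str:
    """Ask the daemon reached by `<prefix> docker` for its runtimes (raw JSON)."""
    cmd = [*prefix, "docker", "info", "--format", "{{json .Runtimes}}"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def compare() -> None:
    """Print whether the nvidia runtime is visible without and with sudo."""
    for label, prefix in (("user", ()), ("sudo", ("sudo",))):
        print(label, has_nvidia_runtime(daemon_runtimes(prefix)))
```

Calling `compare()` on the affected machine would show whether the non-sudo CLI sees the `nvidia` runtime. If it reports `False` for the user and `True` for sudo, the plain `docker` command is probably pointed at a daemon that does not read /etc/docker/daemon.json.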
Some system info:
$ cat /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "/usr/bin/nvidia-container-runtime"
        }
    }
}
$ cat ~/.docker/config.json
{
    "auths": {
        "nvcr.io": {
            "auth": "************************************************************8"
        }
    },
    "credStore": "desktop",
    "currentContext": "desktop-linux",
    "plugins": {
        "debug": {
            "hooks": "exec"
        },
        "scout": {
            "hooks": "pull,buildx build"
        }
    },
    "features": {
        "hooks": "true"
    }
}
Also, after every restart the credStore key somehow becomes credsStore, which prevents docker login to nvcr.io from working unless I change it back.
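To track what the Docker CLI config contains after each restart, a small sketch like the following could print the relevant keys. It assumes the per-user config lives at ~/.docker/config.json, as in the dump above; `summarize_docker_config` and `load_docker_config` are hypothetical helper names, not part of Docker or TAO.

```python
import json
from pathlib import Path


def summarize_docker_config(cfg: dict) -> dict:
    """Pick out the config.json keys that matter for this problem."""
    return {
        # With no currentContext set, the CLI uses the context named "default".
        "currentContext": cfg.get("currentContext", "default"),
        # Both spellings are reported so a silent rename is easy to spot.
        "credStore": cfg.get("credStore"),
        "credsStore": cfg.get("credsStore"),
    }


def load_docker_config(path: Path = Path.home() / ".docker" / "config.json") -> dict:
    """Read the per-user Docker CLI config; return {} if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text())


if __name__ == "__main__":
    print(summarize_docker_config(load_docker_config()))
```

Running it before and after a restart would show which of the two credential-store keys is present and which context the non-sudo CLI is set to use.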