Please provide the following information when requesting support.
• Hardware (NVIDIA A10)
• Network Type (DetectNet_v2 detector with ResNet18)
• TLT Version (nvcr.io/nvidia/tlt-streamanalytics:v3.0-py3)
• Training spec file (spec.txt, 6.4 KB)
Input images are in JPG format with different resolutions (1080x720, 1920x1080, etc.).
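Since the images come in several resolutions, one preprocessing step that may be worth checking is whether images and labels were resized to a single training resolution beforehand. The sketch below only illustrates the label side of such a step, assuming KITTI-format label lines (bounding box in fields 4 to 7 as left, top, right, bottom). The function name `scale_kitti_line` and the sizes in the usage line are hypothetical and are not taken from the attached spec file.

```python
def scale_kitti_line(line, src_size, dst_size):
    """Scale the 2D bounding box of one KITTI label line.

    src_size and dst_size are (width, height) tuples. Only fields 4..7
    (left, top, right, bottom) are changed; all other fields are kept.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor

    fields = line.split()
    left, top, right, bottom = (float(v) for v in fields[4:8])
    fields[4:8] = [
        f"{left * sx:.2f}",
        f"{top * sy:.2f}",
        f"{right * sx:.2f}",
        f"{bottom * sy:.2f}",
    ]
    return " ".join(fields)


# Hypothetical usage: a 1920x1080 image resized to 960x540.
label = "two_wheeler 0.00 0 0.00 100.00 200.00 300.00 400.00 0 0 0 0 0 0 0"
scaled = scale_kitti_line(label, (1920, 1080), (960, 540))
print(scaled)
```

The image itself would have to be resized to the same target size with an image library; that part is left out here so the sketch stays dependency-free.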
2025-01-09 13:25:11,940 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1110 / 1201, 1.43s/step
2025-01-09 13:25:26,209 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1120 / 1201, 1.43s/step
2025-01-09 13:25:41,289 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1130 / 1201, 1.51s/step
2025-01-09 13:25:55,287 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1140 / 1201, 1.40s/step
2025-01-09 13:26:09,981 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1150 / 1201, 1.47s/step
2025-01-09 13:26:24,005 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1160 / 1201, 1.40s/step
2025-01-09 13:26:38,006 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1170 / 1201, 1.40s/step
2025-01-09 13:26:52,730 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1180 / 1201, 1.47s/step
2025-01-09 13:27:06,819 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1190 / 1201, 1.41s/step
2025-01-09 13:27:21,501 [INFO] iva.detectnet_v2.evaluation.evaluation: step 1200 / 1201, 1.47s/step
Matching predictions to ground truth, class 1/4.: 100%|████████████████████████████████████| 4/4 [00:00<00:00, 28679.00it/s]
Epoch 1/20
=========================
Validation cost: 0.005425
Mean average_precision (in %): 0.0000
class name       average precision (in %)
-------------  --------------------------
four_wheeler                            0
heavy                                   0
three_wheeler                           0
two_wheeler                             0
Median Inference Time: 0.008612
INFO:tensorflow:epoch = 1.0, learning_rate = 5.000004e-06, loss = 0.0053756707, step = 7383 (1733.481 sec)
2025-01-09 13:27:23,069 [INFO] tensorflow: epoch = 1.0, learning_rate = 5.000004e-06, loss = 0.0053756707, step = 7383 (1733.481 sec)
2025-01-09 13:27:23,070 [INFO] iva.detectnet_v2.tfhooks.task_progress_monitor_hook: Epoch 1/20: loss: 0.00538 learning rate: 0.00001 Time taken: 0:40:05.801975 ETA: 12:41:50.237516
2025-01-09 13:27:24,461 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 0.058
2025-01-09 13:27:26,667 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 45.345
INFO:tensorflow:epoch = 1.0078558851415413, learning_rate = 5.0912686e-06, loss = 0.005273092, step = 7441 (5.113 sec)
2025-01-09 13:27:28,182 [INFO] tensorflow: epoch = 1.0078558851415413, learning_rate = 5.0912686e-06, loss = 0.005273092, step = 7441 (5.113 sec)
Training log: training.log (122.7 KB)
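To see at a glance whether the mAP ever moves away from zero, the evaluation blocks in a log like this could be summarized per epoch. This is a minimal sketch that assumes the log keeps the exact `Epoch N/M` and `Mean average_precision (in %): X` line shapes shown in the excerpt above; `map_per_epoch` is a hypothetical helper, not part of TLT.

```python
import re

# Line shapes taken from the log excerpt; anything else is ignored.
EPOCH_RE = re.compile(r"^Epoch (\d+)/(\d+)\s*$")
MAP_RE = re.compile(r"^Mean average_precision \(in %\): ([0-9.]+)\s*$")


def map_per_epoch(log_text):
    """Return a list of (epoch, mean_average_precision) tuples."""
    results = []
    epoch = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = EPOCH_RE.match(line)
        if m:
            epoch = int(m.group(1))  # remember the epoch header
            continue
        m = MAP_RE.match(line)
        if m and epoch is not None:
            results.append((epoch, float(m.group(1))))
            epoch = None  # wait for the next epoch header
    return results


sample = """Epoch 1/20
=========================
Validation cost: 0.005425
Mean average_precision (in %): 0.0000
"""
print(map_per_epoch(sample))
```

Running it over the full training.log would show whether the mAP stays at 0 for every evaluated epoch or only for the first one.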
The evaluation after epoch 1 reports a mean average precision of 0 for all four classes. Could someone please take a look at the files and let me know if I have done something wrong? I looked at other posts similar to this one and followed their solutions, but it did not help, so I am not sure whether I am missing something obvious.