Running tao model detectnet_v2 train fails

I am sorry to ask again, but I am stuck with TAO detectnet_v2.

I have a lot of images with labels in YOLO format, for example:

3 0.498760 0.490741 0.859623 0.768519
0 0.728671 0.213955 0.252976 0.082672
1 0.478423 0.380622 0.730655 0.085317
2 0.544643 0.530093 0.416667 0.184524

However, I was told that TAO Toolkit cannot read YOLO format labels, so I converted the YOLO labels to KITTI format:

card 0 0 0 99.28583999999995 204.44448 1337.14296 1680.0009599999998 0 0 0 0 0 0 0 
barcode 0 0 0 867.1435199999999 331.42848000000004 1231.42896 490.15871999999996 0 0 0 0 0 0 0
name 0 0 0 162.85751999999994 648.8899200000001 1215.00072 812.69856 0 0 0 0 0 0 0
price 0 0 0 484.28567999999996 840.63552 1084.2861599999999 1194.9216 0 0 0 0 0 0 0
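
A minimal sketch of this kind of YOLO-to-KITTI conversion follows. It is an illustration, not the exact script used here. It assumes a 1440x1920 image (the size that reproduces the sample values above) and a class order of barcode, name, price, card inferred from those samples. The positions of card7p and unknown in the list are a guess, and the function name is hypothetical.

```python
# Assumed class order: indices 0-3 are inferred from the sample labels above;
# the positions of "card7p" and "unknown" are a guess.
CLASS_NAMES = ["barcode", "name", "price", "card", "card7p", "unknown"]


def yolo_to_kitti(line, img_w, img_h, class_names=CLASS_NAMES):
    """Convert one YOLO label line into one 15-field KITTI label line."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)

    # YOLO stores a normalized box center and size; KITTI stores pixel corners.
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h

    name = class_names[int(cls)]
    # truncated, occluded, alpha and the 3D fields are unused here, so they stay 0.
    return (f"{name} 0 0 0 {xmin:.2f} {ymin:.2f} {xmax:.2f} {ymax:.2f} "
            "0 0 0 0 0 0 0")


print(yolo_to_kitti("3 0.498760 0.490741 0.859623 0.768519", 1440, 1920))
# card 0 0 0 99.29 204.44 1337.14 1680.00 0 0 0 0 0 0 0
```

The printed line matches the first KITTI label above after rounding to two decimals.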

Then I used TAO Dataset Convert Tool to convert KITTI data to TFRecords files.

# Spec file for dataset_convert
!cat $LOCAL_SPECS_DIR/spec_tfrecords_kitti.txt
kitti_config {
  root_directory_path: "/workspace/tao-experiments/data/"
  image_dir_name: "training/images_aikata"
  label_dir_name: "training/labels_aikata"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 20
  num_shards: 10
}
image_directory_path: "/workspace/tao-experiments/data/"
target_class_mapping {
  key: "barcode"
  value: "barcode"
}
target_class_mapping {
  key: "name"
  value: "name"
}
target_class_mapping {
  key: "price"
  value: "price"
}
target_class_mapping {
  key: "card"
  value: "card"
}
target_class_mapping {
  key: "card7p"
  value: "card7p"
}
target_class_mapping {
  key: "unknown"
  value: "unknown"
}

# Spec file for training
!cat $LOCAL_SPECS_DIR/spec_train_kitti.txt
random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords_aikata/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "barcode"
    value: "barcode"
  }
  target_class_mapping {
    key: "name"
    value: "name"
  }
  target_class_mapping {
    key: "price"
    value: "price"
  }
  target_class_mapping {
    key: "card"
    value: "card"
  }
  target_class_mapping {
    key: "card7p"
    value: "card7p"
  }
  target_class_mapping {
    key: "unknown"
    value: "unknown"
  }
  validation_fold: 0
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2_test/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
augmentation_config {
  preprocessing {
    output_image_width: 1440
    output_image_height: 1920
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "barcode"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "name"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.15000000596
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "price"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "card"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "card7p"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
  target_class_config {
    key: "unknown"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00749999983236
        dbscan_eps: 0.230000004172
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "barcode"
    value: 0.699999988079
  }
  minimum_detection_ground_truth_overlap {
    key: "name"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "price"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "card"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "card7p"
    value: 0.5
  }
  minimum_detection_ground_truth_overlap {
    key: "unknown"
    value: 0.5
  }
  evaluation_box_config {
    key: "barcode"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "name"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "price"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "card"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "card7p"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  evaluation_box_config {
    key: "unknown"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "barcode"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "name"
    class_weight: 8.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 1.0
    }
  }
  target_classes {
    name: "price"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "card"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "card7p"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  target_classes {
    name: "unknown"
    class_weight: 4.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: false
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-07
      max_learning_rate: 5e-05
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  visualizer{
    enabled: true
    num_images: 3
    scalar_logging_frequency: 50
    infrequent_logging_frequency: 5
    target_class_config {
      key: "barcode"
      value: {
        coverage_threshold: 0.005
      }
    }
    target_class_config {
      key: "name"
      value: {
        coverage_threshold: 0.005
      }
    }
    target_class_config {
      key: "price"
      value: {
        coverage_threshold: 0.005
      }
    }
    target_class_config {
      key: "card"
      value: {
        coverage_threshold: 0.005
      }
    }
    target_class_config {
      key: "card7p"
      value: {
        coverage_threshold: 0.005
      }
    }
    target_class_config {
      key: "unknown"
      value: {
        coverage_threshold: 0.005
      }
    }
    clearml_config{
      project: "TAO Toolkit ClearML Demo"
      task: "detectnet_v2_resnet18_clearml"
      tags: "detectnet_v2"
      tags: "training"
      tags: "resnet18"
      tags: "unpruned"
    }
    wandb_config{
      project: "TAO Toolkit Wandb Demo"
      name: "detectnet_v2_resnet18_wandb"
      tags: "detectnet_v2"
      tags: "training"
      tags: "resnet18"
      tags: "unpruned"
    }
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "barcode"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "name"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "price"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "card"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "card7p"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  target_class_config {
    key: "unknown"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 1.0
      cov_radius_y: 1.0
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}
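
As far as I understand the DetectNet_v2 documentation, output_image_width and output_image_height must both be multiples of 16. A small sanity check for the values in the spec above could look like this sketch (the function name is hypothetical):

```python
def check_input_size(width, height, multiple=16):
    """Return a list of problems found with the network input size."""
    problems = []
    for name, value in (("output_image_width", width),
                        ("output_image_height", height)):
        if value % multiple != 0:
            problems.append(f"{name}={value} is not a multiple of {multiple}")
    return problems


# Values taken from augmentation_config.preprocessing in the training spec.
print(check_input_size(1440, 1920))
# []
```

An empty list means the 1440x1920 setting passes this particular check.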

# TAO Dataset Converter
!tao model detectnet_v2 dataset_convert -d /workspace/tao-experiments/detectnet_v2/specs/spec_tfrecords_kitti.txt \
                                        -o /workspace/tao-experiments/data/tfrecords_aikata
...                              
2024-06-03 09:04:18,874 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 105: Class map. 
Label in GT: Label in tfrecords file 
b'card': b'card'
b'name': b'name'
b'price': b'price'
b'barcode': b'barcode'
b'card7p': b'card7p'
b'unknown': b'unknown'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2024-06-03 09:04:18,874 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 114: Tfrecords generation complete.
Execution status: PASS
2024-06-03 18:04:23,922 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.
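
The converter's note says the class map in the training spec should use the labels stored in the TFRecords. A minimal sketch of a check for that follows. It assumes both spec files are available as text and uses a simple regular expression rather than a real protobuf parser. The helper name and the shortened spec strings are only for illustration.

```python
import re


def class_map(spec_text):
    """Return {key: value} for every target_class_mapping block in a spec."""
    pattern = (r'target_class_mapping\s*\{\s*key:\s*"([^"]+)"'
               r'\s*value:\s*"([^"]+)"')
    return dict(re.findall(pattern, spec_text))


# Shortened stand-ins for spec_tfrecords_kitti.txt and spec_train_kitti.txt.
convert_spec = '''
target_class_mapping {
  key: "barcode"
  value: "barcode"
}
target_class_mapping {
  key: "card"
  value: "card"
}
'''
train_spec = '''
dataset_config {
  target_class_mapping {
    key: "barcode"
    value: "barcode"
  }
  target_class_mapping {
    key: "card"
    value: "card"
  }
}
'''

# Labels written to the TFRecords are the *values* of the converter's mapping;
# the training spec should list each of them as a *key*.
tfrecord_labels = set(class_map(convert_spec).values())
train_keys = set(class_map(train_spec).keys())
missing = tfrecord_labels - train_keys
print("labels missing from training spec:", sorted(missing))
# labels missing from training spec: []
```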

So I thought the dataset was ready for training with TAO and ran the following command, but it FAILED.

!tao model detectnet_v2 train -e $SPECS_DIR/spec_train_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k aikata \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS
2024-06-03 18:10:45,022 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2024-06-03 18:10:45,086 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2024-06-03 18:10:45,321 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
2024-06-03 09:10:46.078291: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2024-06-03 09:10:46,128 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2024-06-03 09:10:47,178 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2024-06-03 09:10:47,205 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2024-06-03 09:10:47,209 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2024-06-03 09:10:48,220 [TAO Toolkit] [WARNING] matplotlib 500: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-9thmcfat because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2024-06-03 09:10:48,390 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
...
2024-06-03 09:10:53,326 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 133: Loading weights from pretrained model file. /workspace/tao-experiments/detectnet_v2_test/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5
2024-06-03 09:10:53,326 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer input_1 weights set from pre-trained model.
2024-06-03 09:10:53,446 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer conv1 weights set from pre-trained model.
2024-06-03 09:10:53,560 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer bn_conv1 weights set from pre-trained model.
2024-06-03 09:10:53,560 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer activation_1 weights set from pre-trained model.
2024-06-03 09:10:53,675 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_conv_1 weights set from pre-trained model.
2024-06-03 09:10:53,815 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_bn_1 weights set from pre-trained model.
2024-06-03 09:10:53,962 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_conv_2 weights set from pre-trained model.
2024-06-03 09:10:54,099 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_conv_shortcut weights set from pre-trained model.
2024-06-03 09:10:54,225 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_bn_2 weights set from pre-trained model.
2024-06-03 09:10:54,342 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_bn_shortcut weights set from pre-trained model.
2024-06-03 09:10:54,342 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_1 weights set from pre-trained model.
2024-06-03 09:10:54,455 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_conv_1 weights set from pre-trained model.
2024-06-03 09:10:54,632 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_bn_1 weights set from pre-trained model.
2024-06-03 09:10:54,770 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_conv_2 weights set from pre-trained model.
2024-06-03 09:10:54,885 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_bn_2 weights set from pre-trained model.
2024-06-03 09:10:54,885 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_2 weights set from pre-trained model.
2024-06-03 09:10:54,996 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_conv_1 weights set from pre-trained model.
2024-06-03 09:10:55,111 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_bn_1 weights set from pre-trained model.
2024-06-03 09:10:55,225 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_conv_2 weights set from pre-trained model.
2024-06-03 09:10:55,337 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_conv_shortcut weights set from pre-trained model.
2024-06-03 09:10:55,454 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_bn_2 weights set from pre-trained model.
2024-06-03 09:10:55,572 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_bn_shortcut weights set from pre-trained model.
2024-06-03 09:10:55,572 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_3 weights set from pre-trained model.
2024-06-03 09:10:55,685 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_conv_1 weights set from pre-trained model.
2024-06-03 09:10:55,808 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_bn_1 weights set from pre-trained model.
2024-06-03 09:10:55,931 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_conv_2 weights set from pre-trained model.
2024-06-03 09:10:56,051 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_bn_2 weights set from pre-trained model.
2024-06-03 09:10:56,051 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_4 weights set from pre-trained model.
2024-06-03 09:10:56,168 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_conv_1 weights set from pre-trained model.
2024-06-03 09:10:56,360 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_bn_1 weights set from pre-trained model.
2024-06-03 09:10:56,570 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_conv_2 weights set from pre-trained model.
2024-06-03 09:10:56,770 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_conv_shortcut weights set from pre-trained model.
2024-06-03 09:10:56,969 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_bn_2 weights set from pre-trained model.
2024-06-03 09:10:57,139 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_bn_shortcut weights set from pre-trained model.
2024-06-03 09:10:57,139 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_5 weights set from pre-trained model.
2024-06-03 09:10:57,278 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_conv_1 weights set from pre-trained model.
2024-06-03 09:10:57,399 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_bn_1 weights set from pre-trained model.
2024-06-03 09:10:57,534 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_conv_2 weights set from pre-trained model.
2024-06-03 09:10:57,749 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_bn_2 weights set from pre-trained model.
2024-06-03 09:10:57,749 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_6 weights set from pre-trained model.
2024-06-03 09:10:57,887 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_conv_1 weights set from pre-trained model.
2024-06-03 09:10:58,022 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_bn_1 weights set from pre-trained model.
2024-06-03 09:10:58,152 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_conv_2 weights set from pre-trained model.
2024-06-03 09:10:58,271 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_conv_shortcut weights set from pre-trained model.
2024-06-03 09:10:58,402 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_bn_2 weights set from pre-trained model.
2024-06-03 09:10:58,537 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_bn_shortcut weights set from pre-trained model.
2024-06-03 09:10:58,537 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_7 weights set from pre-trained model.
2024-06-03 09:10:58,689 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_conv_1 weights set from pre-trained model.
2024-06-03 09:10:58,903 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_bn_1 weights set from pre-trained model.
2024-06-03 09:10:59,074 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_conv_2 weights set from pre-trained model.
2024-06-03 09:10:59,198 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_bn_2 weights set from pre-trained model.
2024-06-03 09:10:59,198 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_8 weights set from pre-trained model.
2024-06-03 09:10:59,274 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.objectives.bbox_objective 78: Default L1 loss function will be used.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 3, 1920, 1440 0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 960, 720) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 960, 720) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 960, 720) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 480, 360) 36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 480, 360) 256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 480, 360) 0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 480, 360) 36928       block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 480, 360) 4160        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 480, 360) 256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 480, 360) 256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 480, 360) 0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 480, 360) 0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 480, 360) 36928       block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 480, 360) 256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 480, 360) 0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 480, 360) 36928       block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 480, 360) 256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 480, 360) 0           block_1b_bn_2[0][0]              
                                                                 block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 480, 360) 0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 240, 180 73856       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 240, 180 512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 240, 180 0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 240, 180 147584      block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 240, 180 8320        block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 240, 180 512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 240, 180 512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 240, 180 0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 240, 180 0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 240, 180 147584      block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 240, 180 512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 240, 180 0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 240, 180 147584      block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 240, 180 512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 240, 180 0           block_2b_bn_2[0][0]              
                                                                 block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 240, 180 0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 120, 90) 295168      block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 120, 90) 1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 120, 90) 0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 120, 90) 590080      block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 120, 90) 33024       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 120, 90) 1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 120, 90) 1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 120, 90) 0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 120, 90) 0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 120, 90) 590080      block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 120, 90) 1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 120, 90) 0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 120, 90) 590080      block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 120, 90) 1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 120, 90) 0           block_3b_bn_2[0][0]              
                                                                 block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 120, 90) 0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 120, 90) 1180160     block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 120, 90) 2048        block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 120, 90) 0           block_4a_bn_1[0][0]              
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 120, 90) 2359808     block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 120, 90) 131584      block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 120, 90) 2048        block_4a_conv_2[0][0]            
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 120, 90) 2048        block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 120, 90) 0           block_4a_bn_2[0][0]              
                                                                 block_4a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 120, 90) 0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 120, 90) 2359808     block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 120, 90) 2048        block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 120, 90) 0           block_4b_bn_1[0][0]              
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 120, 90) 2359808     block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 120, 90) 2048        block_4b_conv_2[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 120, 90) 0           block_4b_bn_2[0][0]              
                                                                 block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 120, 90) 0           add_8[0][0]                      
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 24, 120, 90)  12312       block_4b_relu[0][0]              
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 6, 120, 90)   3078        block_4b_relu[0][0]              
==================================================================================================
Total params: 11,210,718
Trainable params: 11,200,990
Non-trainable params: 9,728
__________________________________________________________________________________________________
2024-06-03 09:10:59,297 [TAO Toolkit] [INFO] root 2102: DetectNet V2 model built.
2024-06-03 09:10:59,298 [TAO Toolkit] [INFO] root 2102: Building rasterizer.
2024-06-03 09:10:59,298 [TAO Toolkit] [INFO] root 2102: Rasterizers built.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/training/training_proto_utilities.py:102: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.
...
INFO:tensorflow:Graph was finalized.
2024-06-03 09:11:10,956 [TAO Toolkit] [INFO] tensorflow 240: Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpjaunkjin/model.ckpt-120
2024-06-03 09:11:10,958 [TAO Toolkit] [INFO] tensorflow 1284: Restoring parameters from /tmp/tmpjaunkjin/model.ckpt-120
2024-06-03 09:11:11,739 [TAO Toolkit] [INFO] root 2102: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

2 root error(s) found.
  (0) Not found: Key cost_sums/barcode-bbox not found in checkpoint
	 [[node save/RestoreV2 (defined at /tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Not found: Key cost_sums/barcode-bbox not found in checkpoint
	 [[node save/RestoreV2 (defined at /tensorflow_core/python/framework/ops.py:1748) ]]
	 [[save/RestoreV2/_793]]
0 successful operations.
0 derived errors ignored.
...
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/training/saver.py", line 1300, in restore
    names_to_keys = object_graph_key_mapping(save_path)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/training/saver.py", line 1618, in object_graph_key_mapping
    object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/pywrap_tensorflow_internal.py", line 915, in get_tensor
    return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str))
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint
...
2024-06-03 09:11:12,086 [TAO Toolkit] [ERROR] tensorflow 70: ==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'IsVariableInitialized_302:0' shape=() dtype=bool>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/training/utilities.py", line 154, in get_singular_monitored_session
    return tf.train.SingularMonitoredSession(hooks=hooks,  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1100, in __init__
    super(SingularMonitoredSession, self).__init__(  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/training/monitored_session.py", line 727, in __init__
    self._sess = self._coordinated_creator.create_session()  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/core/hooks/hooks.py", line 286, in begin
    self._variables_initialized.append(  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/util/tf_should_use.py", line 198, in wrapped
    return _add_should_use_warning(fn(*args, **kwargs))
==================================
Execution status: FAIL
2024-06-03 18:11:16,861 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

The execution status is FAIL, but I cannot find any clue in the stack trace.
What might be the reason for the failure?
Is it caused by my setting files or by my KITTI files?
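
In case it helps narrow things down, here is a minimal sketch (plain Python, not part of TAO Toolkit) that pulls the missing checkpoint keys out of a saved training log. The function name `missing_checkpoint_keys` and the `sample_log` string are only for illustration. It assumes the log contains lines of the form `Key <name> not found in checkpoint`, as in the output above.

```python
import re

# Matches lines such as:
#   (0) Not found: Key cost_sums/barcode-bbox not found in checkpoint
_MISSING_KEY = re.compile(r"Key (\S+) not found in checkpoint")


def missing_checkpoint_keys(log_text):
    """Return the distinct key names reported missing, in first-seen order."""
    seen = []
    for match in _MISSING_KEY.finditer(log_text):
        key = match.group(1)
        if key not in seen:
            seen.append(key)
    return seen


# Hypothetical excerpt, copied from the error lines in the log above.
sample_log = """\
2 root error(s) found.
  (0) Not found: Key cost_sums/barcode-bbox not found in checkpoint
  (1) Not found: Key cost_sums/barcode-bbox not found in checkpoint
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint
"""

print(missing_checkpoint_keys(sample_log))
# → ['cost_sums/barcode-bbox', '_CHECKPOINTABLE_OBJECT_GRAPH']
```

On my log this reports `cost_sums/barcode-bbox`, which is the key named in the "Restoring from checkpoint failed" message for `/tmp/tmpjaunkjin/model.ckpt-120`, and `_CHECKPOINTABLE_OBJECT_GRAPH`, which comes from the follow-up exception raised in `saver.py`.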
