Channel: TAO Toolkit - NVIDIA Developer Forums

Makenet Training Error


Please provide the following information when requesting support.

• Hardware (NVIDIA RTX 3080Ti)
• Network Type (makenet)
• TLT Version (tao_tf1)
• Training spec file
model_config {

# Model Architecture can be chosen from:
# ['resnet', 'vgg', 'googlenet', 'alexnet']
arch: "resnet"
# for resnet --> n_layers can be [10, 18, 50]
# for vgg --> n_layers can be [16, 19]

n_layers: 101
use_batch_norm: True
use_bias: False
all_projections: False
use_pooling: True
retain_head: True
resize_interpolation_method: BICUBIC

# if you want to use the pretrained model,
# image size should be "3,224,224"
# otherwise, it can be "3, X, Y", where X,Y >= 16
input_image_size: "3,224,224"
}
train_config {
train_dataset_path: "/workspace/tao-tf1/nvidia_tao_tf1/cv/dataset/imagent10/train"
val_dataset_path: "/workspace/tao-tf1/nvidia_tao_tf1/cv/dataset/imagent10/val"
# Only ['sgd', 'adam'] are supported for optimizer

optimizer {
sgd {
lr: 0.01
decay: 0.0
momentum: 0.9
nesterov: False
}
}
batch_size_per_gpu: 50
n_epochs: 150

# Number of CPU cores for loading data

n_workers: 16

# regularizer

reg_config {
# regularizer type can be "L1", "L2" or "None".
type: "L2"
# if the type is not "None",
# scope can be either "Conv2D" or "Dense" or both.
scope: "Conv2D,Dense"
# 0 < weight decay < 1
weight_decay: 0.000015
}

# learning_rate

lr_config {
cosine {
learning_rate: 0.04
soft_start: 0.0
}
}
enable_random_crop: True
enable_center_crop: True
enable_color_augmentation: True
mixup_alpha: 0.2
label_smoothing: 0.1
}
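For reference, a quick way to list which fields `train_config` actually sets is sketched below. This is a plain-text scan, not TAO's own protobuf loader, and `fields_in_block` is a hypothetical helper name. Run against the spec above, it would not report `preprocess_mode`, which is the field named in the error in the log.

```python
import re


def fields_in_block(spec_text, block):
    """Return the names set directly inside `block { ... }`.

    Illustrative plain-text scan only: it ignores everything after a '#'
    and assumes one field or brace per line, as in the spec above.
    """
    names = set()
    depth = 0
    inside = False
    for raw in spec_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if not inside:
            # Look for the opening line of the requested top-level block.
            if re.match(r"%s\s*\{$" % re.escape(block), line):
                inside = True
                depth = 1
            continue
        if line.endswith("{"):
            if depth == 1:
                names.add(line[:-1].strip())  # nested message, e.g. optimizer
            depth += 1
        elif line == "}":
            depth -= 1
            if depth == 0:
                inside = False
        elif depth == 1 and ":" in line:
            names.add(line.split(":", 1)[0].strip())  # scalar field
    return names
```

Usage would be along the lines of `"preprocess_mode" in fields_in_block(open(spec_path).read(), "train_config")`, where `spec_path` points at the experiment spec file.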

  • Detailed Log Details:

root@INTVMLT3947:/workspace/tao-tf1# python nvidia_tao_tf1/cv/makenet/scripts/train.py -e /workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/experiment_specs/resnet101_imagenet2012.txt -r /workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/results/exp1
Using TensorFlow backend.
2024-07-21 08:35:32.662740: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2024-07-21 08:35:33,310 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2024-07-21 08:35:33,330 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2024-07-21 08:35:33,331 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
/workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/utils/helper.py:150: NumbaDeprecationWarning: The ‘nopython’ keyword argument was not supplied to the ‘numba.jit’ decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See Deprecation Notices — Numba 0+untagged.871.g53e976f.dirty documentation for details.
def random_hue(img, max_delta=10.0):
/workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/utils/helper.py:173: NumbaDeprecationWarning: The ‘nopython’ keyword argument was not supplied to the ‘numba.jit’ decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See Deprecation Notices — Numba 0+untagged.871.g53e976f.dirty documentation for details.
def random_saturation(img, max_shift):
/workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/utils/helper.py:183: NumbaDeprecationWarning: The ‘nopython’ keyword argument was not supplied to the ‘numba.jit’ decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See Deprecation Notices — Numba 0+untagged.871.g53e976f.dirty documentation for details.
def random_contrast(img, center, max_contrast_scale):
/workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/utils/helper.py:192: NumbaDeprecationWarning: The ‘nopython’ keyword argument was not supplied to the ‘numba.jit’ decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See Deprecation Notices — Numba 0+untagged.871.g53e976f.dirty documentation for details.
def random_shift(x_img, shift_stddev):
2024-07-21 08:35:33.887885: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2024-07-21 08:35:36.525165: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcuda.so.1
[1721550936.528377] [INTVMLT3947:542 :0] sock.c:502 UCX WARN unable to read somaxconn value from /proc/sys/net/core/somaxconn file
2024-07-21 08:35:36,545 [TAO Toolkit] [INFO] main 388: Loading experiment spec at /workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/experiment_specs/resnet101_imagenet2012.txt.
Traceback (most recent call last):
  File "nvidia_tao_tf1/cv/makenet/scripts/train.py", line 668, in <module>
    main()
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/common/utils.py", line 717, in return_func
    raise e
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/common/utils.py", line 705, in return_func
    return func(*args, **kwargs)
  File "nvidia_tao_tf1/cv/makenet/scripts/train.py", line 664, in main
    raise e
  File "nvidia_tao_tf1/cv/makenet/scripts/train.py", line 645, in main
    run_experiment(config_path=args.experiment_spec_file,
  File "nvidia_tao_tf1/cv/makenet/scripts/train.py", line 391, in run_experiment
    es = load_experiment_spec(config_path, merge_from_default=False)
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/spec_handling/spec_loader.py", line 127, in load_experiment_spec
    validate_spec(experiment_spec, validation_schema=validation_schema)
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/makenet/spec_handling/spec_loader.py", line 67, in validate_spec
    spec_validator.validate(spec, schema["required_msg"])
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/common/spec_validator.py", line 211, in validate
    spec_validator(spec, required_msg)
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/common/spec_validator.py", line 189, in spec_validator
    spec_validator(spec=value, required_msg=required_msg_next)
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/common/spec_validator.py", line 207, in spec_validator
    check_value(desc.name, value, value_checker)
  File "/workspace/tao-tf1/nvidia_tao_tf1/cv/common/spec_validator.py", line 126, in check_value
    assert comp_op(value, limit), error_info
AssertionError: Experiment Spec Setting Error: preprocess_mode should in ['tf', 'caffe', 'torch']. Wrong value:
root@INTVMLT3947:/workspace/tao-tf1#
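
Reading the last frames of the traceback, the failing check looks like a membership test on `preprocess_mode`. The empty text after "Wrong value:" suggests the field reached the validator as an empty string, which would fit a spec that never sets it and a loader called with `merge_from_default=False`. Below is a minimal re-creation of that behaviour, assuming the check is a plain `in` test. The real `spec_validator.py` may differ, and `check_value` and `ALLOWED_MODES` here are stand-ins written for illustration.

```python
# Allowed values copied from the assertion message in the log.
ALLOWED_MODES = ["tf", "caffe", "torch"]


def check_value(name, value, allowed):
    """Stand-in for the failing check: `value` must be one of `allowed`.

    Assumption: the real validator compares with a membership test; this
    is a sketch, not the code in nvidia_tao_tf1/cv/common/spec_validator.py.
    """
    error_info = (
        "Experiment Spec Setting Error: %s should in %s. Wrong value: %s"
        % (name, allowed, value)
    )
    assert value in allowed, error_info


# A field left unset in a proto message reads back as "" for strings,
# so the membership test fails the same way as in the log.
try:
    check_value("preprocess_mode", "", ALLOWED_MODES)
except AssertionError as err:
    print(err)

# Any of the three listed values passes the same check.
check_value("preprocess_mode", "caffe", ALLOWED_MODES)
```

If that reading is right, adding a line such as `preprocess_mode: "caffe"` inside `train_config` may clear the assertion. Which of the three modes matches the pretrained weights is a separate question.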

2 posts - 2 participants
