Hello,
I’m trying to evaluate the Retail Object Recognition model from NVIDIA TAO to see if it fits my needs. My goal is to run inference using the pretrained model, but I’ve encountered issues along the way. My hardware is a single NVIDIA A100.
Steps Taken:
I followed the official tutorial from NVIDIA: Retail Object Recognition Notebook
However, the notebook is primarily focused on transfer learning, and I couldn’t find clear instructions on how to directly test the pretrained model.
I downloaded the model using:
!ngc registry model download-version nvidia/tao/retail_object_recognition:trainable_head_fan_base_v2.0 --dest $HOST_MODEL_DIR/
I modified the infer.yaml file as follows:
results_dir: "???"
model:
  backbone: fan_base
  input_width: 224
  input_height: 224
  feat_dim: 1024
dataset:
  workers: 8
  val_dataset:
    reference: "???"
    query: ""
inference:
  inference_input_type: classification_folder
  input_path: "???"
  batch_size: 16
I attempted to run inference with:
# run inference on known classes
! tao model ml_recog inference \
-e $SPECS_DIR/infer.yaml \
results_dir=$RESULTS_DIR \
inference.checkpoint=$MODEL_DIR/retail_object_recognition_vtrainable_head_fan_base_v2.0/retail_object_recognition_head_fan_base_v2.0.pth \
dataset.val_dataset.reference=$DATA_DIR/$DATA_FOLDER/known_classes/reference \
inference.input_path=$DATA_DIR/$DATA_FOLDER/known_classes/test
Encountered Errors:
I received the following error:
KeyError: 'pytorch-lightning_version'
It seems that the checkpoint file lacks the required pytorch-lightning_version key. I attempted to manually modify the checkpoint by loading it in PyTorch and adding:
new_ckpt['pytorch-lightning_version'] = '0.0.0'
new_ckpt['global_step'] = None
new_ckpt['epoch'] = None
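A fuller version of this patch might look like the sketch below. It is only an illustration, not an official TAO procedure: `add_lightning_metadata` and `patch_checkpoint_file` are hypothetical helper names, the paths passed in are placeholders, and it assumes the .pth file loads as a plain Python dict via `torch.load`.

```python
def add_lightning_metadata(ckpt):
    """Return a copy of a checkpoint dict with the Lightning bookkeeping keys added.

    Hypothetical helper: it adds only the three keys shown above and does not
    overwrite any of them if they are already present.
    """
    new_ckpt = dict(ckpt)  # shallow copy so the loaded dict is not mutated
    new_ckpt.setdefault("pytorch-lightning_version", "0.0.0")
    new_ckpt.setdefault("global_step", None)
    new_ckpt.setdefault("epoch", None)
    return new_ckpt


def patch_checkpoint_file(src_path, dst_path):
    """Load a .pth file, add the metadata keys, and save it under a new name."""
    import torch  # imported lazily so the helper above works without PyTorch

    ckpt = torch.load(src_path, map_location="cpu")
    torch.save(add_lightning_metadata(ckpt), dst_path)
```

Writing to a separate destination path keeps the downloaded checkpoint untouched, so the original file can still be used for comparison.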
However, this did not resolve the issue. The model’s state_dict is missing several keys, and adding them manually does not work. The error I receive is:
RuntimeError: Error(s) in loading state_dict for MLRecogModel:
Missing key(s) in state_dict: "model.embedder.classifier_feat.0.weight", "model.embedder.classifier_feat.0.bias", ... etc...
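One way to narrow this down is to compare the parameter names stored in the checkpoint with the names reported as missing. The sketch below is a diagnostic idea only: `get_state_dict`, `summarize_prefixes`, and `missing_keys` are hypothetical helpers, and the assumption that the weights sit under a `state_dict` entry (falling back to the top-level dict otherwise) may not match how this checkpoint is laid out.

```python
from collections import Counter


def get_state_dict(ckpt):
    """Return the weight mapping: ckpt["state_dict"] if present, else ckpt itself."""
    inner = ckpt.get("state_dict")
    return inner if isinstance(inner, dict) else ckpt


def summarize_prefixes(keys, depth=2):
    """Count parameter names by their first `depth` dot-separated components."""
    return Counter(".".join(k.split(".")[:depth]) for k in keys)


def missing_keys(expected, ckpt):
    """List expected parameter names that the checkpoint does not contain."""
    present = set(get_state_dict(ckpt))
    return sorted(k for k in expected if k not in present)


if __name__ == "__main__":
    # Toy data standing in for a real checkpoint loaded with torch.load.
    toy_ckpt = {"state_dict": {"model.embedder.backbone.w": 0}}
    expected = [
        "model.embedder.backbone.w",
        "model.embedder.classifier_feat.0.weight",
    ]
    print(missing_keys(expected, toy_ckpt))
    print(summarize_prefixes(get_state_dict(toy_ckpt)))
```

If the prefix summary shows no `classifier_feat` entries at all, the file does not contain those weights; if the entries are there under a different prefix, the mismatch is in the key names rather than in the contents.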
Is this model intended only for fine-tuning, or should it work for direct inference? I do not want to train yet; I only want to check whether the model fits my needs before fine-tuning it.
Thank you in advance.