
Low accuracy during efficientdet evaluation and inference


Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) - A10
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) - Efficientdet
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) - 5.5.0-tf2
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

We are using the efficientdet-d0 model to fine-tune on an internal dataset. The evaluation accuracy reported as part of the training loop is good, but the accuracy from a standalone 'efficientdet_tf2 evaluate' command on the same checkpoint is extremely low (for example, mAP50 during the training loop is 0.55, while the standalone evaluate gives 0.01).

Browsing the code and comparing the evaluation flow inside the training loop with the standalone evaluate flow, the evaluate.py script uses MODE="eval" for the statement below, which results in eval_graph.json being used to initialize the model:

model = helper.load_model(cfg.evaluate.checkpoint, cfg, MODE, is_qat=cfg.train.qat)

Changing that line as follows, to force the use of train_graph.json, fixes the issue:

model = helper.load_model(cfg.evaluate.checkpoint, cfg, "train", is_qat=cfg.train.qat)

While the above change was only an experiment to narrow the issue down, it does not look like the right fix, since hardcoding "train" causes the batch normalization layers to run with is_training=True, which is incorrect for inference.
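
For reference, here is a minimal TensorFlow 2 sketch (plain Keras, not the TAO helper code) illustrating why forcing train mode matters: a BatchNormalization layer normalizes with per-batch statistics when training=True but with its stored moving statistics when training=False, so the two modes can give very different outputs.

import numpy as np
import tensorflow as tf

# Standalone BatchNormalization layer with untrained moving statistics
bn = tf.keras.layers.BatchNormalization()
x = tf.constant(np.random.normal(loc=5.0, scale=2.0, size=(8, 4)), dtype=tf.float32)

# training=True normalizes with the batch's own mean/variance and updates the moving averages
y_train = bn(x, training=True)
# training=False normalizes with the stored moving mean/variance (what inference should use)
y_infer = bn(x, training=False)

print("training=True  output mean:", float(tf.reduce_mean(y_train)))
print("training=False output mean:", float(tf.reduce_mean(y_infer)))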

Any insights would be greatly appreciated.

Thanks,
Ajith


