Channel: TAO Toolkit - NVIDIA Developer Forums

Re_identification_net in TAO 5.3.0: checkpoint_interval configuration is not respected (no checkpoints / missed checkpoints)

When checkpoint_interval=10 is specified, no checkpoints are saved while training the NVIDIA TAO 5.3 ReIdentification model.
When checkpoint_interval is set to 1, training does generate checkpoints. However, using 1 is undesirable because it saves too many files.

When checkpoint_interval=5 is specified, only some of the expected checkpoints are saved. The following is an example:

In the TAO documentation, checkpoint_interval is defined only as the interval at which checkpoints are saved; no further explanation is provided.

Can you please explain how TAO determines which epochs are saved, and how we should choose the checkpoint_interval value in the training configuration to get predictable checkpoints?
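
To make the report easier to reproduce, here is a small Python sketch for comparing the checkpoints that actually exist on disk with the ones I would expect. It rests on two assumptions of mine, neither confirmed by the TAO documentation: that a checkpoint is expected whenever the 1-based epoch number is a multiple of checkpoint_interval, and that the epoch number appears in the checkpoint file name after the word "epoch". The function names and the file names in the usage note are hypothetical.

```python
import re


def expected_epochs(num_epochs, interval):
    """Epochs where a checkpoint is expected, ASSUMING one is saved
    whenever the 1-based epoch number is a multiple of `interval`.
    This rule is a guess, not documented TAO behavior."""
    return [e for e in range(1, num_epochs + 1) if e % interval == 0]


def saved_epochs(filenames, pattern=r"epoch[_=-]?(\d+)"):
    """Extract epoch numbers from checkpoint file names.
    The default pattern is an assumption about the naming scheme;
    names that do not match are ignored."""
    found = set()
    for name in filenames:
        match = re.search(pattern, name)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


def missing_epochs(num_epochs, interval, filenames):
    """Expected checkpoint epochs with no matching file."""
    saved = set(saved_epochs(filenames))
    return [e for e in expected_epochs(num_epochs, interval) if e not in saved]
```

For example, with 20 epochs and an interval of 5, calling missing_epochs(20, 5, ["model_epoch_005.pth", "model_epoch_015.pth"]) returns [10, 20]. In practice the file list would come from os.listdir() on the training results directory. If the reported gaps do not follow this pattern, the multiple-of-interval assumption is probably wrong for this version.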
