I am trying to fine-tune a Metric Learning Recognition model with TAO 5.5. During training, only the training loss is saved under train/lightning_logs/version_1, so I can only monitor the training loss via TensorBoard.
The validation metrics are written to the logs only. With a large number of epochs and validation_interval=1, the logs become too long to be useful for monitoring training.
Is there a way to add validation metrics (validation loss, validation accuracy) to the train/lightning_logs/version_1 events so that they can be monitored via TensorBoard too?