• Hardware: 128GB RAM, Intel i9 14000KF, RTX 4090
• Network Type: Mask_rcnn
• TAO Version: 5.3.0
• Training spec file: default
I am training on my own dataset; the average width×height of the images is about 170×200. I have 100k images with 5 classes, and image_size = 256x256.
As a test, I ran training exactly as in the original notebook with the original COCO dataset, and RAM usage during training was around 30GB. With my dataset it reaches 60GB. The specs differ only in image_size = 256x256 and num_classes = 6 (5 classes + background). I ran many tests over the whole past week, and I don't understand why, with a dataset of similar size to COCO and an even smaller image_size, I get 2x higher RAM usage.

This is blocking me, because I want to add some augmentation, and I can't use my full dataset: I have around 300k samples, but that exceeds my RAM. I used the same commands for tao mask_rcnn dataset_convert, tried both PNG and JPG, and stayed as close to the original setup as I could.
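For reference, a sketch of the only fields I changed relative to the default spec (field names assumed to follow the TAO Mask R-CNN experiment spec format; values are from my setup):

```
data_config {
    # changed from the COCO default; everything else left as in the notebook
    image_size: "(256, 256)"
    num_classes: 6   # 5 classes + background
}
```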
I saw the note in the documentation about OOM, but if the specs and TFRecords are nearly the same, my dataset should train with lower RAM usage than the original, not higher.
Best regards,
Darek