This is more of an R&D topic: I am looking for a way to calculate the maximum batch size a GPU can support, given the model size and the GPU memory size. The models of primary interest to me are CNNs. From my research, the memory usage for a single training step breaks down into: the input (image size * bytes per value, i.e. precision), the model weights, the forward pass (the size of the output of each layer), and the backward pass (the gradients computed for each weight, which is one value per weight and so roughly the same size as the model itself). Since the weights and their gradients are a fixed cost that does not grow with batch size, while the input and the intermediate activations scale per sample, my estimate is: max batch size ≈ (GPU memory − fixed cost) / per-sample cost. The per-layer activation sizes can be computed if we know each layer's output shape, the input size, and the precision.
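To make the arithmetic concrete, here is a minimal sketch of how I am computing this, assuming PyTorch (the toy model, the 224x224 input shape, the 16 GB card, and the Adam-style optimizer overhead are all placeholders; forward hooks are just one way to collect the per-layer activation sizes):

```python
import torch
import torch.nn as nn

# Toy CNN as a stand-in for the real model (replace with your own).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

bytes_per_value = 4           # float32; use 2 for fp16/bf16
input_shape = (3, 224, 224)   # per-sample input size (assumption)
gpu_mem_bytes = 16 * 1024**3  # e.g. a 16 GB card (assumption)

# --- fixed cost: weights + weight gradients (+ optimizer state) ---
n_params = sum(p.numel() for p in model.parameters())
param_bytes = n_params * bytes_per_value
grad_bytes = param_bytes            # one gradient per weight
optimizer_bytes = 2 * param_bytes   # e.g. Adam keeps two extra buffers per weight
fixed_bytes = param_bytes + grad_bytes + optimizer_bytes

# --- per-sample cost: input + every intermediate activation ---
activation_elems = []
hooks = [
    m.register_forward_hook(lambda m, i, o: activation_elems.append(o.numel()))
    for m in model.modules() if len(list(m.children())) == 0  # leaf layers only
]
with torch.no_grad():
    model(torch.zeros(1, *input_shape))  # single-sample forward pass
for h in hooks:
    h.remove()

per_sample_bytes = (torch.Size(input_shape).numel() + sum(activation_elems)) * bytes_per_value

# Rough estimate only: ignores the CUDA context, cuDNN workspaces,
# fragmentation, and temporary buffers allocated during backward.
max_batch = (gpu_mem_bytes - fixed_bytes) // per_sample_bytes
print(f"estimated max batch size: {max_batch}")
```

I treat the result as an upper bound and leave some headroom, since this ignores the CUDA context (a few hundred MB), workspace buffers, memory fragmentation, and the fact that some activations (e.g. in-place ReLU) are not kept for the backward pass.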
What I am looking for is scrutiny of this analysis, any approximations you use on your end (since the calculation gets tedious when the model is deep), or any tools that can help with this. Any help is appreciated, thanks.