I am working on a project that uses OCRNet for optical character recognition. In preliminary tests on the ICDAR dataset, OCRNet performs adequately on horizontal images, reaching roughly 75% accuracy. My project, however, mostly uses vertical images as input.
My question is whether OCRNet can be adapted to vertical images, specifically images with a minimum size of 380 × 85 (height × width). By default, OCRNet takes grayscale input of 32 × 100 or 64 × 100 (height × width).
Could you please provide insight into whether OCRNet can maintain satisfactory performance levels when presented with vertical images of the aforementioned dimensions? Additionally, any recommendations or best practices for optimizing OCRNet’s performance with vertical images would be greatly appreciated.
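To make the size mismatch concrete, here is a minimal sketch of the aspect-ratio arithmetic for the dimensions above. The `Size`, `rotate90`, and `stretch_factor` names are hypothetical helpers written for this post, not part of OCRNet or the toolkit, and the sketch only covers geometry: it says nothing about how the model actually behaves, and rotating only makes sense if the text itself is rotated rather than stacked upright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    """Image size in pixels (hypothetical helper, not a toolkit type)."""
    height: int
    width: int

    @property
    def aspect(self) -> float:
        """Width divided by height."""
        return self.width / self.height


def rotate90(size: Size) -> Size:
    """Size after a 90-degree rotation: height and width swap."""
    return Size(height=size.width, width=size.height)


def stretch_factor(src: Size, dst: Size) -> float:
    """How strongly a plain resize from src to dst changes the aspect ratio.

    1.0 means no distortion; larger values mean more anisotropic stretching.
    """
    ratio = dst.aspect / src.aspect
    return max(ratio, 1.0 / ratio)


vertical = Size(height=380, width=85)    # minimum input size from this post
model_input = Size(height=32, width=100)  # default input size quoted above

# Resizing the vertical image straight to the model input size.
direct = stretch_factor(vertical, model_input)

# Rotating by 90 degrees first, then resizing.
rotated = stretch_factor(rotate90(vertical), model_input)

print(f"direct resize: {direct:.2f}x aspect change")
print(f"rotate, then resize: {rotated:.2f}x aspect change")
```

Under these assumptions, a direct resize changes the aspect ratio far more than a rotate-then-resize does, which is why I am unsure the default configuration is suitable for my images as they are.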
• Hardware (T4)
• Network Type (OCRnet)
• TLT Version
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.2.0