Channel: TAO Toolkit - NVIDIA Developer Forums

OCRNet Limitations on Input Images


I am currently working on a project that involves utilizing OCRNet for optical character recognition tasks. Through preliminary testing with the ICDAR dataset, I have observed that OCRNet performs adequately on horizontal images, achieving an accuracy rate of approximately 75%. However, my project demands the use of predominantly vertical images as input data.

My question concerns how well OCRNet adapts to vertical images, specifically images of at least 380 × 85 pixels (height × width). The default OCRNet configuration uses grayscale images of 32 × 100 or 64 × 100.
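To put numbers on the mismatch, here is a rough sketch of the arithmetic. It assumes the image is simply stretched to the network input size without preserving aspect ratio, and that the default sizes are given as height × width like the 380 × 85 figure; neither assumption has been confirmed against OCRNet's actual preprocessing. The helper `resize_stats` is a hypothetical name used only for illustration.

```python
from typing import NamedTuple


class ResizeStats(NamedTuple):
    scale_h: float     # vertical scale factor (target height / source height)
    scale_w: float     # horizontal scale factor (target width / source width)
    anisotropy: float  # scale_w / scale_h; 1.0 means aspect ratio is preserved


def resize_stats(src_h: int, src_w: int, dst_h: int, dst_w: int) -> ResizeStats:
    """Scale factors for a plain stretch from (src_h, src_w) to (dst_h, dst_w).

    Assumption: no padding or letterboxing, the image is just resized.
    """
    scale_h = dst_h / src_h
    scale_w = dst_w / src_w
    return ResizeStats(scale_h, scale_w, scale_w / scale_h)


# Minimum vertical image size from the question (height, width).
SRC_H, SRC_W = 380, 85

# The two default input sizes mentioned above, read as (height, width).
results = {}
for dst_h, dst_w in [(32, 100), (64, 100)]:
    stats = resize_stats(SRC_H, SRC_W, dst_h, dst_w)
    results[(dst_h, dst_w)] = stats
    print(
        f"{SRC_H}x{SRC_W} -> {dst_h}x{dst_w}: "
        f"height x{stats.scale_h:.3f}, width x{stats.scale_w:.3f}, "
        f"anisotropy {stats.anisotropy:.2f}"
    )
```

Under these assumptions, a 380 × 85 image is shrunk vertically by more than a factor of ten for the 32 × 100 input while being slightly widened, so the characters would be heavily squashed; the 64 × 100 input halves that distortion but does not remove it.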

Could you please provide insight into whether OCRNet can maintain satisfactory performance levels when presented with vertical images of the aforementioned dimensions? Additionally, any recommendations or best practices for optimizing OCRNet’s performance with vertical images would be greatly appreciated.

• Hardware (T4)
• Network Type (OCRNet)
• TLT Version:
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.2.0

8 posts - 2 participants
