
Optical Character Recognition (OCR) is a widely used technology for extracting text from scanned documents or camera images. Both open source and commercial OCR software are available. In this article, we compare the best of the available open source and commercial options.
 
 
A general comparison of different OCR software is given in this wiki article, but it includes no sample data on OCR accuracy. The best available open source OCR software is (1) Tesseract, though its accuracy is still not good enough to match the commercial packages. From the list of commercial OCR software we selected (2) ABBYY FineReader and (3) Maestro.
 
We processed the same image with each of these three packages and compared the raw text output.
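The article does not describe how the raw outputs were compared. One possible approach, shown here as a minimal sketch, is to score each engine's output against a hand-typed reference transcription using Python's standard `difflib` module. The function name `similarity`, the engine labels, and the sample strings are all hypothetical and are not actual output from any of the three packages.

```python
import difflib


def similarity(reference: str, ocr_output: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two texts."""
    # Collapse runs of whitespace so that layout differences
    # (line breaks, extra spaces) do not dominate the score.
    ref = " ".join(reference.split())
    out = " ".join(ocr_output.split())
    return difflib.SequenceMatcher(None, ref, out, autojunk=False).ratio()


# Hypothetical reference text and OCR outputs, for illustration only.
reference = "Optical Character Recognition"
outputs = {
    "engine_a": "Optical Character Recognition",
    "engine_b": "0ptical Charactor Recogniti0n",
}

scores = {name: similarity(reference, text) for name, text in outputs.items()}

# Print the engines from best to worst match.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

A ratio of 1.0 means the output matches the reference exactly after whitespace normalization. Lower values indicate more missing or misread characters.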

 
From the comparison given above, we can clearly see that ABBYY FineReader and Maestro are far better than standard Tesseract: they detect more characters and do so more accurately. We repeated the process for several images and found that ABBYY FineReader and Maestro consistently outperformed Tesseract.
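The claim that one engine is "more accurate" can be made measurable with a character-level accuracy figure. The sketch below is one common way to do this and is not necessarily the method used for the comparison above: it computes the Levenshtein edit distance between a reference transcription and the OCR output, then converts it to an accuracy between 0 and 1. The function names and sample strings are assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """1 minus the character error rate, clamped so it never goes below 0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: one misread character out of four.
print(character_accuracy("abcd", "abxd"))
```

Averaging this figure over several test images gives a single number per engine, which makes a "consistently better" claim easier to check.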
 
 
However, since the commercial packages are quite expensive, it is worth seeing whether Tesseract's performance can be improved. Improving Tesseract may be more cost-effective than purchasing one of the commercial options.
 
Note: Read how we made Tesseract perform as well as or even better than these commercial packages here