Extracting text from images has been worked on for many years and finds applications in many domains, such as banking, legal, healthcare, education and entertainment!
With the advent of machine learning, text extraction from images is being offered as a Cognitive API by many AI/ML providers like AWS Rekognition, Azure Computer Vision, and Google Cloud Vision.
While all three do a good job when it comes to default text detection, we used the Cognitive API Integrator to compare the responses of these 3 major cognitive API providers on 3 parameters for the English language:-
- Different orientation
- Different fonts
- Reverse order text
While there are no clear winners here, Google does perform a notch better than Azure and AWS on the 3 parameters we compared them for.
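If you want to repeat this kind of comparison on your own images, one simple approach is to score each provider's detected text against the text you know is in the image. The snippet below is only a minimal sketch of that idea, not how the Cognitive API Integrator works internally. The function names (`normalize`, `score`, `rank_providers`) are hypothetical, and it assumes you have already called each API and reduced its response to a plain string.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case the text and collapse runs of whitespace into single spaces."""
    return " ".join(text.lower().split())


def score(expected, detected):
    """Similarity between expected and detected text, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(expected), normalize(detected)).ratio()


def rank_providers(expected, results):
    """Return (provider, score) pairs sorted from best to worst match.

    `results` maps a provider name to the text it detected (assumed to be
    already extracted from the provider's raw response).
    """
    scores = {name: score(expected, text) for name, text in results.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up strings for illustration only, not actual API output.
    sample = {
        "provider_a": "Hello World",
        "provider_b": "Hello",
        "provider_c": "",
    }
    for name, value in rank_providers("Hello World", sample):
        print(name, round(value, 2))
```

A character-level similarity like this is forgiving of small misreads, which suits a quick side-by-side check. For stricter comparisons you may prefer an exact match on each word.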
Here is a brief summary:-
- Google does a great job at detecting vertical text irrespective of the top down or bottom up orientation
- Google and Azure both give reverse order text (upside down text) a good shot, whereas AWS is never able to decipher it.
- AWS does a great job detecting texts written in different fonts.
- Azure needs handwritten mode on in order to detect different fonts.
Let's take a look at a few examples.
Example 1:- Vertical Text in bottom up orientation
- AWS totally misses detecting the vertical text
- Google and Azure are able to detect the text correctly.
Example 2:- Vertical Text in top down orientation
- Google gives the best result
- AWS again gives it a miss
- Azure is also unable to read vertical text in top down orientation.
Example 3:- Bottom up text
- Clearly Google does the best job here
- Azure gives it a try and AWS misses it completely
Example 4:- Mixed Orientation
- While none of these three providers is able to handle mixed orientations correctly, Google plays it safe and reads only one orientation, but reads that correctly.
- Azure tries to read all the orientations and reads one of the two orientations incorrectly.
- AWS can only read the default orientation correctly.
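To see which orientations a provider actually picked up in an image like this, you can estimate the direction of each detected line from its bounding polygon. The sketch below is an illustration under stated assumptions, not any provider's API: it assumes you have converted the response into a list of `(x, y)` pixel points whose first two vertices are the text's own top-left and top-right corners. Each provider documents its own polygon format and vertex order, so check that before relying on it. The function names and the 20 degree tolerance are hypothetical choices.

```python
import math


def text_angle(polygon):
    """Angle of the text direction in degrees, measured counter-clockwise.

    Uses the edge from the first vertex to the second, assumed to run
    along the top of the text in reading direction.
    """
    (x0, y0), (x1, y1) = polygon[0], polygon[1]
    # Image y grows downward, so negate dy to get a counter-clockwise angle.
    return math.degrees(math.atan2(-(y1 - y0), x1 - x0)) % 360


def classify_orientation(polygon, tolerance=20):
    """Map the text angle to one of the orientations discussed in this post."""
    angle = text_angle(polygon)
    labels = {
        0: "default",
        90: "vertical, bottom up",
        180: "upside down",
        270: "vertical, top down",
    }
    for target, label in labels.items():
        # Compare on a circle so that 359 degrees counts as close to 0.
        diff = min(abs(angle - target), 360 - abs(angle - target))
        if diff <= tolerance:
            return label
    return "slanted"
```

Running `classify_orientation` over every detected line and counting the labels shows quickly whether a provider returned only one orientation or attempted several.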
Example 5:- Mixed Fonts
- While all providers detect different fonts, AWS seems to be doing a better job than the other two!
Check out the Findings page for various similar conclusions drawn by the community while working with these APIs. Send us your findings and feedback at daksh@cennest.com.
About the Cognitive API Integrator
The Cognitive API Integrator aggregates cognitive services across major providers (currently Microsoft Azure, Amazon Web Services & Google Cloud). Use it to compare responses for various Cognitive APIs before selecting the provider you will integrate with.
Note:- The Cognitive API Integrator does not aim to promote or downplay any Cognitive API Provider. Cognitive Analysis is a machine learning exercise where results are bound to improve with more data and usage. Conclusions drawn here can be subjective and users are encouraged to use the tool to form their own conclusions.
THE COGNITIVE API INTEGRATOR is no longer active. If its features interest you and you want to know more, please connect with daksh@cennest.com.