Can we use the Vision API to extract text from an image and then use that text to categorize it? For example, to get important information from a scanned document.
Can we use Vision API + NLP to read text from an image and categorize it, like the Cloud Vision API does?
Nope. Vision only gives you the rectangles that contain text; it does not have an API to convert these image regions to text.
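To make the point above concrete, here is a minimal sketch of what text detection returns, assuming a `CGImage` is already loaded. The function name `detectTextBoxes` and its return shape are made up for illustration; the Vision calls themselves (`VNDetectTextRectanglesRequest`, `reportCharacterBoxes`, `VNImageRequestHandler`) are the real API.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: returns the normalized bounding box of every detected
/// text region, plus the boxes of its individual characters when Vision reports them.
/// Note that the result contains geometry only, no strings.
func detectTextBoxes(in image: CGImage) throws -> [(region: CGRect, characters: [CGRect])] {
    let request = VNDetectTextRectanglesRequest()
    request.reportCharacterBoxes = true   // also ask for per-character rectangles

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])        // runs synchronously

    let observations = request.results as? [VNTextObservation] ?? []
    return observations.map { observation in
        (region: observation.boundingBox,
         characters: (observation.characterBoxes ?? []).map { $0.boundingBox })
    }
}
```

The boxes are normalized to the range 0...1 with the origin at the bottom-left of the image. Note that newer OS releases (iOS 13 and macOS 10.15 onward) added `VNRecognizeTextRequest`, which does return recognized strings, so the limitation described above applies to the detection request shown here.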
Actually, it's possible by adding a Core ML model, for example a classifier trained on MNIST (which covers only the handwritten digits 0-9).
So, with Vision, you detect the bounding boxes, then you extract the image portion inside each bounding box and pass it to the MNIST model.
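The crop-and-classify step could look roughly like the sketch below. It assumes you already have a normalized bounding box from Vision and an `MLModel` that is a classifier with an image input; the helper names `pixelRect` and `classify` are hypothetical. The main pitfall is the coordinate system: Vision's boxes have their origin at the bottom-left, while `CGImage.cropping(to:)` expects pixel coordinates from the top-left.

```swift
import Vision
import CoreML
import CoreGraphics

/// Hypothetical helper: converts a Vision bounding box (normalized, bottom-left origin)
/// into a pixel rectangle with a top-left origin, as CGImage.cropping(to:) expects.
func pixelRect(forNormalizedBox box: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    return CGRect(x: box.minX * w,
                  y: (1 - box.maxY) * h,   // flip the y-axis
                  width: box.width * w,
                  height: box.height * h)
}

/// Hypothetical helper: crops one detected box out of the image and returns the
/// label the classifier is most confident about, or nil if nothing was produced.
func classify(box: CGRect, in image: CGImage, using model: MLModel) throws -> String? {
    let rect = pixelRect(forNormalizedBox: box,
                         imageWidth: image.width,
                         imageHeight: image.height)
    guard let crop = image.cropping(to: rect) else { return nil }

    let request = VNCoreMLRequest(model: try VNCoreMLModel(for: model))
    request.imageCropAndScaleOption = .scaleFit   // keep the glyph's aspect ratio
    try VNImageRequestHandler(cgImage: crop, options: [:]).perform([request])

    // Assumption: the model is a classifier, so results are classification observations.
    let results = request.results as? [VNClassificationObservation] ?? []
    return results.max { $0.confidence < $1.confidence }?.identifier
}
```

For a digit model you would call `classify` once per character box rather than per text region, since an MNIST-style classifier expects a single glyph per image. Depending on how the model was trained, the crop may also need padding or color inversion first.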
One precompiled model for CoreML is here:
Take a look at my blog:
neurosurg dot de
It also includes training and samples.