
An image or PDF may contain:

  1. Printed text
  2. Handwritten text
  3. Paragraphs
  4. Key-value pairs
  5. Complex tables

During training, we will assign tags/keywords to the document. During testing, we will look for each tag and read the value associated with it.

Ambi

1 Answer


You need to do 3 steps:

  1. First, write a basic object recognition algorithm for the image. The algorithm must crop the image into ROIs (regions of interest), then classify each ROI as one of the elements in your content type list. For this part you can use heuristic rules to extract ROI features (tables, for example, often have a rectangular boundary). Then you can use a lightweight classifier such as a decision tree.

  2. Next, provide an algorithm for reading the data structure defined by each ROI type. For a table, for example, you should find all the cells in the image. Then locate each word or number in the data structure and crop it into sets of symbols.

  3. Once you have done that, classify each symbol with your text-image classifier. At this step you can use, for example, a multilayer perceptron, a naive Bayes classifier, or another type of classifier commonly used for image recognition.
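To make the first two steps more concrete, here is a minimal pure-Python sketch. It assumes the ROI features have already been measured, and it uses a hand-written decision tree rather than a trained one. The feature names (`has_border`, `ruling_lines`, `text_lines`, `two_column_gap`), the thresholds, and the helpers `classify_roi` and `split_symbols` are all hypothetical and only illustrate the idea.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RoiFeatures:
    """Hypothetical heuristic features measured on one cropped ROI."""
    has_border: bool      # a rectangular outline surrounds the region
    ruling_lines: int     # long horizontal/vertical lines found inside
    text_lines: int       # number of text lines found in the region
    two_column_gap: bool  # a consistent vertical gap splits the lines in two


def classify_roi(f: RoiFeatures) -> str:
    """Hand-written decision tree; thresholds are illustrative, not tuned."""
    if f.has_border and f.ruling_lines >= 2:
        return "table"
    if f.two_column_gap:
        return "key_value"
    if f.text_lines >= 3:
        return "paragraph"
    return "text_line"


def split_symbols(bitmap: list[list[int]]) -> list[tuple[int, int]]:
    """Split a binarized word image (1 = ink) into symbol column ranges.

    Returns (start, end) pairs with end exclusive. A symbol is a run of
    adjacent columns that contain at least one ink pixel.
    """
    width = len(bitmap[0]) if bitmap else 0
    spans: list[tuple[int, int]] = []
    start = None
    for x in range(width):
        has_ink = any(row[x] for row in bitmap)
        if has_ink and start is None:
            start = x                  # a new symbol begins
        elif not has_ink and start is not None:
            spans.append((start, x))   # an empty column ends the symbol
            start = None
    if start is not None:
        spans.append((start, width))   # symbol touching the right edge
    return spans
```

Splitting on empty columns only works when symbols do not touch each other; for handwriting or tight printed text you would probably need connected-component analysis or a learned segmenter instead.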

In practice, you could try the OpenCV library, which already provides almost all of the algorithms you need.

For a better understanding of the third step, you could look at my project for CAPTCHA recognition, which uses OpenCV's artificial neural network feature.
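If you want to experiment with the third step without any library, the following is a minimal sketch of a naive Bayes classifier over binarized symbol pixels. It is not the code from the CAPTCHA project. The class name `BernoulliNaiveBayes` and the 3x3 toy glyphs are made up for illustration, and real symbols would need larger, size-normalized bitmaps and far more training samples.

```python
from __future__ import annotations

import math
from collections import defaultdict


class BernoulliNaiveBayes:
    """Naive Bayes over binary pixels (1 = ink) with Laplace smoothing."""

    def fit(self, samples: list[list[int]], labels: list[str]) -> "BernoulliNaiveBayes":
        n_pixels = len(samples[0])
        counts: dict = defaultdict(int)
        ink: dict = defaultdict(lambda: [0] * n_pixels)
        for pixels, label in zip(samples, labels):
            counts[label] += 1
            for i, p in enumerate(pixels):
                ink[label][i] += p
        total = len(samples)
        self.log_prior = {c: math.log(n / total) for c, n in counts.items()}
        # P(pixel i is ink | class), smoothed so it is never exactly 0 or 1.
        self.p_ink = {
            c: [(ink[c][i] + 1) / (counts[c] + 2) for i in range(n_pixels)]
            for c in counts
        }
        return self

    def predict(self, pixels: list[int]) -> str:
        def score(c: str) -> float:
            # Log prior plus the sum of per-pixel log likelihoods.
            s = self.log_prior[c]
            for p, q in zip(pixels, self.p_ink[c]):
                s += math.log(q if p else 1.0 - q)
            return s

        return max(self.log_prior, key=score)


# Toy 3x3 glyphs, flattened row by row: a vertical bar and a horizontal bar.
train = [
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 1],
]
labels = ["I", "I", "-", "-"]
model = BernoulliNaiveBayes().fit(train, labels)
```

Calling `model.predict(...)` on a flattened symbol bitmap returns the label with the highest posterior score. A multilayer perceptron would take the same flattened pixels as input.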

Egor Zamotaev
  • Thank you for the immediate response. Are there any Cognitive Services that do the same task? – Ambi Dec 04 '19 at 07:05
  • https://www.onlineocr.net/ for example, or the most popular one, https://finereaderonline.com – Egor Zamotaev Dec 04 '19 at 07:16
  • I would like to create the trained model first, then read specific text that might appear as a key-value pair, inside a table, or as a sentence in a paragraph – Ambi Dec 04 '19 at 07:33
  • Your task is quite specific; I'm not sure a ready-made solution exists for it. – Egor Zamotaev Dec 04 '19 at 08:10