Using Artificial Intelligence to train OCR templates

You are here:
< All Topics

Modern Forms Processing applications have AI-based training algorithms that let users point and click on the location of data in their documents and create OCR templates automatically.

This bypasses the technical requirements of creating complex OCR templates, especially for varied documents like Invoices where the data doesn’t always appear in the same place.

But how good are these AI-based training systems?

In our experience they work well when you have:

  • Good quality scanned images
  • Clearly labeled data
  • Tables with regular columns

Point and click style training doesn’t work quite as well with:

  • Poor quality images
  • Data that appears within paragraphs
  • Tables with overlapping columns, subtotal rows, etc.

These types of documents can still be captured with OCR but they will usually require an experienced technician to manually configure the template.

For natural language data like legal documents, a new technology called NLP (Natural Language Processing) is available. These work by attempting to “understand” the language used in documents to interpret the location of data points based on meaning. ABBYY FlexiCapture also supports -based training for these types of documents.

Previous Using ABBYY Vantage Document Skills
Next Using FlexiLayout Studio to Design Data Capture Templates
Contact Us for FREE Consultation on Your OCR Project
=
Table of Contents

Title

Go to Top