Using Artificial Intelligence to train OCR templates

Modern Forms Processing applications have AI-based training algorithms that let users point and click on the location of data in their documents and create templates automatically.

This bypasses the technical requirements of creating complex templates, especially for varied documents like Invoices where the data doesn’t always appear in the same place.

But how good are these AI-based training systems?

In our experience they work well when you have:

  • Good quality scanned images
  • Clearly labeled data
  • Tables with regular columns

Point-and-click training doesn’t work as well with:

  • Poor quality images
  • Data that appears within paragraphs
  • Tables with overlapping columns, subtotal rows, etc.

These types of documents can still be captured, but they will usually require an experienced technician to configure the template manually.

For natural language data like legal documents, a newer technology called NLP (Natural Language Processing) is available. It works by attempting to “understand” the language used in a document and locating data points based on their meaning. ABBYY FlexiCapture also supports NLP-based training for these types of documents.
