Using Artificial Intelligence to train OCR templates

Created On January 20, 2021

Last Updated On January 25, 2021

by aaron

Modern Forms Processing applications have AI-based training algorithms that let users point and click on the location of data in their documents and create OCR templates automatically.

This removes much of the technical work of building complex OCR templates by hand, especially for variable documents such as invoices, where the data doesn’t always appear in the same place.
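To make the idea of data that "doesn't always appear in the same place" concrete, here is a minimal Python sketch of label-anchored extraction: rather than reading a fixed page coordinate, it looks for a label word and takes the nearest word to its right on the same line. This is only an illustration, not how any particular product (including ABBYY FlexiCapture) builds its templates. The `Word` structure, the `find_value_right_of_label` helper, the coordinates, and the sample invoices are all hypothetical, and real OCR output would need multi-word labels and more tolerant matching.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    """One OCR'd word with the top-left corner of its bounding box (hypothetical format)."""
    text: str
    x: float  # left edge
    y: float  # top edge


def find_value_right_of_label(
    words: List[Word], label: str, line_tolerance: float = 5.0
) -> Optional[str]:
    """Return the nearest word to the right of `label` on the same text line.

    Matching ignores case and a trailing colon on the label word.
    Returns None if the label, or a value next to it, is not found.
    """
    anchors = [w for w in words if w.text.lower().rstrip(":") == label.lower()]
    if not anchors:
        return None
    anchor = anchors[0]
    # Candidates: words on roughly the same line, positioned to the right of the label.
    candidates = [
        w for w in words
        if abs(w.y - anchor.y) <= line_tolerance and w.x > anchor.x
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda w: w.x).text


# Two made-up invoices where the total sits in different places on the page.
invoice_a = [Word("Invoice", 10, 10), Word("Total:", 300, 500), Word("125.00", 380, 501)]
invoice_b = [Word("Total:", 50, 120), Word("89.50", 130, 119), Word("Due", 50, 150)]

total_a = find_value_right_of_label(invoice_a, "Total")
total_b = find_value_right_of_label(invoice_b, "Total")
```

The same rule finds the total on both sample invoices even though its position differs, which is the kind of relationship a point-and-click trainer has to infer from the user's clicks. It also hints at why clearly labeled data matters: with no recognizable label nearby, a rule like this has nothing to anchor on.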

But how good are these AI-based training systems?

In our experience they work well when you have:

Good quality scanned images
Clearly labeled data
Tables with regular columns

Point and click style training doesn’t work quite as well with:

Poor quality images
Data that appears within paragraphs
Tables with overlapping columns, subtotal rows, etc.

These types of documents can still be captured with OCR, but they usually require an experienced technician to configure the template manually.

For natural language data like legal documents, a new artificial intelligence technology called NLP (Natural Language Processing) is available. NLP-based tools attempt to “understand” the language used in a document and locate data points based on their meaning. ABBYY FlexiCapture also supports NLP-based training for these types of documents.
