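Software that can convert large batches of PDF documents to text with Optical Character Recognition (OCR), extract data, and export it to Excel or SQL databases. Native PDF files, which already contain their text, can be processed without OCR. This is dramatically faster and avoids recognition errors, because the text is read directly from the file instead of being recognized from an image.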
Modern forms processing software can use rules-based templates to locate data on documents by label keywords, data types, regular expression pattern matching, and other methods.
The most common example in business is the invoice. Businesses receive invoices from thousands of different vendors. Each invoice carries the information needed to process it, such as the Invoice Number, Due Date, and Total, but each vendor formats its invoices a little differently.
Software like ABBYY FlexiCapture will look for keywords like “Invoice Number” or variations like “Inv #” and “Invoice No.” to locate the invoice number value on each invoice.
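To make the idea concrete, here is a minimal sketch of keyword-based location in Python. It is not how ABBYY FlexiCapture or any other product is implemented; the label list, the `find_invoice_number` function, and the assumption that an invoice number is a run of letters, digits, and dashes are all hypothetical choices made for illustration.

```python
import re

# Hypothetical label variants for one field; a real template would list more.
INVOICE_NUMBER_LABELS = ["Invoice Number", "Invoice No.", "Inv #"]


def find_invoice_number(text):
    """Return the value that follows the first matching label, or None."""
    # Try longer labels first so the most specific variant wins.
    labels = sorted(INVOICE_NUMBER_LABELS, key=len, reverse=True)
    label_pattern = "|".join(re.escape(label) for label in labels)

    # A label, an optional colon and whitespace, then the value itself.
    # Assumption: the value is made of letters, digits, and dashes.
    pattern = re.compile(
        rf"(?:{label_pattern})\s*:?\s*([A-Za-z0-9][A-Za-z0-9-]*)",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = [
        "Invoice Number: INV-2041",
        "Inv # 7781",
        "Invoice No. A-99",
    ]
    for line in samples:
        print(line, "->", find_invoice_number(line))
```

Each sample uses a different label, yet the same rule finds the value. Supporting a new vendor layout then means adding a label variant to the list, not writing new extraction code.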
In recent years, artificial intelligence based training has made it possible to generate these templates automatically: you simply point and click on the location of data on documents as you process them. This dramatically reduces the ongoing expert help these systems have traditionally required.