Text recognition from scanned documents using Optical Character Recognition (OCR) software. Automate data entry or create searchable PDF files from scanned text documents.
OCR training was once a critical part of the conversion process. After a document was read, the operator would review the results and correct any misrecognized characters. The engine was then trained on these corrections so that results improved the next time a similar document was read.
Modern OCR applications no longer rely on user training for accuracy unless the documents use very non-standard fonts. Their engines have had decades of development, and their algorithms have been trained on billions of samples. In most cases, introducing user training will only degrade the results for documents that differ from the ones used for training.
The training functions still exist for these edge cases, but they are no longer an integral part of the OCR process.
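To make the older correction-based training and its drawback concrete, here is a toy sketch in Python. It is only an illustration under strong assumptions: the class `ToyCorrectionTrainer`, its methods, and the `min_votes` threshold are hypothetical names invented for this example, and it works on output text, whereas real OCR engines train on character images. It should not be read as how any particular product implements training.

```python
from collections import Counter, defaultdict


class ToyCorrectionTrainer:
    """Hypothetical illustration: learns character fixes from operator corrections."""

    def __init__(self, min_votes=2):
        # Assumption: a fix is trusted once it has been seen this many times.
        self.min_votes = min_votes
        # votes[seen_char][wanted_char] = how often the operator settled on that character
        self.votes = defaultdict(Counter)

    def train(self, ocr_text, corrected_text):
        # Only equal-length pairs are handled, so characters line up one to one.
        if len(ocr_text) != len(corrected_text):
            raise ValueError("toy trainer needs equal-length strings")
        for seen, wanted in zip(ocr_text, corrected_text):
            self.votes[seen][wanted] += 1  # counts both fixes and confirmations

    def apply(self, ocr_text):
        out = []
        for ch in ocr_text:
            if ch in self.votes:
                best, count = self.votes[ch].most_common(1)[0]
                if count >= self.min_votes:
                    ch = best
            out.append(ch)
        return "".join(out)


trainer = ToyCorrectionTrainer()
trainer.train("T0TAL", "TOTAL")  # operator fixes a zero read in place of the letter O
trainer.train("N0TE", "NOTE")

print(trainer.apply("INV0ICE"))  # similar document improves: INVOICE
print(trainer.apply("2024"))     # different document gets worse: 2O24
```

The second call shows why user training can hurt: a fix learned from text-heavy pages is wrongly applied to a page of numbers, which is the kind of degradation described above for documents that differ from the training set.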
In modern OCR, "training" more often refers to enterprise data capture applications that use AI-based learning algorithms to locate data points on documents that come in many different formats, such as invoices.