Reading Handprint, Checkmarks, and Forms with FlexiCapture and Vantage
ICR – Intelligent Character Recognition
- Intelligent Character Recognition (ICR) is an extension of optical character recognition (OCR). While OCR is designed to extract machine-printed characters, ICR retrieves information entered as hand-printed characters.
- ICR can extract hand-printed characters that are written separately, as individual characters, in areas (zones) of a form. These zones either need to be specified as fixed fields of a machine-readable form, or they need to be detected automatically.
[Image: example of a form containing hand-printed characters]
Important note: ICR cannot extract text written in “cursive handwriting”, where the characters are joined together.
- In most cases, the ICR technology is linked to Field Level / Zonal Recognition and forms processing.
- To improve ICR accuracy, it is recommended to constrain the expected field values with metadata, for example regular expressions, dictionaries, or database lookups.
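The effect of such metadata can be illustrated outside of any ABBYY product. The following is a minimal sketch in plain Python, not ABBYY API code: it assumes a date field and a city field, and the pattern, the word list, the confusion mapping, and the function names are all hypothetical. In the ABBYY products the constraints are configured in the field definition rather than applied afterwards, but the principle is the same.

```python
import difflib
import re
from typing import List, Optional

# Hypothetical rules for two form fields; pattern and word list are illustrative.
DATE_PATTERN = re.compile(r"\d{2}\.\d{2}\.\d{4}")
CITY_DICTIONARY = ["Berlin", "Hamburg", "Munich", "Cologne"]

# Letters that are easily confused with digits in hand-printed text (assumed mapping).
LETTER_TO_DIGIT = str.maketrans({"O": "0", "I": "1", "l": "1", "S": "5", "B": "8"})


def validate_date(raw: str) -> Optional[str]:
    """Return the value if it has the form DD.MM.YYYY after normalization, else None."""
    candidate = raw.strip().translate(LETTER_TO_DIGIT)
    return candidate if DATE_PATTERN.fullmatch(candidate) else None


def match_dictionary(raw: str, words: List[str], cutoff: float = 0.75) -> Optional[str]:
    """Return the closest dictionary word (case-insensitive), else None."""
    by_lower = {word.lower(): word for word in words}
    hits = difflib.get_close_matches(
        raw.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[hits[0]] if hits else None
```

With these rules, a misread date such as `O1.l2.2O24` is normalized to `01.12.2024`, and `Ber1in` is mapped to the dictionary entry `Berlin`. The pattern checks the format only, not whether the date exists in the calendar.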
ICR in ABBYY SDKs
The following ABBYY SDKs and products support ICR:
- FineReader Engine
ICR has been included in the Linux version since version 12, Release 3, and in the Mac version since version 12, Release 4. In earlier versions, ICR was supported only in the Windows version of FineReader Engine.
- FlexiCapture SDK – this SDK is designed for forms processing and data extraction. ICR and template matching for fixed forms are part of the default feature set. In addition, ABBYY offers this technology as a product in the form of the FlexiCapture platform.
- Cloud OCR SDK – the ABBYY online OCR service for developers can read zones that contain hand-printed, separated characters. The service does not include any automated template matching, so the zones have to be defined when the tasks are uploaded to the service.
Recognizing Handprinted Text
Handwritten text can be recognized only if the characters are written separately (“handprinted text”).
[Image: sample of handwriting that cannot be recognized]
- Not all recognition languages are available for handprint recognition. The languages which are available for handprint recognition are marked with a special comment in the List of predefined languages.
- The coordinates of the blocks that contain handprinted text must be specified manually.
Please see the details in “Help” → “Guided Tour” → “Advanced Techniques” → “Recognizing Handprinted Text”.
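Because the block coordinates have to be supplied by hand, it helps to keep them in one place and check them before they are passed to the recognition step. The following is a minimal sketch in plain Python, assuming pixel coordinates measured from the top-left corner of the page. The `HandprintZone` class and the `scale_zone` helper are hypothetical and are not part of any ABBYY API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandprintZone:
    """A manually specified rectangle that contains handprinted text."""
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def is_valid_for(self, page_width: int, page_height: int) -> bool:
        """Check that the rectangle is non-empty and lies inside the page."""
        return (0 <= self.left < self.right <= page_width
                and 0 <= self.top < self.bottom <= page_height)


def scale_zone(zone: HandprintZone, from_dpi: int, to_dpi: int) -> HandprintZone:
    """Convert zone coordinates measured at one resolution to another."""
    factor = to_dpi / from_dpi
    return HandprintZone(
        zone.name,
        round(zone.left * factor),
        round(zone.top * factor),
        round(zone.right * factor),
        round(zone.bottom * factor),
    )


# Example: a zone measured on a 300 dpi scan of an A4 page (2480 x 3508 pixels).
last_name = HandprintZone("last_name", left=120, top=340, right=900, bottom=400)
```

Rescaling matters when the zones were measured on a template scanned at one resolution and the incoming images arrive at another. Without it, the blocks no longer cover the handprinted fields.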
Checkmark Recognition (Optical Mark Recognition, OMR)
What is a checkmark?
A checkmark field is an element on a form. It is usually rectangular and is therefore often called a “check box”. In this element, the person filling in the form makes a mark to indicate an opinion, decision, or selection: a check/tick, an X, a large dot, a fully inked box, or similar.
- The ABBYY SDKs FineReader Engine and FlexiCapture SDK contain technology for recognizing checkmarks and can therefore read and process them. The process of extracting information from checkmarks is called “optical mark recognition” (OMR).
- ABBYY's OMR technology recognizes different types of checkmarks:
  - simple checkmarks
  - grouped checkmarks
  - model checkmarks
  - and even checkmarks that were later corrected by hand
- ABBYY OMR delivers a very high accuracy rate of up to 99.995%.
Technical Implementation of OMR
The ABBYY layout analysis and the underlying recognition technology work with different block types, for example:
- Barcode blocks
- Checkmark blocks and a Checkmark Group object
The state of a checkmark can be:
- Selected
- Not selected
- Selected, but corrected later
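The checkmark states can be modelled as a small enumeration. The sketch below is plain Python, not ABBYY API code, and it assumes three states: selected, not selected, and corrected. The names `CheckmarkState` and `evaluate_group`, as well as the rule that a corrected mark counts as not selected and sends the group to manual review, are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, List, Tuple


class CheckmarkState(Enum):
    SELECTED = "selected"
    NOT_SELECTED = "not_selected"
    CORRECTED = "corrected"  # marked first, then crossed out or inked over


def evaluate_group(
    states: Dict[str, CheckmarkState],
    min_selected: int = 1,
    max_selected: int = 1,
) -> Tuple[List[str], bool]:
    """Return the selected options and whether the group needs manual review.

    Assumption: a corrected mark is treated as not selected, but any
    correction in the group flags it for review.
    """
    selected = [name for name, state in states.items()
                if state is CheckmarkState.SELECTED]
    has_correction = any(state is CheckmarkState.CORRECTED
                         for state in states.values())
    count_ok = min_selected <= len(selected) <= max_selected
    return selected, (has_correction or not count_ok)
```

For a yes/no group in which "yes" was corrected and "no" is selected, `evaluate_group` returns `["no"]` together with a review flag of `True`, so that an operator can confirm the correction.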
To get good recognition results, image preprocessing can, and usually should, be applied in this area.
ABBYY FineReader Engine supports different checkmark types.
Requirements for the NLP module in FlexiCapture 12
The following requirements should be met:
- The NLP module should be installed on all machines that are involved in direct image processing: the processing server machine and the machines with processing stations, including both machines with verification stations and machines used by FlexiCapture developers. It should not be installed on the application server or the database server.
- The installed NLP module will appear in the list of installed programs.
- A new menu item appears at the Project Setup Station in the document definition properties.
- No new items appear in FlexiLayout Studio or other developer applications, except the Project Setup Station.