Use OCR and ICR to scan handwriting to text and automate data entry from forms filled out by hand.

SimpleIndex Cloud OCR

SimpleIndex Cloud OCR adds Amazon AWS Textract OCR to any SimpleIndex workstation or server license.

Textract capabilities include the most accurate OCR and handprint recognition available, automatic form field detection, and accounts payable invoice and receipt processing.

Amazon Textract is only available as an API that requires custom programming to make it work. SimpleIndex turns it into a complete document and data capture application designed for easy batch processing on a workstation or server.

Requires an AWS account. Standard Textract transaction fees will apply.
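
As a rough illustration of the custom programming the raw API involves, the Python sketch below pulls the recognized text lines out of a Textract-style response. The `Blocks`, `BlockType`, `Text` and `Confidence` keys and the boto3 `detect_document_text` call follow the public Textract API. The function names, the confidence cutoff of 80, the region and the sample response are assumptions made for this example, and none of it describes how SimpleIndex itself is implemented.

```python
def extract_lines(response, min_confidence=80.0):
    """Return (text, confidence) pairs for LINE blocks at or above the cutoff."""
    lines = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE, WORD and other block types
        confidence = block.get("Confidence", 0.0)
        if confidence >= min_confidence:
            lines.append((block.get("Text", ""), confidence))
    return lines


def detect_text(image_path, region_name="us-east-1"):
    """Send one image to Textract. Needs boto3 and AWS credentials; fees apply."""
    import boto3  # imported lazily so extract_lines works without boto3 installed

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as handle:
        return client.detect_document_text(Document={"Bytes": handle.read()})


# Hand-written sample in the shape of a Textract response (not real output).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Name: Jane Doe", "Confidence": 97.4},
        {"BlockType": "LINE", "Text": "D0B: 1/2/l98O", "Confidence": 61.2},
        {"BlockType": "WORD", "Text": "Name:", "Confidence": 99.0},
    ]
}

confident_lines = extract_lines(sample_response)
```

Lines that fall below the cutoff would normally go to a human for verification rather than being discarded.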

Reading Handprint, Checkmarks, and Forms with FlexiCapture and Vantage

ICR – Intelligent Character Recognition

  • Intelligent Character Recognition (ICR) is an extension of optical character recognition (OCR). While OCR is designed to extract machine-printed characters, ICR retrieves information written as hand-printed characters.
  • ICR can extract hand-printed characters that are separated and written as individual characters in areas or zones. These zones need to be specified as fixed fields of a machine-readable form; alternatively, they need to be detected automatically.

Example of a form containing hand-printed characters:

icr-form-illu.png

Important note: ICR is not able to extract texts in “cursive handwriting” as in this example:

old-handwriting-illu.png

  • In most cases, the ICR technology is linked to Field Level / Zonal Recognition and forms processing.
  • To improve ICR accuracy, it is recommended to use metadata such as regular expressions, dictionaries or database lookups.
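
A minimal Python sketch of the last point, assuming a date field and a field that must contain one of a known list of values. The pattern, the vocabulary, the function names and the 0.8 similarity cutoff are all hypothetical; a production system would take them from the form definition or a database.

```python
import re
from difflib import get_close_matches

# Assumed field format: MM/DD/YYYY, digits only.
DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")

# Assumed dictionary of valid values for a "state" field (shortened for the example).
STATES = ["ALABAMA", "ALASKA", "ARIZONA", "ARKANSAS"]


def check_date(value):
    """True if the recognized text matches the expected date format."""
    return DATE_PATTERN.fullmatch(value) is not None


def snap_to_dictionary(value, vocabulary, cutoff=0.8):
    """Return the closest dictionary entry, or None if nothing is similar enough."""
    matches = get_close_matches(value.upper(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


date_ok = check_date("03/14/2021")               # well-formed date
date_bad = check_date("O3/14/2021")              # letter O misread for zero
state = snap_to_dictionary("ARIZ0NA", STATES)    # zero misread for letter O
unknown = snap_to_dictionary("XYZ", STATES)      # no plausible match
```

A value that fails the pattern or has no close dictionary match is flagged for manual review instead of being exported as-is.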

ICR in ABBYY SDKs

The following ABBYY SDKs and products support ICR:

  • FineReader Engine
    Since version 12, Release 3, ICR is also included in the Linux version. Since Release 4 of version 12, it is also included in the Mac version of FineReader Engine (in earlier versions, ICR was supported only in the Windows version).
  • FlexiCapture SDK – this SDK is designed for forms processing and data extraction; ICR and template matching for fixed forms are part of the default feature set. In addition, ABBYY offers this technology as a product in the form of the FlexiCapture platform.
  • Cloud OCR SDK – the ABBYY OCR service allows reading zones that contain hand-printed, separated characters. This online OCR service […]

OCR Data Capture

What is OCR Data Capture?

OCR stands for Optical Character Recognition and is the technology that allows software to interpret text on scanned images. When this technology is applied to automating business data entry processes it’s referred to as OCR Data Capture.

Many are familiar with popular desktop OCR applications designed to convert scanned images to editable documents. When this process is applied to specific areas of the document containing data fields it’s called zone OCR. But OCR data capture software is more than just simple zone OCR. Modern applications use some or all of these technologies:

Enterprise data capture systems provide interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Who can benefit from OCR data capture software?

Any organization that collects data from paper documents, or electronic files like PDF and Office documents, can get a very high return on investment by automating the data entry with OCR data capture software.

You do need to have a significant number of documents to […]

Creating forms optimized for handprint recognition

Handprint recognition applications can provide dramatically different results in terms of accuracy depending on whether the form is designed with intelligent character recognition (ICR) in mind.

Forms Processing applications like ABBYY FlexiCapture have a built-in form design tool with ICR-optimized field layout elements and rules that validate whether your form uses best practices for recognition. These forms can be automatically converted to recognition templates for scanning for data capture. This saves you dozens of hours of trial and error during the design process and even more in data entry once the filled in forms are collected.

Best practice recommendations for ICR and OCR forms include:

  • Plenty of space between form elements and labels, at least 0.5cm / 0.25in
  • Use drop-out colors for form backgrounds when possible
  • Hand printed characters should be constrained with boxes or combs to force the filler to write legible, separated, printed characters
  • Use check boxes instead of handprint when possible since these are nearly 100% accurate
  • Use numeric codes instead of alphanumeric text when possible to reduce the number of possible characters and increase accuracy
  • Use validation rules to check against possible values and flag data with incorrect values
  • Check box fields can be used to verify the presence of signatures
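
The benefit of numeric codes combined with validation rules can be sketched in Python. When a field is known to hold only digits, look-alike letters can be mapped back to digits before the value is checked against the allowed codes. The look-alike table, the function names and the sample codes are assumptions for illustration, not part of any particular forms processing product.

```python
# Hypothetical table of letters that ICR commonly confuses with digits.
LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "I": "1", "l": "1", "Z": "2", "S": "5", "B": "8"}
)


def normalize_numeric(value):
    """Replace look-alike letters with the digits they most likely represent."""
    return value.translate(LOOKALIKES)


def validate_code(value, allowed_codes):
    """Return (cleaned value, is_valid); invalid values should be flagged for review."""
    cleaned = normalize_numeric(value.strip())
    is_valid = cleaned.isdigit() and cleaned in allowed_codes
    return cleaned, is_valid


allowed = {"104", "205", "318"}
fixed = validate_code("1O4", allowed)    # letter O read in place of zero
flagged = validate_code("9x9", allowed)  # cannot be repaired, so it is flagged
```

The same substitution would be unsafe in an alphanumeric field, where "O" and "0" are both legitimate, which is why restricting a field to digits raises accuracy.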

What are the best scanner settings for OCR?

Most OCR applications are optimized for 300 dots per inch resolution images.

While color is supported and most often performs better than black & white images, OCR algorithms will generally convert the color to B&W automatically as part of the OCR process. With color input, the dynamic conversion usually produces the best result, but not always.

Especially when an image contains stray markings, stamps, notes, colored paper or other elements that can throw off the binarization process, OCR results can be improved by paying careful attention to image processing settings and using a pristine black & white image for OCR instead of a color scan.
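
To make the binarization step concrete, here is a small pure-Python sketch of Otsu's method, a widely used way to pick a global black/white threshold from a grayscale histogram. It is used here only as an example; the text does not say which algorithm any particular OCR engine applies, and real engines often use adaptive, per-region thresholds instead. The function names and sample pixels are assumptions.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates dark from light pixels."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    weight_dark = 0  # number of pixels at or below the candidate level
    sum_dark = 0     # sum of gray values of those pixels
    for level in range(256):
        weight_dark += histogram[level]
        sum_dark += level * histogram[level]
        if weight_dark == 0:
            continue  # no dark pixels yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance (up to a constant factor).
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = level
    return best_threshold


def binarize(pixels, threshold):
    """Map pixels at or below the threshold to black (0), the rest to white (255)."""
    return [0 if value <= threshold else 255 for value in pixels]


# Tiny made-up scan line: three dark ink pixels, three light paper pixels.
gray = [20, 25, 30, 200, 210, 220]
threshold = otsu_threshold(gray)
black_and_white = binarize(gray, threshold)
```

A stamp or colored paper adds a third cluster of gray values to the histogram, which is one way a single global threshold ends up in the wrong place.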

In forms processing and handprint recognition applications, guide marks in the form can often be removed during the scanning process, improving the OCR results when the software doesn’t have to distinguish between the form background and the words being recognized.

Using drop-out forms, traditionally printed in red or green and then scanned with a corresponding red or green light, automatically removes the form background during scanning and leaves only the text to be recognized. This can dramatically improve recognition results, especially for handprinted data.

Older, black & white scanners would require you to change out the lamps in order to perform color drop-out. All but the least expensive modern color scanners have the ability to enable drop-out colors in the scanner driver.

Advanced forms processing applications can perform color drop-out on-the-fly with scanned color images. Though this is generally not quite as accurate as scanning with a drop-out lamp enabled, it has the advantage of retaining a full-color original copy of the image with the form elements and labels visible.
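
Software drop-out can be sketched very simply in Python for a form printed in red: looking only at the red channel, red form lines are almost as bright as white paper, while dark ink stays dark. This is a simplified illustration under that assumption, with a hypothetical function name, threshold and pixel values; commercial applications use more robust color models.

```python
def drop_out_red(rgb_pixels, threshold=128):
    """Binarize on the red channel only, so red form elements turn white."""
    return [0 if red < threshold else 255 for red, _green, _blue in rgb_pixels]


# Made-up pixels: white paper, red form line, black ink, blue ink.
scan = [(255, 255, 255), (220, 40, 40), (30, 30, 30), (20, 20, 120)]
cleaned = drop_out_red(scan)
```

Because the original color scan is left untouched, the full-color image can still be archived alongside the cleaned version used for recognition.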

Handprint Recognition Guide

What is ICR, Handprint Recognition?

ICR stands for Intelligent Character Recognition and is the technology that allows software to interpret hand printed text on scanned images.

Forms Processing Software uses ICR technology to automate data entry tasks involving hand-filled surveys, applications and forms. It provides interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Forms Processing also includes OCR (Optical Character Recognition) technology to recognize machine printed text, and OMR (Optical Mark Recognition) for check boxes and multiple choice bubbles.

Traditional forms processing relies on constrained handwriting, where boxes on the form force the filler to write with separated, printed block characters. Modern AI technology has dramatically improved the ability to recognize unconstrained handwriting and cursive script. Hand printed notes, free-form comment blocks, non-segmented fields, historic documents, and more can now be converted to text with acceptable accuracy where these were impossible just a few years ago.

Who can benefit from handwritten recognition software?

Any organization that collects data on paper-based forms, surveys or applications on a regular basis can get a very high return on investment by automating the data entry with forms processing software.

You do need to have a significant number of forms to justify the expense, at least a hundred forms per month or more depending on how much data is being captured. If the data entry task can be done in under 25 working hours then it is probably not a good candidate for automation with ICR software.
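
The rule of thumb above is simple arithmetic, sketched here in Python. The 25-hour cutoff comes from the text; the function names, form counts, fields per form and seconds per field are hypothetical and should be replaced with measurements from your own data entry process.

```python
def estimate_entry_hours(forms, fields_per_form, seconds_per_field):
    """Estimate manual data entry time in hours."""
    return forms * fields_per_form * seconds_per_field / 3600


def is_candidate(hours, cutoff_hours=25):
    """Tasks under the cutoff are probably not worth automating."""
    return hours >= cutoff_hours


small_job = estimate_entry_hours(forms=100, fields_per_form=20, seconds_per_field=5)
large_job = estimate_entry_hours(forms=2000, fields_per_form=30, seconds_per_field=5)

small_worth_it = is_candidate(small_job)  # about 2.8 hours of keying
large_worth_it = is_candidate(large_job)  # about 83 hours of keying
```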

Organizations that have many separate departments that collect data on forms can share the budget for forms processing software by re-using it for other projects. Your current project may not be big enough to justify […]

Forms Processing

What is ICR, Survey & Forms Processing?

ICR stands for Intelligent Character Recognition and is the technology that allows software to interpret hand printed text on scanned images.

Forms Processing Software uses ICR technology to automate data entry tasks involving hand-filled surveys, applications and forms. It provides interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Forms Processing also includes OCR (Optical Character Recognition) technology to recognize machine printed text, and OMR (Optical Mark Recognition) for check boxes and multiple choice bubbles.

It is also possible to use these applications to automate data collection from PDF forms, Word documents, Excel spreadsheets, and other formats used to fill out forms electronically. Many include the ability to publish forms as paper, fillable PDF and web pages simultaneously to distribute and collect data from multiple sources into one dataset.

Who can benefit from forms processing software?

Any organization that collects data on paper-based forms, surveys or applications on a regular basis can get a very high return on investment by automating the data entry with forms processing software.

You do need to have a significant number of forms to justify the expense, at least a hundred forms per month or more depending on how much data is being captured. If the data entry task can be done in under 100 man-hours then it is not a good candidate for automation with ICR software.

Organizations that have many separate departments that collect data on forms can share the budget for forms processing software by re-using it for other projects. Your current project may not be big enough to justify the expense, but when combined with one or two others it would be.

How much do […]

Applications

When you scan a document that has text or numeric data on it, you are able to read and understand what is written in the scanned image. However, to a computer, the resulting image file is just as meaningless an assortment of pixels as a landscape photo. In order to transform this information into an editable format that you can search through, copy, and modify without retyping it manually, you will need Optical Character Recognition (OCR) software.

There is a wide variety of OCR software available. While all of these products share the ability to convert images of machine printed (not handwritten) text or numbers into an editable format, they often differ in features, accuracy, price, and language options.

You can find the various types of OCR software with a description of each below.

Users within a single department, working from home or who have a small business can simply scan their documents to a folder that is shared to everyone. In this “ad-hoc” scenario you only need some basic document scanning software to simplify and bring consistency to your filing system.

If you want to move to the next level, there are Desktop Document Management options that provide an all-in-one means for capture, storage, search and retrieval of documents. Additionally, they provide security, advanced capabilities and ease of use above that of the ad-hoc methods.

And let’s not forget cloud-based options that alleviate the need to maintain storage servers or keep software up to date.

Need a simple, no frills OCR solution without spending hundreds of dollars on a professional software package? Look no further. There is a no cost, donation optional, OCR freeware solution for […]
