Automated document processing software for converting scanned documents, PDF files, Office documents and other formats to editable text or structured data that can be exported to CSV, XML, JSON or SQL databases.

Why are the prices of OCR applications so different?

OCR software ranges in price from freeware all the way up to tens of thousands of dollars. What explains the difference between these applications? Here’s the breakdown:

  • OCR Freeware uses the SimpleOCR or Tesseract engines and provides limited scanning and output format capabilities. Recognition quality is generally poor except for the highest quality document images.
  • PDF OCR Converters provide good quality OCR engines like ABBYY, IRIS and OmniPage, but limit the output to searchable PDF files. These cost less than $100.
  • Standard OCR applications range from $100-$200 and provide full OCR capabilities including converting scans to Word, Excel, HTML and other editable formats.
  • Corporate OCR applications add advanced features like automated hotfolder processing, concurrent licensing and other features useful for business applications. Pricing for these is $200-$500.
  • OCR Servers provide scalable, enterprise OCR services for processing very high volumes of documents or providing OCR capabilities to users throughout the organization. Prices start around $1,500 and go up based on processing volume.
  • Enterprise Data Capture and Forms Processing applications are used to capture structured data from complex documents like healthcare claim forms and invoices that include things like tables, handwriting, checkboxes, and movable zones. These solutions can cost anywhere from around $1,000 to hundreds of thousands of dollars depending on the document volume and complexity of the project.

Document Processing via Email in FineReader Server

In this video, learn how to configure a workflow for document input and processing via e-mail on FineReader Server.

See how you can configure this workflow in a few simple steps. You can even edit the e-mail subject and message. This video also shows how to set up the usage scenario “Centralized document conversion service”.

From document input through processing to document output, FineReader Server is designed to simplify, optimize and speed up your workflows. Scalable and easy to configure, FineReader Server can adapt to all your needs.

 

How to convert several emails from MS Outlook into PDF?

In order to convert several emails into a PDF file, you may use the virtual printer PDF-XChange 5.0 for FineReader.

Follow the steps below:

  1. Select the needed emails in MS Outlook.
  2. Press File>Print.


  3. Select PDF-XChange 5.0 for FineReader as the printer and click Print.


  4. Save the PDF file.

 

How to set up the import from the Gmail mailbox using the IMAP Image Import Profile?

  1. In your Gmail, create the folder (mailbox) that you want to import from.
  2. In your Gmail, create the folders (mailboxes) for the Exceptions and Processed emails.
  3. Enable the IMAP protocol in Gmail settings.
  4. Turn on the Less secure app access option in the Security section of Google Account Settings.
  5. Create a new Image Import Profile (open the project in the Project Setup Station > Project > Image Import Profiles > New). Choose Hot Folder: IMAP Server.
  6. Specify the address of the IMAP server: imap.gmail.com.
  7. Click Settings and specify your Gmail login and password. Choose Type of encrypted connection: SSL.


  8. Click Browse and select […]
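
Before configuring the profile, it can help to confirm that the IMAP settings from the steps above work at all. The following is a minimal sketch, not part of the product: it uses Python's standard `imaplib` to log in to imap.gmail.com over SSL (port 993 is assumed as the standard IMAP-over-SSL port) and checks that the mailboxes exist. The function name `check_imap_folders`, the `imap_factory` parameter and the folder names in the usage note are illustrative assumptions.

```python
import imaplib


def check_imap_folders(user, password, folders,
                       host="imap.gmail.com", port=993,
                       imap_factory=imaplib.IMAP4_SSL):
    """Log in over SSL and report which of the given mailboxes can be opened."""
    conn = imap_factory(host, port)
    results = {}
    try:
        conn.login(user, password)
        for folder in folders:
            # select() returns "OK" when the mailbox exists and is accessible
            status, _ = conn.select(folder, readonly=True)
            results[folder] = (status == "OK")
    finally:
        conn.logout()
    return results
```

Called as `check_imap_folders("user@gmail.com", "password", ["Import", "Exceptions", "Processed"])`, it returns a dictionary mapping each folder name to `True` or `False`. Folder names containing spaces may need to be quoted before they are passed to `select()`.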

Using SharePoint in FineReader Server

Saving documents to SharePoint in ABBYY Recognition Server

Note: In order to be able to communicate with the SharePoint Server, the Server Manager and the Remote Administration Console require Microsoft .NET Framework 4.5 to be installed.

To be able to save output documents to a SharePoint Server library, the ABBYY Recognition Server Server Manager service must be run under a user account that has read/write access to the SharePoint Server library. If during the installation, you chose to run the service under a Local System account, you should restart it under a user account.

To set up the publishing of documents to a SharePoint Server library:

  1. Run the Remote Administration Console under a user account that has read/write access to the SharePoint Server library.
  2. Create a new workflow or modify an existing one (see Creating a New Workflow). In the Output Format Settings dialogue select Save output file in SharePoint library.
  3. Enter the URL of the SharePoint Server site (e.g. http://myportal/mysite/) and click Connect. The Remote Administration Console will try to connect to the specified site and download the list of document libraries and folders from there. If the connection is successful, you will see the “Connected” message below the button, and the names of the document libraries will appear in the Select document library list.
  4. Select the document library from the list. Click the Settings… button to associate a document’s metadata fields with the corresponding columns of the selected document type in SharePoint.
  5. Select the folder in the document library using the Browse… button or leave the field empty to save documents in the root folder.
  6. Click OK in the Output Format Settings dialogue box.

If the Input folder has several subfolders containing image files, the output files will be saved in […]

Using ABBYY Vantage Document Skills

Processing Your First Documents with Vantage

Learn how easy it is to get started with Vantage – upload your documents and Vantage will take care of the rest.

 

How to Create and Train a Vantage Document Skill

Learn how to use the Vantage Skill Designer to create and train a new Document Skill with just a few sample documents.

 

How to Create and Train a Classification Skill in ABBYY Vantage

Learn how to use the Vantage Skill Designer to train a new Classification Skill. You need just a few samples of each document class.

 

 

How to Automate a Complete Workflow, by Creating a Vantage Process Skill

 

 

How to Edit a Document Skill

Learn how to adapt already existing skills to your specific documents and business requirements.

 

 

How to perform the first authentication in Vantage Swagger UI?

To get a first access token, perform the initial authentication using the default client. You do not need to enter a password or client ID because the initial authentication is preconfigured. Just open a Swagger page (EU link or US link) and click Authorize.

Then select all scopes and click Authorize again.

The password should be specified only for a custom client. A custom client can be created after the initial authentication.

References

EU Help: Getting a Tenant Identifier or US Help: Getting a Tenant Identifier

EU Help: Creating a Client or US Help: Creating a Client

Learn more at ABBYY […]

Using FlexiLayout Studio to Design Data Capture Templates

FlexiLayout: How to capture a table using Repeating Group if table header is on each page

In some cases, a table cannot be captured correctly with the traditional method, the Table element. In such cases, we usually use the Repeating Group element.

But what if we come across a multi-page document that has a table header on each page?


We can use either of the following two methods to capture such a table using Repeating Groups.

Using Absolute search area constraints

To limit the search area to the table area so that it doesn’t capture unnecessary text outside of the table, we can use Absolute search area constraints in the Search Constraints tab.

You can measure the area with the Measure Rectangle tool.


Using nested Repeating groups

Sometimes the Absolute search area constraints method is not suitable, because other tables using this layout might have elements with different positions and lengths, and you would have to re-measure the area every single time.

In such a case, you can use the nested Repeating group method.

  1. Create the first, “main” Repeating Group (RG) that will include the table header and footer.
  2. Next, create the nested RG inside the first RG.
  3. These are the main steps. Other elements in the RG don’t need any specific settings and should be designed according to the required results.

Additional information

FlexiLayout: Capturing a table using Repeating Group

 

How to reliably capture elements in FlexiLayout Studio if the image resolution can vary

When the image resolution varies, then the search area of elements based on absolute offsets can miss […]

Reading Handprint, Checkmarks, and Forms with FlexiCapture and Vantage

ICR – Intelligent Character Recognition


  • Intelligent Character Recognition (ICR) is an extension of optical character recognition (OCR). While OCR is designed to extract machine-printed characters, ICR retrieves information provided as hand-printed characters.
  • ICR can extract hand-printed characters that are separated and written as individual characters in areas/zones. These areas/zones need to be specified as fixed fields of a machine-readable form or, alternatively, detected automatically.

[Figure: example of a form containing hand-printed characters]

Important note: ICR is not able to extract text written in “cursive handwriting”.

  • In most cases, the ICR technology is linked to Field Level / Zonal Recognition and forms processing.
  • To improve ICR recognition accuracy, it is recommended to use metadata such as regular expressions, dictionaries or database lookups.
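
As an illustration of the last point, the sketch below shows how a regular expression and a dictionary could be applied to a recognized field value after ICR. It is not ABBYY code. The five-digit postal code format, the city list and the function names are hypothetical, and the dictionary correction uses Python's standard `difflib` as a stand-in for whatever matching a real product performs.

```python
import difflib
import re

# Hypothetical field format: a postal code of exactly five digits
ZIP_PATTERN = re.compile(r"\d{5}")


def is_valid_zip(value):
    """Accept the recognized value only if it fits the expected format."""
    return ZIP_PATTERN.fullmatch(value) is not None


def correct_with_dictionary(value, dictionary, cutoff=0.8):
    """Return the closest dictionary entry, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(value, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


cities = ["Berlin", "Munich", "Hamburg"]
print(is_valid_zip("1O115"))                      # letter O misread for zero
print(correct_with_dictionary("Berl1n", cities))  # digit 1 misread for letter i
```

A value that fails the format check can be routed to manual verification, while a near miss against the dictionary can be corrected automatically or flagged with a suggestion.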

ICR in ABBYY SDKs

The following ABBYY SDKs and products support ICR:

  • FineReader Engine
    Since version 12 Release 3, ICR is also included in the Linux version, and since version 12 Release 4 it is also included in the Mac version of FineReader Engine (in earlier versions, ICR was supported only in the Windows version).
  • FlexiCapture SDK: this SDK is designed for forms processing and data extraction; ICR and template matching for fixed forms are part of the default feature set. In addition, ABBYY offers this technology as a product in the form of the FlexiCapture platform.
  • Cloud OCR SDK: the ABBYY online OCR service, which can read zones that contain hand-printed, separated characters. This online OCR service […]

Reading Barcodes with Digitech PaperFlow and PaperVision Capture

Does processing barcodes “on-the-fly” make any difference in speed or recognition?

On-the-fly processing is actually a preferable way of reading barcodes since it does not noticeably decrease scan speed. The recognition will be the same whether the barcode is processed on the fly or as a post-process.

 

PDF Processing with FineReader and FineReader Server

How to create a PDF from Microsoft® Word, Excel, or PowerPoint

 

How to convert emails to PDF

 

How to Split a PDF

Easily create new PDF documents, or separate PDF documents that were combined into one, with FineReader PDF 15.

Learn how to split PDFs and extract pages easily.

 

 

How to create and edit interactive PDF forms

Watch this video and see how to edit and create interactive PDF forms quickly and easily.

The Form Editor tool in FineReader PDF 15 lets you create and edit fillable PDF forms with text and date fields, dropdown lists, list boxes, checkmarks, radio buttons, signature fields and action buttons. Collect information and create effective document templates with ease!

 

How to extract text from scanned PDFs

 

 

How to extract tables

 

 

How can I verify if the digital signature is valid?

If you open a document with a valid digital signature in ABBYY FineReader PDF 15, you will see a green Valid notification in the left panel.

Recognizing a document with existing text layer in FineReader PDF 15

  1. Open FineReader PDF 15.
  2. Go to Tools > Options > OCR.
  3. Under PDF recognition mode, select the Use OCR option.
  4. Click OK.
  5. Recognize your document again.

 

 

How to convert a document into an accessible PDF/UA

Make your mixed documents (PDFs, scans, photographs, or paper) digital and accessible.

In this […]

How to configure a Batch Splitting step to split on a blank value

In PaperVision Capture, a batch splitting step can be configured to split on one or more of many conditions. In some cases it may be desirable to split a batch based on a blank value within an index field. This can be achieved by using a String Comparison or Regular Expression.

The following steps should be used to configure batch splitting using a blank value. Note: These steps assume you will be splitting the batch based on an index field called “ExampleIndexField”. The index field should already exist in the job.

To split the batch on a blank value using the String Comparison type:

  1. Set up the Target Job Configuration.
  2. Add a batch split step.
  3. Add a New Condition.
    • The condition source: Capture Index
    • Choose Capture Index: “ExampleIndexField”
    • Choose Comparison Type: String Comparison
    • Leave the drop-down on the equal sign “=” and leave the text box blank.
    • Click Finish
  4. The condition should read (CI.ExampleIndexField = “”)

 

To split the batch on a blank value using the Regular Expression Comparison type:

  1. Set up the Target Job Configuration.
  2. Add a batch split step.
  3. Add a New Condition.
    • The condition source: Capture Index
    • Choose Capture Index: “ExampleIndexField”
    • Choose Comparison Type: Regular Expression
    • Input the Regular Expression that matches a value that is empty or contains only whitespace characters: ^\s*$
    • Click Finish
  4. The condition should read (CI.ExampleIndexField RegEx.Match(“^\s*$”))
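
The two conditions are not quite equivalent. The following Python sketch illustrates the difference under the assumption that the index value is compared exactly as stored (whether PaperVision Capture trims values first is not stated here, and its regular expression engine may differ from Python's in details). The function names are illustrative only.

```python
import re

# Same expression as in the steps above: empty, or whitespace only
BLANK_PATTERN = re.compile(r"^\s*$")


def is_blank_by_string_comparison(value):
    """Mirrors the String Comparison condition: value = "" """
    return value == ""


def is_blank_by_regex(value):
    """Mirrors the Regular Expression condition: ^\\s*$ """
    return BLANK_PATTERN.search(value) is not None


for sample in ["", "   ", "\t", "INV-001"]:
    print(repr(sample),
          is_blank_by_string_comparison(sample),
          is_blank_by_regex(sample))
```

The string comparison flags only a truly empty value, while the regular expression also flags values made up of spaces or tabs, which can matter when an OCR process populates the field.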

Scanning different sizes of paper within the same document

A document may contain pages of different sizes, and the user may want to scan all of them at their original size in a single scanning pass.

If the scanner supports auto page size detection, this setting can be enabled through the scanner driver.

To enable auto page size detection:

  1. From the Operator Console’s Scanner menu, select Scanner Settings.
  2. Choose the scanner from the drop-down list.
  3. Click the Properties button to the right of the scanner name.
  4. If available, enable the Auto Page Size Detection setting.

Note: Not all scanner models have an auto page size detection setting.

Regular Expression to Validate Date Formats

Within PaperVision Capture, regular expressions can be used to validate batch names and index fields populated by a user or an OCR process. Below are a few regular expressions that validate some common date formats:

Format 1:

MM/DD/YY HH:MM AM/PM

Regular Expression:

^([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])\/\d{2}(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$

Examples:

12/31/02
12/31/02 08:00
12/31/02 08:00 AM

Format 2:

DD/MM/YYYY HH:MM AM/PM

Regular Expression:

^([0-2]\d|[3][0-1])\/([0]\d|[1][0-2])\/\d{4}(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$

Examples:

31/12/2002
31/12/2002 08:00
31/12/2002 08:00 AM

Format 3:

 

YYYY/MM/DD HH:MM:SS

Regular Expression:

^\d{4}\/([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$

Examples:

2002/02/03
2002/02/03 12:12:18
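
Expressions like these are easy to check outside of PaperVision Capture before they are used in a job. The sketch below is a quick test harness in Python, assuming the product's regular expression engine treats these patterns the same way Python's `re` module does. It exercises the Format 1 and Format 3 expressions exactly as written above. The variable names and sample values are illustrative.

```python
import re

# Format 1: MM/DD/YY HH:MM AM/PM (two-digit year)
FORMAT_1 = re.compile(
    r"^([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])\/\d{2}"
    r"(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$"
)

# Format 3: YYYY/MM/DD HH:MM:SS
FORMAT_3 = re.compile(
    r"^\d{4}\/([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])"
    r"(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$"
)

samples = ["12/31/02", "12/31/02 08:00 AM", "12/31/2002", "13/01/02"]
for text in samples:
    print(text, "->", bool(FORMAT_1.match(text)))

print("2002/02/03 12:12:18", "->", bool(FORMAT_3.match("2002/02/03 12:12:18")))
```

Two things worth noticing when running it: the Format 1 expression rejects a four-digit year because it expects exactly two digits, and both expressions limit the hour to 00 through 12, so 24-hour times such as 13:00 are rejected.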

Date Format OCR Capture

 

How to use the [CURRENTDATETIME] tag in WorkFlow Pre-Conditions

The [CURRENTDATETIME] tag represents the current system time. It can be used to set up WorkFlow pre-conditions and to specify dates within Records Retention jobs.

When setting up Records Retention policies and WorkFlow definitions, it may be helpful to reference the current system time to determine whether documents should be selected. [CURRENTDATETIME] is the current system time of the automation server when it runs the specified operation, so its value is always changing. Offsets from the current time can also be used, for example:

[CURRENTDATETIME+1Y] = Current Date/Time plus 1 year

Example:

If you want to bring documents into a WorkFlow 90 days after the date in a specified date index field, set up the WorkFlow pre-condition so that the date field’s From range is [CURRENTDATETIME-50Y] and the To range is [CURRENTDATETIME-90D]. Every time the WorkFlow checks for new documents, it uses the time the operation runs as the CURRENTDATETIME value. Any document whose date field falls between the current system time minus 50 years and the current system time minus 90 days will be brought into the WorkFlow. If a document is added with today’s date, it will not enter the WorkFlow.
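
The range logic in this example can be sketched in Python as follows. This is only an illustration of the arithmetic, not how the automation server is implemented: the function names are hypothetical, and the handling of the year offset (including February 29) is an assumption.

```python
from datetime import datetime, timedelta


def years_ago(moment, years):
    """Subtract whole calendar years (assumed behaviour of the -50Y offset)."""
    try:
        return moment.replace(year=moment.year - years)
    except ValueError:
        # February 29 in a non-leap target year: fall back to February 28
        return moment.replace(year=moment.year - years, day=28)


def enters_workflow(date_field, now):
    """True if date_field lies between now minus 50 years and now minus 90 days."""
    range_from = years_ago(now, 50)       # [CURRENTDATETIME-50Y]
    range_to = now - timedelta(days=90)   # [CURRENTDATETIME-90D]
    return range_from <= date_field <= range_to


now = datetime(2024, 6, 1, 12, 0)
print(enters_workflow(datetime(2024, 1, 1), now))  # older than 90 days
print(enters_workflow(now, now))                   # dated today
```

Because the range is recomputed from the current time on every check, a document dated today falls outside the range now but will fall inside it once 90 days have passed.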

General information about redaction and how to redact a sensitive document

Redaction is the process of removing information from documents, typically confidential information, before final publication. It is most common in the legal industry, where names and personal information are removed from a file before it is made accessible to the public. During the redaction process, words can be blacked out to make them unreadable and unsearchable.

How to use redaction:

  1. Load, recognize and optionally proof and edit a document for redaction.
  2. Save the recognition results to have a clean copy of the original document (recommended).
  3. To mark for redaction by searching, click “Edit > Find and Mark Text”.
  4. On the “Mark Text” tab, enter a search string and click “Find Next”. To mark this occurrence and move to the next one, click “Mark for redacting”; to skip the occurrence click “Find Next”.
  5. Continue until there are no more occurrences. Perform new searches as desired.
  6. Review the marking in the Text Editor. If the Mark Text toolbar is not visible, click “View > Toolbars” and enable “Mark Text”.
  7. Click the “Mark for Redacting” button under the Text Editor tab, then select any other text strings you want marked.
  8. Click the button again to finish marking. To remove a marking, select it and click the “Mark for Redacting” button again.
  9. Save the OmniPage document (recommended), then export the marked recognition results to a convenient file type, e.g. PDF. You can pass this file to colleagues for confirmation of the marking.
  10. When the marking is reviewed, open the OmniPage document (if necessary) and make changes as required.
  11. To complete the redaction process, press the “Redact Document” button. A dialog box gives you the chance to have redaction applied to a copy. Choose this to get two documents, one marked for redaction and the new one completely redacted.  Then export these to store the marked version and distribute the redacted […]

Combining multiple documents into one PDF in order

How do I ensure that the list of documents I am combining will be in order when I right click “Combine files as one PDF”?

Pre-sort the documents before combining them, use Shift + click to select all the files, then right-click the first file in the list and select “Combine files as one PDF”. Note that if you right-click any other file in the group, the files will be combined starting at that file and not the first file.

 

Document Management OCR

Applications

When you scan a document that has text or numeric data on it, you are able to read and understand what is written in the scanned image. However, to a computer, the resulting image file is just as meaningless an assortment of pixels as a landscape photo. In order to transform this information into an editable format that you can search through, copy, and modify without retyping it manually, you will need Optical Character Recognition (OCR) software.

There is a wide variety of OCR software available. While all of these products can convert images of machine-printed (not handwritten) text or numbers into an editable format, they often differ in features, accuracy, price, and language options.

You can find the various types of OCR software with a description of each below.

Users within a single department, working from home or who have a small business can simply scan their documents to a folder that is shared to everyone. In this “ad-hoc” scenario you only need some basic document scanning software to simplify and bring consistency to your filing system.

If you want to move to the next level, there are Desktop Document Management options that provide an all-in-one means for capture, storage, search and retrieval of documents. Additionally, they provide security, advanced capabilities and ease of use above that of the ad-hoc methods.

And let’s not forget cloud-based options that alleviate the need to maintain storage servers or keep software up to date.

Need a simple, no frills OCR solution without spending hundreds of dollars on a professional software package? Look no further. There is a no cost, donation optional, OCR freeware solution for […]

2022-06-21T12:06:06-04:00

Convert Scanned Image to Text Document

The primary purpose of Optical Character Recognition is to quickly and automatically convert scanned images of machine-printed (typed) text – which to a computer are no more meaningful a collection of pixels than any other image, such as a landscape photo – into actual text data that you can search through and modify.

OCR software comes in many different types, which vary in price based on their features, speed, and accuracy. One of the main qualities that OCR vendors use to differentiate their products is the volume of documents the software allows you to process. That may seem counterintuitive, but the features needed to process hundreds, thousands or millions of pages a year are quite different.

If you need to scan several hundred pages (receipts, checks, medical, tax or legal forms, personal memorabilia) for personal use, you need light, versatile, easy-to-use, inexpensive software that simply converts images to text. It may not have automation features, so you will process the data further by hand. That is not too hard, since the volume of documents is small and you can treat each one individually.

Small business users usually process thousands of pages a year and require some automation features. Images need to be converted not just to text but also to spreadsheets for further processing. Once the system is set up, it is expected to run without much intervention, and the people in charge of document processing should be able to operate it with ease.

Larger companies processing millions of documents require much higher levels of automation, where each small, fine-tuned feature can save thousands of work hours in the long run. Multiple machines will be processing documents […]

OCR Servers

Enterprise OCR servers let you perform Optical Character Recognition on thousands of documents at a time, scaling to meet the demands of the largest document conversions.

Traditional Desktop OCR applications require a person to load the scanned document, run the OCR process and save the output files. This makes sense when you are converting individual documents, but large organizations with thousands or millions of documents need something much more automated and scalable.

OCR Server processing workflow

Typical Enterprise OCR Applications

As the cost of OCR software and hardware goes down each year and the quality goes up, full-text search is included in more and more records management applications. Typical applications include:

  • Data mining
  • Litigation support
  • Full-text searching
  • Document management

Features of Enterprise OCR Servers

  • OCR is performed in the background without a user interface
  • Files are imported automatically from hotfolders
  • Ability to use multiple CPUs and servers for processing
  • Management tools for remote administration
  • Web service & API integration to submit OCR jobs
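
To make the hotfolder idea concrete, here is a minimal sketch in Python of how a watched folder can be polled for new files. It does not reflect how any particular OCR server is implemented. The function name, the list of file extensions and the polling approach are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of input types; a real server would make this configurable
IMAGE_SUFFIXES = {".pdf", ".tif", ".tiff", ".jpg", ".png"}


def collect_new_files(hot_folder, already_seen):
    """Return files in hot_folder that have not been picked up before."""
    new_files = []
    for path in sorted(Path(hot_folder).iterdir()):
        if (path.is_file()
                and path.suffix.lower() in IMAGE_SUFFIXES
                and path not in already_seen):
            already_seen.add(path)  # remember it so it is submitted only once
            new_files.append(path)
    return new_files
```

A background service would call `collect_new_files` on a timer, hand each returned path to the OCR engine, and move the original to a processed or exceptions folder afterwards.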

What is the Best OCR Server?

The ABBYY FineReader Server offers the best combination of features, performance and pricing. It has flexible licensing, including an unlimited CPU-based license that does not limit the number of pages processed.

Foxit PDF Compressor has the lowest entry level pricing, OmniPage OCR and unique PDF compression technology that can dramatically reduce the size of searchable PDF documents, leading to faster viewing and lowered cloud storage and bandwidth costs.

The SimpleIndex Server offers affordable unattended OCR services coupled with advanced data extraction and indexing capabilities that organize documents automatically or save metadata to Excel or a SQL database. It doesn’t have the scalability, API interfaces or compression technology that other OCR servers have, but you can bundle the Standard Server version with them to add indexing, […]

Document Scanning

One Source, Many Solutions

There are many document scanning solutions to choose from. ScanStore offers many of the top document imaging solutions under one virtual roof. ScanStore’s CDIA+ consultants can work with you to explain the strengths and weaknesses of each option and even provide a demo of the products using samples that you provide.

You’ll find flexibility with each of these products allowing a one-person shop to jump right in, or scale up to enterprise or service bureau proportions. If you need to throw some data capture into the document imaging mix, ScanStore also carries OCR, forms processing and document management tools.

Information and Advice

Take a look at the Scanning Solutions Comparison page to find in-depth information on the features of the available offerings and for more insight in finding the best fit.

And be sure not to miss the detailed comparison of the favorite Batch Scanning solutions in the exclusive Document Scanning Software Review.

What’s Right for You

You want a paperless office and document scanning is part of the path to get you there. Simply buying a scanner and feeding paper into it isn’t going to save you money. Automation of the scanning process is what holds costs down and drives up your Return on Investment.

For example, if an OCR automation costs $3,000 to implement, but by doing so you save a $15/hr employee 10 hours per week of data entry, the feature has paid for itself in 20 weeks.
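
The same payback calculation can be written as a small function so you can plug in your own numbers. This is a simple sketch: the function and parameter names are illustrative, and it ignores factors such as maintenance costs or employee overhead.

```python
def payback_weeks(implementation_cost, hourly_wage, hours_saved_per_week):
    """Weeks until the labor savings equal the implementation cost."""
    weekly_savings = hourly_wage * hours_saved_per_week
    return implementation_cost / weekly_savings


# The example above: $3,000 cost, $15/hr employee, 10 hours saved per week
print(payback_weeks(3000, 15, 10))  # 20.0
```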

So how do we automate the data capture? Here are a few possibilities:

  • Full-Page OCR turns a scan into a full-text document you can search

  • Barcodes on each document contain key data like a customer name or invoice number

  • A single field […]

Document Management

The phrase “document management” is rather broad and can apply to a variety of scenarios depending on the needs (and size) of the business.

Small businesses and departments may only need a system that provides an efficient way to scan paper and save it in an orderly, intuitive structure.

Most projects also require the ability to search and view documents in an integrated viewer or website, and provide ways to annotate images, making notes and markup that other users can see.

Likewise, we may be working with more than just digitized paper files: born-digital documents such as MS Office docs, PDFs, CAD drawings and graphics files may also need to be managed.

There can also be advanced records management requirements like access audit trails, document retention, lifecycle and workflow. These features are especially important when dealing with regulatory compliance such as HIPAA and Sarbanes-Oxley.

Our document management solutions can fit any budget or support any project requirements. It’s not always possible to do both at once, but we will try our best!

Contact Us for a free evaluation of your document management project and online demo of our software recommendations.

Personal & Small Business

Users within a single department, working from home or who have a small business can simply scan their documents to a folder that is shared to everyone. In this “ad-hoc” scenario you only need some basic document scanning software to simplify and bring consistency to your filing system. Our SimpleIndex software is a perfect all-in-one scanning and document management tool for this purpose.

If you want to move to the next level, there are Desktop Document Management options that provide an all-in-one means for capture, storage, search and retrieval of documents. These solutions are affordable and focused on automating process of organizing and […]

Forms Processing

What is ICR, Survey & Forms Processing?

ICR stands for Intelligent Character Recognition and is the technology that allows software to interpret hand printed text on scanned images.

Forms Processing Software uses ICR technology to automate data entry tasks involving hand-filled surveys, applications and forms. It provides interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Forms Processing also includes OCR (Optical Character Recognition) technology to recognize machine printed text, and OMR (Optical Mark Recognition) for check boxes and multiple choice bubbles.

It is also possible to use these applications to automate data collection from PDF forms, Word documents, Excel spreadsheets, and other formats used to fill out forms electronically. Many include the ability to publish forms as paper, fillable PDF and web pages simultaneously to distribute and collect data from multiple sources into one dataset.

Who can benefit from forms processing software?

Any organization that collects data on paper-based forms, surveys or applications on a regular basis can get a very high return on investment by automating the data entry with forms processing software.

You do need to have a significant number of forms to justify the expense– at least a hundred forms per month or more depending on how much data is being captured. If the data entry task can be done in under 100 man-hours then it is not a good candidate for automation with ICR software.

Organizations that have many separate departments that collect data on forms can share the budget for forms processing software by re-using it for other projects. Your current project may not be big enough to justify the expense, but when combined with one or two others it would be.

How much do […]

Simple Software

SimpleIndex can bring speed and efficiency to your scanning or doc filing no matter the process. Even if all you are doing is hand keying a few basic details about a document, breaking those details into individual indexes and adding tools like drop down choice lists, automatic orientation, and blank page deletion ensure a smoother, more consistent process.

Automation

Here’s where things start to get interesting. From basic tasks like splitting individual documents within a stack of pages by spotting a blank page, a specific mark, or a barcode separator to capturing index data directly from the page or looking up additional details about a document in a database, SimpleIndex has a host of powerful tools to tame your piles of paper or drives full of digital files. Let’s look at a few.

OCR

Optical Character Recognition is the ability to take a scan, which is merely a picture of a page, and turn it into words that the computer can understand and use to index your files. SimpleIndex leverages the power of ABBYY FineReader, recognized as one of the best OCR engines on the market, to accurately capture names, dates, important numbers, document types, and other details about your file. Some products have you set a box and capture whatever information happens to fall in that zone. SimpleIndex takes it further with Dynamic Zone OCR, which lets you set an oversized zone that allows for shifting of the pages between scans but still captures just the data you need by matching against templates, lists, or even Regular Expressions (RegEx). You can also skip the zones entirely and use the full text of a page to find matches for your index data.

Barcodes

Handprint Recognition Guide

What is ICR, Handprint Recognition?

ICR stands for Intelligent Character Recognition and is the technology that allows software to interpret hand printed text on scanned images.

Forms Processing Software uses ICR technology to automate data entry tasks involving hand-filled surveys, applications and forms. It provides interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Forms Processing also includes OCR (Optical Character Recognition) technology to recognize machine printed text, and OMR (Optical Mark Recognition) for check boxes and multiple choice bubbles.

Traditional forms processing relies on constrained handwriting, where boxes on the form force the filler to write with separated, printed block characters. Modern AI technology has dramatically improved the ability to recognize unconstrained handwriting and cursive script. Hand printed notes, free-form comment blocks, non-segmented fields, historic documents, and more can now be converted to text with acceptable accuracy, where this was impossible just a few years ago.

Who can benefit from handwritten recognition software?

Any organization that collects data on paper-based forms, surveys or applications on a regular basis can get a very high return on investment by automating the data entry with forms processing software.

You do need to have a significant number of forms to justify the expense, at least a hundred forms per month or more depending on how much data is being captured. If the data entry task can be done in under 25 working hours then it is probably not a good candidate for automation with ICR software.
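A quick back-of-the-envelope check can show which side of that threshold a project falls on. The sketch below is only an estimate under stated assumptions: the 25-hour cutoff comes from the paragraph above, while the function names, the fields-per-form count, and the keying speed of 5 seconds per field are hypothetical inputs you would replace with your own figures.

```python
def manual_entry_hours(form_count, fields_per_form, seconds_per_field):
    """Estimate the hours needed to key every field on every form by hand."""
    return form_count * fields_per_form * seconds_per_field / 3600


def worth_automating(hours, threshold_hours=25):
    """Apply the rule of thumb: below the threshold, manual entry is likely cheaper."""
    return hours >= threshold_hours


# Assumed project: 1,200 forms, 30 fields each, 5 seconds to key each field.
large_job = manual_entry_hours(form_count=1200, fields_per_form=30, seconds_per_field=5)

# Assumed small project: 100 forms, 10 fields each, same keying speed.
small_job = manual_entry_hours(form_count=100, fields_per_form=10, seconds_per_field=5)
```

Under these assumptions the large job works out to 50 hours of keying and clears the threshold, while the small job needs well under two hours and does not.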

Organizations that have many separate departments that collect data on forms can share the budget for forms processing software by re-using it for other projects. Your current project may not be big enough to justify […]

Enterprise OCR Applications

Enterprise OCR refers to applications designed with the features and scalability required for large businesses and service operations.

Speed and efficiency are the name of the game at the enterprise level, so options like batch processing, multi-user and multi-server workflows, and security and compliance auditing are found in these applications.

Enterprise OCR can also refer to Enterprise Site Licensing for desktop OCR applications that allow any user in your organization to install licensed OCR tools without incremental costs. Contact Us for a quote on any Site License.

Enterprise Document Management

With the high volume of documents coming out of an enterprise OCR product, there is a need for robust Document Management applications with enhanced features that cover the stricter oversight needs of large organizations. Sorting through thousands or millions of pages can quickly turn digital documents into a quagmire without proper organization, tagging, search and workflow capabilities.

Enterprise Document Management features include:

  • Digital signatures
  • Document life cycle management
  • Version control
  • Advanced keyword searching & full-text indexing
  • Audit trails (HIPAA, Sarbanes-Oxley compliance)
  • Email archiving
  • Workflow routing
  • Enterprise Report Processing (ERP)
  • Document access control

Our document management solutions work with any of the enterprise OCR products below to provide a secure end-to-end solution. Contact Us to see how they work together in an online demo or get a quote.

OCR Data Capture

What is OCR Data Capture?

OCR stands for Optical Character Recognition and is the technology that allows software to interpret text on scanned images. When this technology is applied to automating business data entry processes it’s referred to as OCR Data Capture.

Many are familiar with popular desktop OCR applications designed to convert scanned images to editable documents. When this process is applied to specific areas of the document containing data fields it’s called zone OCR. But OCR data capture software is more than just simple zone OCR. Modern applications use some or all of these technologies:

Enterprise data capture systems provide interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Who can benefit from OCR data capture software?

Any organization that collects data from paper documents, or electronic files like PDF and Office documents, can get a very high return on investment by automating the data entry with OCR data capture software.

You do need to have a significant number of documents to […]

OCR Consulting Services

OCR Experts for Any Project

Our unique team of OCR experts is equipped to help out with OCR projects of any size or complexity. We have support specialists who can remotely configure desktop solutions in a matter of minutes and expert systems integrators with years of programming, database design, and robotic process automation experience.

Desktop OCR

Use our online store to order desktop OCR applications and our staff will be happy to answer your setup questions via email or web chat.

Remote configuration and training services using GotoMeeting are available for a low hourly rate.

Let Us OCR That For You

Got a one-time conversion and don’t want the hassle of software? Upload your scanned document to us and we’ll send back the converted files. An optional verification service corrects recognition errors and layout issues for a low hourly rate.

Data processing for forms, reports, directories, and other documents is also available with output to CSV, Excel, XML, JSON, SQL, etc.

Contact us and, if possible, provide a sample, the total page count, the desired output, and whether you want us to correct the results after OCR, and we’ll reply with a quote right away. Prices start at $50 for up to 1,000 pages.

Batch Scanning & OCR Servers

Automate document scanning and digital document archival processes using zone OCR, barcode recognition, database integration and other technologies.

Small business systems and single document workflows can be set up remotely via GotoMeeting, usually in just a few hours. Chat now if we’re online or leave a message to schedule a consultation.

Data Capture and Forms Processing

Advanced data extraction solutions that can turn the most complex documents into structured data ready […]
