The phrase “document management” is rather broad and can apply to a variety of scenarios depending on the needs (and size) of the business. In some instances we are talking about a system that merely gives us an efficient way to scan paper and save it out in an orderly, intuitive structure. Other times, we need to bring in search and viewing capabilities that are all handled inside a single piece of software. Likewise we may be working with more than just digital versions of paper files, such as MS Office documents, CAD drawings and graphics files. We can even run into the need for advanced features like document lifecycle management and workflow. These features are especially important when regulations such as HIPAA and Sarbanes-Oxley apply, to keep prying eyes away from critical information, and to limit liability by removing old docs once they’ve passed their business or regulatory prime.

More information and side-by-side feature comparisons are available on the Document Management Comparison page.

Combining multiple documents into one PDF in order

How do I ensure that the documents I am combining stay in order when I right-click and choose “Combine files as one PDF”?

Pre-sort the documents before combining, use Shift + click to select all of the files, then right-click the first file in the list and select “Combine files as one PDF”. Note that if you right-click any other file in the group, the files will be combined starting at that file and not at the first file.

 

Document Management OCR

General information about redaction and how to redact a sensitive document

Redaction is the process of removing information, typically confidential information, from documents before final publication. It is most common in the legal industry, where names and personal information are removed from a file before it is made accessible to the public. During the redaction process, words can be blacked out to make them unreadable and unsearchable.

How to use redact:

  1. Load, recognize and optionally proof and edit a document for redaction.
  2. Save the recognition results to have a clean copy of the original document (recommended).
  3. To mark for redaction by searching, click “Edit > Find and Mark Text”.
  4. On the “Mark Text” tab, enter a search string and click “Find Next”. To mark this occurrence and move to the next one, click “Mark for redacting”; to skip the occurrence click “Find Next”.
  5. Continue until there are no more occurrences. Perform new searches as desired.
  6. Review the marking in the Text Editor. If the Mark Text toolbar is not visible, click “View > Toolbars” and enable “Mark Text”.
  7. Click the “Mark for Redacting” button under the Text Editor tab, then select any other text strings you want marked.
  8. Click the button again to finish marking. To remove a marking, select it and click the “Mark for Redacting” button again.
  9. Save the OmniPage document (recommended), then export the marked recognition results to a convenient file type, e.g. PDF. You can pass this file to colleagues for confirmation of the marking.
  10. When the marking is reviewed, open the OmniPage document (if necessary) and make changes as required.
  11. To complete the redaction process, press the “Redact Document” button. A dialog box gives you the chance to have redaction applied to a copy. Choose this to get two documents, one marked for redaction and the new one completely redacted.  Then export these to store the marked version and distribute the redacted […]

How to use the [CURRENTDATETIME] tag in WorkFlow Pre-Conditions

The [CURRENTDATETIME] tag represents the current system time. It can be used in WorkFlow pre-conditions and in date criteria within Records Retention jobs.

When setting up Records Retention policies and WorkFlow definitions, it may be helpful to reference the current system time to determine whether documents should be selected. [CURRENTDATETIME] is the system time of the automation server at the moment it runs the specified operation, so the value is always changing. An offset can also be applied to the tag to build date ranges, for example:

[CURRENTDATETIME+1Y] = Current Date/Time plus 1 year

Example:

If you want to bring documents into a WorkFlow 90 days after the date in a specified date index field, set up the WorkFlow pre-condition so that the date field's From range is [CURRENTDATETIME-50Y] and its To range is [CURRENTDATETIME-90D]. Every time the WorkFlow checks for new documents, it uses the time the operation runs as the CURRENTDATETIME value. Any document whose date field falls between the current system time minus 50 years and the current system time minus 90 days will be brought into the WorkFlow. A document added with today’s date will not enter the WorkFlow.
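The date window in this example can be sketched in Python to check the arithmetic. This is only an illustration and not how the automation server evaluates the tag: the function names are hypothetical, the range is assumed to be inclusive at both ends, and the handling of February 29 is an assumption.

```python
from datetime import datetime, timedelta


def minus_years(moment: datetime, years: int) -> datetime:
    """Subtract whole years, falling back to Feb 28 when Feb 29 does not exist."""
    try:
        return moment.replace(year=moment.year - years)
    except ValueError:
        return moment.replace(year=moment.year - years, day=28)


def enters_workflow(field_date: datetime, now: datetime) -> bool:
    """True when field_date lies between now minus 50 years and now minus 90 days."""
    range_from = minus_years(now, 50)      # stands in for [CURRENTDATETIME-50Y]
    range_to = now - timedelta(days=90)    # stands in for [CURRENTDATETIME-90D]
    return range_from <= field_date <= range_to


# "now" is passed in explicitly so the check is repeatable.
now = datetime(2024, 6, 1, 9, 0)
print(enters_workflow(datetime(2024, 1, 15), now))  # date field is 138 days old
print(enters_workflow(now, now))                    # date field is today
```

Because the upper bound moves forward each time the operation runs, a document that falls outside the window today can fall inside it on a later run.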

Regular Expression to Validate Date Formats

Within PaperVision Capture, regular expressions can be used to validate batch names and index fields populated by a user or an OCR process. Below are a few regular expressions that validate some common date formats:

Format 1:

MM/DD/YY HH:MM AM/PM

Regular Expression:

^([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])\/\d{2}(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$

Examples:

12/31/02
12/31/02 08:00
12/31/02 08:00 AM

Format 2:

DD/MM/YYYY HH:MM AM/PM

Regular Expression:

^([0-2]\d|[3][0-1])\/([0]\d|[1][0-2])\/\d{4}(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$

Examples:

31/12/2002
31/12/2002 08:00
31/12/2002 08:00 AM

Format 3:

 

YYYY/MM/DD HH:MM:SS

Regular Expression:

^\d{4}\/([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$

Examples:

2002/02/03
2002/02/03 12:12:18
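Before entering an expression in PaperVision Capture, it can help to try it against a few sample values. The following is a minimal sketch using the Format 3 expression with Python's re module. Python is used here only for convenience, and Capture's own regular expression engine may behave differently in edge cases. The function name is hypothetical.

```python
import re

# Format 3 expression (YYYY/MM/DD HH:MM:SS), copied verbatim from above.
FORMAT_3 = re.compile(
    r"^\d{4}\/([0]\d|[1][0-2])\/([0-2]\d|[3][0-1])"
    r"(\s([0]\d|[1][0-2])(:[0-5]\d){1,2})*\s*([aApP][mM]{0,2})?$"
)


def is_valid(value: str) -> bool:
    """Return True when the whole value matches the Format 3 expression."""
    return FORMAT_3.match(value) is not None


samples = ["2002/02/03", "2002/02/03 12:12:18", "2002/13/03", "02/03/2002"]
for sample in samples:
    print(f"{sample!r}: {is_valid(sample)}")
```

Note that the hour group in these expressions only accepts 00 through 12, so a 24-hour value such as 13:00:00 is rejected.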

Date Format OCR Capture

 

Scanning different sizes of paper within the same document

A document may contain pages of different sizes, and the user may want to scan every page at its original size in a single scanning pass.

If the scanner supports auto page size detection, this setting can be enabled through the scanner driver.

To enable auto page size detection:

  1. From the Operator Console’s Scanner menu, select Scanner Settings.
  2. Choose the scanner from the drop-down list.
  3. Click the Properties button to the right of the scanner name.
  4. If available, enable the Auto Page Size Detection setting.

Note: Not all scanner models have an auto page size detection setting.

How to configure a Batch Splitting step to split on a blank value

In PaperVision Capture, a batch splitting step can be configured to split on one or more conditions. In some cases it may be desirable to split a batch based on a blank value within an index field. This can be achieved by using a String Comparison or a Regular Expression.

The following steps should be used to configure batch splitting using a blank value. Note: These steps assume you will be splitting the batch based on an index field called “ExampleIndexField”. The index field should already exist in the job.

To split the batch on a blank value using the String Comparison type:

  1. Set up the Target Job Configuration.
  2. Add a batch split step.
  3. Add a New Condition.
    • The condition source: Capture Index
    • Choose Capture Index: “ExampleIndexField”
    • Choose Comparison Type: String Comparison
    • Leave the drop-down on the equal sign “=” and leave the text box blank.
    • Click Finish
  4. The condition should read (CI.ExampleIndexField = “”)

 

To split the batch on a blank value using the Regular Expression Comparison type:

  1. Set up the Target Job Configuration.
  2. Add a batch split step.
  3. Add a New Condition.
    • The condition source: Capture Index
    • Choose Capture Index: “ExampleIndexField”
    • Choose Comparison Type: Regular Expression
    • Input the Regular Expression that matches an empty value or a value made up only of whitespace characters: ^\s*$
    • Click Finish
  4. The condition should read (CI.ExampleIndexField RegEx.Match(“^\s*$”))
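To see which values the ^\s*$ expression treats as blank, it can be tried outside Capture first. This is a minimal sketch with Python's re module, which is an assumption made for convenience. The function name is hypothetical, and Capture's regular expression engine may differ in edge cases.

```python
import re

# Matches an empty string or a string made up only of whitespace characters.
BLANK = re.compile(r"^\s*$")


def is_blank(value: str) -> bool:
    """Return True when the value is empty or whitespace-only."""
    return BLANK.match(value) is not None


for value in ["", "   ", "\t", "INV-001", " x "]:
    print(repr(value), is_blank(value))
```

In Python, a value of three spaces is not equal to an empty string, so the regular expression form may be the safer choice if index values can contain stray whitespace. Whether Capture's String Comparison trims whitespace is not stated here.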

PDF Processing with FineReader and FineReader Server

How to create a PDF from Microsoft® Word, Excel, or PowerPoint

 

How to convert emails to PDF

 

How to Split a PDF

Create new PDF documents, or separate PDF documents that were combined into one, with FineReader PDF 15.

Learn how to split PDFs and extract pages easily.

 

 

How to create and edit interactive PDF forms

Watch this video and see how to edit and create interactive PDF forms quickly and easily.

The Form Editor tool in FineReader PDF 15 lets you create and edit fillable PDF forms with text and date fields, dropdown lists, list boxes, checkmarks, radio buttons, signature fields and action buttons. Collect information and create effective document templates with ease!

 

How to extract text from scanned PDFs

 

 

How to extract tables

 

 

How can I verify if the digital signature is valid?

If you open a document with a valid digital signature in FineReader, you will see a green Valid notification in the left panel of ABBYY FineReader PDF 15.

Recognizing a document with existing text layer in FineReader PDF 15

  1. Open FineReader PDF 15.
  2. Go to Tools > Options > OCR.
  3. Under PDF recognition mode, select the Use OCR option.
  4. Click OK.
  5. Recognize your document again.

 

 

How to convert a document into an accessible PDF/UA

Make your mixed documents (PDFs, scans, photographs, or paper) digital and accessible.

In this […]

Reading Barcodes with Digitech PaperFlow and PaperVision Capture

Does processing barcodes “on-the-fly” make any difference in speed or recognition?

On-the-fly processing is actually a preferable way of reading barcodes since it does not noticeably decrease scan speed. The recognition will be the same whether the barcode is processed on the fly or as a post-process.

 

Using ABBYY Vantage Document Skills

Processing Your First Documents with Vantage

Learn how easy it is to get started with Vantage – upload your documents and Vantage will take care of the rest.

 

How to Create and Train a Vantage Document Skill

Learn how to use the Vantage Skill Designer to create and train a new Document Skill with just a few sample documents.

 

How to Create and Train a Classification Skill in ABBYY Vantage

Learn how to use the Vantage Skill Designer to train a new Classification Skill. You need just a few samples of each document class.

 

 

How to Automate a Complete Workflow by Creating a Vantage Process Skill

 

 

How to Edit a Document Skill

Learn how to adapt already existing skills to your specific documents and business requirements.

 

 

How to perform the first authentication in Vantage Swagger UI?

To get the first access token, perform the initial authentication using the default client. You do not need to enter a password or client ID, because the initial authentication is preconfigured. Open a Swagger page (EU link or US link) and click Authorize.

Select all scopes, and click Authorize again.

A password needs to be specified only for a custom client. A custom client can be created after the initial authentication.

References

EU Help: Getting a Tenant Identifier or US Help: Getting a Tenant Identifier

EU Help: Creating a Client or US Help: Creating a Client

Learn more at ABBYY […]

Using SharePoint in FineReader Server

Saving documents to SharePoint in ABBYY Recognition Server

Note: In order to be able to communicate with the SharePoint Server, the Server Manager and the Remote Administration Console require Microsoft .NET Framework 4.5 to be installed.

To be able to save output documents to a SharePoint Server library, the ABBYY Recognition Server Server Manager service must be run under a user account that has read/write access to the SharePoint Server library. If during the installation, you chose to run the service under a Local System account, you should restart it under a user account.

To set up the publishing of documents to a SharePoint Server library:

  1. Run the Remote Administration Console under a user account that has read/write access to the SharePoint Server library.
  2. Create a new workflow or modify an existing one (see Creating a New Workflow). In the Output Format Settings dialogue select Save output file in SharePoint library.
  3. Enter the URL of the SharePoint Server site (e.g. http://myportal/mysite/) and click Connect. The Remote Administration Console will try to connect to the specified site and download the list of document libraries and folders from there. If the connection is successful, you will see the “Connected” message below the button, and the names of the document libraries will appear in the Select document library list.
  4. Select the document library from the list. Click the Settings… button to associate a document’s metadata fields with the corresponding columns of the selected document type in SharePoint.
  5. Select the folder in the document library using the Browse… button, or leave the field empty to save documents in the root folder.
  6. Click OK in the Output Format Settings dialogue box.

If the Input folder has several subfolders containing image files, the output files will be saved in […]

Document Processing via Email in FineReader Server

In this video, learn how to configure a workflow for document input and processing via e-mail on FineReader Server.

See how to configure this workflow in a few simple steps. You can even edit the e-mail subject and message. This video also shows how to set up the usage scenario “Centralized document conversion service”.

From document input through processing to document output, FineReader Server is designed to simplify, optimize and speed up your workflows. Scalable and easy to configure, FineReader Server can adapt to all your needs.

 

How to convert several emails from MS Outlook into PDF?

In order to convert several emails into a PDF file, you may use the virtual printer PDF-XChange 5.0 for FineReader.

Follow the steps below:

  1. Select the needed emails in MS Outlook.
  2. Click File > Print.
  3. Select PDF-XChange 5.0 for FineReader as the printer and click Print.

  4. Save the PDF file.

 

How to set up the import from the Gmail mailbox using the IMAP Image Import Profile?

  1. In your Gmail, create the folder (mailbox) that you want to import from.
  2. In your Gmail, create the folders (mailboxes) for the Exceptions and Processed emails.
  3. Enable the IMAP protocol in Gmail settings.
  4. Turn on the Less secure app access option in the Security section of Google Account Settings.
  5. Create a new Image Import Profile (open the project in the Project Setup Station > Project > Image Import Profiles > New). Choose Hot Folder: IMAP Server.
  6. Specify the address of the IMAP server: imap.gmail.com.
  7. Click settings and specify your Gmail login and password. Choose Type of encrypted connection: SSL.


  8. Click Browse and select […]
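If the import profile cannot connect, it may help to confirm the server address, SSL setting, login and folder names outside the application. The following is a minimal sketch using Python's standard imaplib module. It is not part of the product, the function names are hypothetical, and port 993 is the standard IMAP-over-SSL port rather than a value taken from the steps above.

```python
import imaplib

IMAP_HOST = "imap.gmail.com"  # server address from the steps above
IMAP_SSL_PORT = 993           # standard IMAP-over-SSL port (assumption)


def list_mailboxes(conn, user: str, password: str) -> list:
    """Log in on an open IMAP connection and return the raw mailbox listing."""
    conn.login(user, password)
    try:
        status, data = conn.list()
        if status != "OK":
            raise RuntimeError(f"LIST failed: {status}")
        # Each entry is a bytes line naming one mailbox (folder).
        return [line.decode("utf-8", "replace")
                for line in data if isinstance(line, bytes)]
    finally:
        conn.logout()


def check_gmail(user: str, password: str) -> list:
    """Open an SSL connection to the Gmail IMAP server and list its mailboxes."""
    conn = imaplib.IMAP4_SSL(IMAP_HOST, IMAP_SSL_PORT)
    return list_mailboxes(conn, user, password)
```

Calling check_gmail with your own Gmail login and password should return a listing that includes the import folder and the Exceptions and Processed folders created in the first two steps.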

ImageSilo Direct

What if you could include the critical business information you’re currently storing in paper files in your ImageSilo cloud information management service? Scan, import, index, and organize paper documents using your existing scanners and multi-function devices (MFD) to create convenient digital files and securely upload them to the cloud.

Start scanning documents right at your desk! Turn any vulnerable paper document into a useful digital file that can be securely managed in your ImageSilo cloud service.

PaperVision Direct

What if you could include the critical business information you’re currently storing in paper files in your PaperVision®.com cloud information management service? Scan, import, index, and organize paper documents using your existing scanners and multi-function devices (MFD) to create convenient digital files and securely upload them to the cloud.

Start scanning documents right at your desk! Turn any vulnerable paper document into a useful digital file that can be securely managed in your PaperVision.com cloud service.

Robotic Process Automation

Introducing Robotic Process Automation

RPA stands for Robotic Process Automation and it represents a new approach to business automation that helps minimize the technical hurdles required for implementing new workflows.

Robotic Process Automation of Data Entry

Traditional business process automations rely on application programming interfaces (APIs) to allow systems to exchange data. This approach has two main drawbacks:

  1. The application vendor must make those APIs available
  2. A programmer needs to write custom code to interface with them

If your software vendor does not provide an interface for consuming the data you need to automate, then you’re out of luck. And even if they do, the development costs can eliminate the ROI if the transaction volume isn’t large enough.

RPA tools avoid the API problem by interfacing directly with the application user interface, just as a human would. They use artificial intelligence and machine learning to “watch” the operator perform a task within the application, then create their own program (called a “bot”) to mimic it. This means that:

  1. Bots can do anything a human can do within the application
  2. Users can create a bot without writing code

Practically speaking, an experienced robotic process automation consultant with programming experience is required to roll out an RPA solution enterprise-wide, and most users will only be able to automate small, routine tasks without assistance. Business-critical, high-volume automations will still involve coding. But RPA dramatically reduces the implementation time and avoids the need to retrofit APIs for software applications that were not designed to support them.

Using RPA with OCR Data Capture

OCR Data Capture is one of the most common business processes to automate with RPA. Taking data stored in paper or electronic documents and […]

Computhink ContentVerse (Cloud or On-Premise)

Computhink’s ContentVerse® provides Document Management Solutions that integrate into your existing infrastructure, supplying the missing piece of the puzzle for secure information sharing and compliance, targeting small and medium size organizations. Computhink’s ContentVerse streamlines business processes, improves customer service, reduces costs and ensures compliance.

PaperVision Enterprise (Cloud or On-Premise)

PAPERVISION performs dependable electronic document management to automate office environments, conserve paper, time, money, and provide peace of mind. Retrieval solutions provide enterprise scalability and functionality with advanced features that enhance your efficiency and protect corporate data.

Electronic information is retrieved instantly with our user-friendly graphical interface that displays a complete overview of all your available projects. View, manipulate, print, fax, export, and e-mail documents directly from your PC.

On-premise installation or cloud hosted services available.

OCR Consulting Services

OCR Experts for Any Project

Our unique team of OCR experts is equipped to help with OCR projects of any size or complexity. We have support specialists who can remotely configure desktop solutions in a matter of minutes and expert systems integrators with years of programming, database design, and robotic process automation experience.

Desktop OCR

Use our online store to order desktop OCR applications and our staff will be happy to answer your setup questions via email or web chat.

Remote configuration and training services using GotoMeeting are available for a low hourly rate.

Batch Scanning & OCR Servers

Automate document scanning and digital document archival processes using zone OCR, barcode recognition, database integration and other technologies.

Small business systems and single document workflows can be set up remotely via GotoMeeting, usually in just a few hours. Chat now if we’re online or leave a message to schedule a consultation.

Data Capture and Forms Processing

We build advanced data extraction solutions that can turn the most complex documents into structured data ready for use in business applications. Each member of our data capture consulting team has over 10 years of experience designing and implementing advanced OCR solutions.

We are the most experienced system integrator in the US for our flagship data capture platform, ABBYY FlexiCapture. We saw its potential immediately when it was introduced and now over 15 years later it is the leading data capture solution and no team is more experienced than ours at implementing it. We are the ones that other ABBYY integrators call for their most complex implementations.

While we have designed capture solutions for all types of documents, we have particular expertise in the following areas:

    […]

Why are the prices of OCR applications so different?

OCR software ranges in price from freeware all the way up to tens of thousands of dollars. What explains the difference between these applications? Here’s the breakdown:

  • OCR Freeware uses the SimpleOCR or Tesseract engines and provides limited scanning and output format capabilities. Recognition quality is generally poor except for the highest quality document images.
  • PDF OCR Converters provide good quality OCR engines like ABBYY, IRIS and OmniPage, but limit the output to searchable PDF files. These cost less than $100.
  • Standard OCR applications range from $100-$200 and provide full OCR capabilities including converting scans to Word, Excel, HTML and other editable formats.
  • Corporate OCR applications add advanced features like automated hotfolder processing, concurrent licensing and other features useful for business applications. Pricing for these is $200-$500.
  • OCR Servers provide scalable, enterprise OCR services for processing very high volumes of documents or providing OCR capabilities to users throughout the organization. Prices start around $1,500 and go up based on processing volume.
  • Enterprise Data Capture and Forms Processing applications are used to capture structured data from complex documents like healthcare claim forms and invoices that include things like tables, handwriting, checkboxes, and movable zones. These solutions can cost anywhere from around $1,000 to hundreds of thousands of dollars depending on the document volume and complexity of the project.

Tungsten Kofax PaperPort Professional

Tungsten Kofax PaperPort Professional empowers your organization to take control of document management beyond the desktop. With Tungsten PaperPort Professional, office workers or individual professionals can save time and money with instant access to all documents—anytime, anywhere.

Tungsten Kofax PaperPort Standard

Tungsten Kofax PaperPort Standard allows individuals and small organizations to scan, share, search and organize documents in a simple, integrated solution. With Tungsten PaperPort, you can take individual information management to new levels of productivity and security using the ultimate digital filing cabinet.

SimpleIndex Barcode Suite

Simple Software SimpleIndex Product Suites offer you a better deal on bundles of essential products.

SimpleIndex Barcode Suite combines the best Simple Software products to create a complete Barcode OCR solution. It includes:

  • SimpleIndex Barcode Server license with built-in Accusoft barcode engine and server functionality.
  • SimpleSend solution enables automated sending of document files via secure FTP or email. SimpleSend enhances the functionality of SimpleIndex in several ways as well as functioning as a standalone application.
  • SimpleExport license is designed to convert any delimited text file into any XML or formatted text file format using XSLT. It automates the process of applying XSLTs, especially for document imaging applications where the data has matching files that must be moved or renamed along with the data.
  • 5 licenses of SimpleCoversheet which is designed to work with data sources like SQL databases, spreadsheets and text files to dynamically build lists of barcodes to print. This is especially useful in document scanning applications where barcodes are used to identify and file documents automatically.

Enterprise OCR Applications


Enterprise OCR refers to applications designed with the features and scalability required for large businesses and service operations.

Speed and efficiency are the name of the game at the enterprise level, so these applications include options like batch processing, multi-user and multi-server workflows, security, and compliance auditing.

Enterprise OCR can also refer to Enterprise Site Licensing for desktop OCR applications that allow any user in your organization to install licensed OCR tools without incremental costs. Contact Us for a quote on any Site License.


Enterprise Document Management

With the high volume of documents coming out of an enterprise OCR product, there is a need for robust Document Management applications with enhanced features that cover the stricter oversight needs of large organizations. Sorting through thousands or millions of pages can quickly turn digital documents into a quagmire without proper organization, tagging, search and workflow capabilities.

Enterprise Document Management features include:

  • Digital signatures
  • Document life cycle management
  • Version control
  • Advanced keyword searching & full-text indexing
  • Audit trails (HIPAA, Sarbanes compliance)
  • Email archiving

  • Workflow routing
  • Enterprise Report Processing (ERP)
  • Document access control

Our document management solutions work with any of the enterprise OCR products below to provide a secure end-to-end solution. Contact Us to see how they work together in an online demo or get a quote.

SimpleView

Application for managing and viewing scanned documents, images and PDF files.

Unlike other freeware PDF viewers, SimpleView is designed to work with many files at once instead of one at a time. The free version also supports TWAIN scanning and the ability to move, rearrange and rotate pages.

SimpleIndex OCR Server 1M PPY

SimpleIndex OCR Server 1 million pages per year – ABBYY FineReader OCR Server

Document capture solution with a one-click interface that automates your scanning and document filing by creating easy-to-find electronic content, saving you time and money.  It’s highly customizable to meet even the most detailed needs, with top quality technicians to support your requirements.
