Convert PDF files to text or Word documents that can be edited, or extract structured data to Excel, CSV, XML, JSON or SQL databases. Affordable desktop solutions, enterprise OCR servers and data capture solutions. SimpleOCR has optical character recognition solutions for any project and budget.

What Document Management Systems are supported by PowerPDF Advanced 2.1?

Q: What Document Management Systems are supported by PowerPDF Advanced 2.1?

A: Document Management System support in the Advanced Edition

  • Microsoft® SharePoint Server 2003, 2007, 2010 and 2013, Windows SharePoint Services (WSS) 3.0 and Microsoft Office 365
  • eDOCS DM (formerly Hummingbird Enterprise™) 5.3.1 and 10

Power PDF supports eDOCS systems if a DM Extensions API is installed and configured on the client machine. The product is also integrated into the Windows Explorer DM Extension.

  • Interwoven® WorkSite 8.3, 8.5 and 9.3

Power PDF supports Interwoven WorkSite systems if one of the following is installed and configured on the client machine: DeskSite 8.2 or FileSite 8.2 for WorkSite 8.3; FileSite 8.5 for WorkSite 8.5; or DeskSite 9.3 or FileSite 9.3 for WorkSite 9.2. The product is also integrated into the DeskSite and FileSite clients. The Nuance implementation warns if a requested document is checked out to another user.

  • Livelink® ECM – Enterprise Server 9.7.0 and 10 from OpenText Corp.

Power PDF supports LiveLink ECM if a LiveLink Explorer Professional Windows Client is installed and configured on the client machine.

  • OpenText Enterprise Connect

Power PDF supports OpenText Content Server through Enterprise Connect if the Enterprise Connect framework 10.5 or higher is installed and configured on the client machine.

  • NetDocuments SaaS cloud-based storage.

Save files to this web-based storage facility, providing Software as a Service (SaaS).

  • Worldox® GX3 and GX4
  • EMC2® Documentum 6.7 and 7.1

Power PDF supports EMC2 Documentum if a DFC 6.5 client is installed and configured on the client machine.

  • Xerox DocuShare 6 and 6.5
  • OnBase 13+

To work with Hyland’s OnBase from Power PDF, the Nuance module must be licensed on the OnBase server.


How to have more control over the OCR process in PowerPDF

Q: How to have more control over the OCR process in PowerPDF?  For example, to edit the text in the OCR layer to correct mistakes.

A: As designed, Nuance PowerPDF does not offer this functionality.

Nuance Power PDF has a powerful built-in OCR engine, but it offers only limited control over the OCR process. To accomplish what the client is requesting, you would need a specialized Optical Character Recognition (OCR) program such as Nuance® OmniPage®.

Nuance® OmniPage® has many advantages if you want more control over the OCR process:

  • Choose from four formatting levels instead of two (see below)
  • Gain full control over the OCR process, including:
    • The ability to manually zone pages
    • Access to multi-lingual spell checking and proofing
    • Dynamic verifier image display to speed up editing
    • Voice readback facility
    • And much more.
  • Scan new pages into the converted document
  • Add new pages from fax, image files or digital cameras
  • Save to other formats, including OmniPage’s internal format for document sharing with other OmniPage users.

The four formatting levels offered for saving in OmniPage are:

  1. Flowing Page: The pages retain the layout of the originals. Graphics and framed elements are placed in text boxes. Whenever possible, other text is transferred without using text boxes. Power PDF offers this under the name Flowing Column.
  2. True Page: The pages retain the layout of the originals, but all elements are placed in text boxes, including text in columns. Power PDF offers this formatting.
  3. Formatted Text: Text is decolumnized, but text attributes, graphics and tables are retained.
  4. Plain Text: Text is decolumnized and rendered as plain text. Graphics and tables are retained, but not in their original locations. This option is convenient for users who want to reformat the content.

 

Configuring Nuance PDF output settings to include more than 500 pages per PDF

By default, each PDF generated by the Nuance Full-Text step can contain at most 500 pages. Use the following steps to raise that limit.

This change should be made on all Capture Automation servers that will be generating PDFs.

  1. Make a copy of the ClientSettings.xml file located at C:\Program Data\Digitech Systems
  2. From the desktop, click Start > Run > type services.msc and press <Enter>
  3. Highlight and right-click the PaperVision ProcessInitiator1 service and choose Stop
  4. Edit the ClientSettings.xml file using Notepad
  5. Add the following line of text: <OCRFullTextMaxPagePerDoc>500</OCRFullTextMaxPagePerDoc>
  6. Change the value from “500” to the maximum number of pages each PDF should contain (e.g. 750)
  7. Save and close the ClientSettings.xml file
  8. From the desktop, click Start > Run > type services.msc and press <Enter>
  9. Highlight and right-click the PaperVision ProcessInitiator1 service and choose Start
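
The file edit in steps 1 and 4 to 7 can also be scripted. The following is a minimal sketch, not a Digitech-supplied tool. It assumes that the OCRFullTextMaxPagePerDoc element belongs directly under the root element of ClientSettings.xml (the steps above do not say where the line goes) and that the file uses no XML namespace; the function name set_max_pages is hypothetical. Stop the service before running it and start it again afterwards, as in steps 3 and 9.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def set_max_pages(settings_path, max_pages):
    """Back up the settings file, then set OCRFullTextMaxPagePerDoc."""
    path = Path(settings_path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(str(path), str(backup))  # step 1: keep a copy of the original

    tree = ET.parse(str(path))
    root = tree.getroot()
    # Assumption: the element is a direct child of the root element.
    node = root.find("OCRFullTextMaxPagePerDoc")
    if node is None:
        node = ET.SubElement(root, "OCRFullTextMaxPagePerDoc")
    node.text = str(max_pages)
    # Note: ElementTree drops XML comments and may change whitespace on save.
    tree.write(str(path), encoding="utf-8", xml_declaration=True)
    return backup
```

Calling set_max_pages with the full path to ClientSettings.xml and a value such as 750 leaves a .bak copy next to the original, so the change can be reverted by restoring that copy.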

PDF Processing with FineReader and FineReader Server

How to create a PDF from Microsoft® Word, Excel, or PowerPoint

 

How to convert emails to PDF

 

How to Split a PDF

Easily create new PDF documents, or separate PDF documents that were combined into one, with FineReader PDF 15.

Learn how to split PDFs and extract pages easily.

 

 

How to create and edit interactive PDF forms

Watch this video and see how to edit and create interactive PDF forms quickly and easily.

The Form Editor tool in FineReader PDF 15 lets you create and edit fillable PDF forms with text and date fields, dropdown lists, list boxes, checkmarks, radio buttons, signature fields and action buttons. Collect information and create effective document templates with ease!

 

How to extract text from scanned PDFs

 

 

How to extract tables

 

 

How can I verify if the digital signature is valid?

If you open a document with a valid digital signature in FineReader, you will see a green notification Valid on the left panel of ABBYY FineReader PDF 15.

Recognizing a document with existing text layer in FineReader PDF 15

  1. Open FineReader PDF 15.
  2. Go to Tools > Options > OCR.
  3. Under PDF recognition mode, select the Use OCR option.
  4. Click OK.
  5. Recognize your document again.

 

 

How to convert a document into an accessible PDF/UA

Make your mixed documents (PDF, scanned, photographed, or paper) digital and accessible.

In this […]

ABBYY Cloud OCR SDK

ABBYY® Cloud OCR SDK is a web-based document processing service that will enhance your enterprise software systems, SaaS platforms, or your mobile apps with the ability to convert documents and utilize textual information from scans, PDFs, document images, smartphone photos, or screenshots.

Combining ABBYY’s latest AI-based technologies for information extraction with the highly scalable processing power of the Microsoft® Azure® computing infrastructure, this secure and reliable ABBYY cloud service can be easily integrated into your application via a REST API—empowering it to precisely convert virtually any number of pages within the shortest amount of time.
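
As a rough illustration of what a REST integration involves, the sketch below builds the two request URLs of a submit-then-poll workflow and parses a task response. It is not ABBYY sample code. The endpoint names processImage and getTaskStatus, the language, exportFormat and taskId parameters, and the XML response shape are assumptions recalled from the service's version 1 API and should be checked against ABBYY's current documentation. The base URL depends on your account's processing location, so it is passed in by the caller, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def process_image_url(base_url, language="English", export_format="pdfSearchable"):
    """URL for submitting an image; the file itself is sent as the POST body."""
    query = urlencode({"language": language, "exportFormat": export_format})
    return f"{base_url.rstrip('/')}/processImage?{query}"


def task_status_url(base_url, task_id):
    """URL for polling the task created by the submit request."""
    query = urlencode({"taskId": task_id})
    return f"{base_url.rstrip('/')}/getTaskStatus?{query}"


def parse_task(xml_text):
    """Pull the task id, status and (once finished) result URL from a response.

    Assumed response shape: <response><task id=... status=... resultUrl=.../></response>
    """
    task = ET.fromstring(xml_text).find("task")
    return {
        "id": task.get("id"),
        "status": task.get("status"),
        "result_url": task.get("resultUrl"),
    }
```

An application would POST the image to the first URL with its account credentials, poll the second URL until the status reports completion, and then download the file from the result URL.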

How to scan documents to searchable PDF files

If you don’t already have a scanner, and scanning to searchable PDF files is the only thing you need to do, you will find many document scanners that can perform this function. Most desktop and high-speed document scanners come with software that has this basic capability. However, these often have limited functionality and you may prefer a more robust application.

To create searchable PDFs with any scanner, use Desktop OCR software applications like FineReader, ReadIRIS, or OmniPage. These programs can also be used to convert images to MS Word, Excel, and other editable formats.
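
The commercial products named above are driven through their own interfaces. For a scriptable illustration of the same image-to-searchable-PDF step, the sketch below uses the open-source Tesseract engine, which is mentioned elsewhere on this page, through its command-line tool. It assumes tesseract is installed and on the PATH; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def tesseract_pdf_command(image_path, lang="eng"):
    """Return the tesseract command that writes <image stem>.pdf next to the image."""
    image = Path(image_path)
    output_base = image.with_suffix("")  # tesseract appends ".pdf" itself
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def make_searchable_pdf(image_path, lang="eng"):
    """Run tesseract and return the path of the searchable PDF it produced."""
    subprocess.run(tesseract_pdf_command(image_path, lang), check=True)
    return Path(image_path).with_suffix(".pdf")
```

Recognition quality from a free engine depends heavily on image quality, so results on poor scans may differ from what the commercial engines produce.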

There are also more affordable PDF converters that have fewer OCR features and limit output to PDF files.

You can find a complete guide to OCR software here.

For high-volume applications, use OCR servers to give everyone on your network the ability to create searchable PDFs on a dedicated server.

Enterprise site licensing, concurrent user licensing and cloud-based solutions are also available. Please contact us for more information or a quote for desktop OCR and PDF converter site licensing options.

You may use SimpleIndex to automatically extract data from searchable PDFs for indexing, automatic file naming, and integration with custom database or document management applications. This is a very fast and accurate way to set keyword metadata for searching. It has both Tesseract and FineReader OCR options for creating searchable PDFs, and is available in desktop or server versions.
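
The idea of automatic file naming from recognized text can be sketched in a few lines. This is only an illustration of the technique, not how SimpleIndex implements it: the invoice-number pattern, the fallback name and the function name are all hypothetical.

```python
import re

# Hypothetical pattern: an invoice number such as "Invoice No: INV-2024-0017".
INVOICE_RE = re.compile(r"Invoice\s+No\b[.:]?\s*([A-Z0-9-]+)", re.IGNORECASE)


def index_filename(page_text, fallback="UNINDEXED"):
    """Return '<invoice number>.pdf', or '<fallback>.pdf' when nothing matches."""
    match = INVOICE_RE.search(page_text)
    key = match.group(1).upper() if match else fallback
    # Keep only characters that are safe in file names.
    key = re.sub(r"[^A-Za-z0-9_-]", "_", key)
    return f"{key}.pdf"
```

Documents whose text does not match the pattern get the fallback name, which makes them easy to collect for manual review.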

Tungsten Kofax OmniPage Server On-Premise

Tungsten Kofax OmniPage Server turns OmniPage into a true server-based OCR solution that is scalable to any volume by load-balancing across multiple servers. OmniPage Server is perfect for high-volume conversion projects or for distributing OCR throughout the enterprise.

SimpleIndex Pro Server 1M PPY

SimpleIndex Pro Server 1 million pages per year – ABBYY FineReader OCR Server, Accusoft Barcode Engine 1D/2D Client, DTK Barcode Engine 1D/2D Server & ISIS Scanning

Document capture solution with a one-click interface that automates your scanning and document filing by creating easy-to-find electronic content, saving you time and money.  It’s highly customizable to meet even the most detailed needs, with top quality technicians to support your requirements.

SimpleIndex OCR Workstation

Document capture solution with a one-click interface that automates your scanning and document filing by creating easy-to-find electronic content, saving you time and money.  It’s highly customizable to meet even the most detailed needs, with top quality technicians to support your requirements.

SimpleIndex OCR Workstation version includes:

  • Basic text and barcode recognition
  • ABBYY FineReader OCR Client
  • TWAIN and ISIS scanning
  • 1 Year Support & Upgrades

Why are the prices of OCR applications so different?

OCR software ranges in price from freeware all the way up to tens of thousands of dollars. What explains the difference between these applications? Here’s the breakdown:

  • OCR Freeware uses the SimpleOCR or Tesseract engines and provides limited scanning and output format capabilities. Recognition quality is generally poor except for the highest quality document images.
  • PDF OCR Converters provide good quality OCR engines like ABBYY, IRIS and OmniPage, but limit the output to searchable PDF files. These cost less than $100.
  • Standard OCR applications range from $100-$200 and provide full OCR capabilities including converting scans to Word, Excel, HTML and other editable formats.
  • Corporate OCR applications add advanced features like automated hotfolder processing, concurrent licensing and other features useful for business applications. Pricing for these is $200-$500.
  • OCR Servers provide scalable, enterprise OCR services for processing very high volumes of documents or providing OCR capabilities to users throughout the organization. Prices start around $1,500 and go up based on processing volume.
  • Enterprise Data Capture and Forms Processing applications are used to capture structured data from complex documents like healthcare claim forms and invoices that include things like tables, handwriting, checkboxes, and movable zones. These solutions can cost anywhere from around $1,000 to hundreds of thousands of dollars depending on the document volume and complexity of the project.

Does ReadIRIS, FineReader or OmniPage support Zone OCR?

The “Pro” versions of most Desktop OCR applications support the creation of zone templates that can be used to OCR specific regions on batches of documents.

Most OCR applications have “Lite” versions that don’t have the ability to manually create zones so it’s important to get the correct version.

With these applications it is often not possible to output this data as “fields” in a structured data file like CSV, Excel or XML. What you typically get is a text file for each document with a line of text for each zone. The zones are designed more for excluding regions you don’t want or manually overriding the detection of text, tables and images in the document.

If you need to capture specific data in multiple documents and output them to structured data files or a SQL database, Batch OCR Applications are the best option for this.
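
To make the difference concrete, the sketch below shows the kind of zone-to-field mapping a batch application performs: each recognized word is assigned to a named zone, and each document becomes one CSV row. It is a simplified illustration that assumes the OCR engine has already returned words with the centre point of their bounding boxes, in reading order. The zone names, coordinates and function names are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def extract_fields(words, zones):
    """words: iterable of (text, x, y), with x, y the centre of each word box."""
    fields = {zone.name: [] for zone in zones}
    for text, x, y in words:
        for zone in zones:
            if zone.contains(x, y):
                fields[zone.name].append(text)
                break  # a word belongs to at most one zone
    return {name: " ".join(parts) for name, parts in fields.items()}


def to_csv(rows, fieldnames):
    """Serialise one dict per document into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Words that fall outside every zone are ignored, which is the same exclusion effect that zones provide in desktop OCR applications.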

If you need to capture data formatted in tables and output to CSV or Excel, desktop OCR applications do this quite well as long as the tables have a regular format with well-defined columns.

To capture handprint, irregular tables, large numbers of data points, or data that doesn’t always appear in the same place on every page, Forms Processing software is what you need.

Knowledge Base

The SimpleOCR Knowledge Base contains frequently asked questions and answers, technical guides and general information on a broad range of optical character recognition, handprint recognition, data capture, PDF OCR, AP invoice scanning and zone OCR applications.

Contact Us for FREE Consultation on Your OCR Project

Tungsten Kofax OmniPage – Ultimate

Tungsten Kofax OmniPage Ultimate has several unique features that make it stand out for a variety of applications. Some of these include auto-redaction, SharePoint integration, automatic filing with barcodes, PDF auto-bookmarking, form data collection and MFP support. Most of these new features are not available in the Standard edition.

Tungsten Kofax OmniPage – Standard

Tungsten Kofax OmniPage Standard converts paper, picture, and PDF files into editable documents to save you considerable time and money by eliminating retyping. Your documents look just like the original – complete with text, tables, and graphics. OmniPage uses superior character accuracy to precisely format your documents so you can easily make changes.

Tungsten Automation formerly Kofax OCR

Tungsten Automation (formerly Kofax) already had a large variety of products for business automation, such as Tungsten Capture for high-volume document scanning and data capture, and Tungsten VRS Elite for handling less-than-perfect images and capturing even the hardest-to-recognize documents.

More recently, Tungsten Automation (formerly Kofax) acquired Nuance’s Document Imaging Division, creating one of the most powerful families of products for business automation. OmniPage Ultimate and OmniPage Standard are versatile OCR packages for small and mid-sized businesses. There is also an OmniPage Server option for much larger document volumes.

Tungsten OmniPage converts paper, PDF files and forms into documents you can share, edit on your PC, listen to with natural speech, or archive in a document repository. Amazing accuracy, support for virtually any scanner, the best tools to customize your process, and automatic document routing make it the perfect choice to maximize productivity. Improved OCR engines deliver amazing accuracy for document conversion and for archiving business-critical documents.

Tungsten OmniPage Server is a cost-effective and reliable solution that lets business process owners easily deploy a highly scalable, always-available OCR server for large-volume document processing.

Tungsten Power PDF is the smart replacement for Adobe Acrobat for maximum savings without compromise. Power PDF allows you to make changes to PDF files with the fluidity, flexibility and interactivity of real word processing. In addition you can share, edit and discuss document changes using text or voice chat in real-time with multiple people. Plus you can have anywhere, anytime access to your documents using popular Cloud […]

SimpleIndex Barcode Suite

Simple Software SimpleIndex Product Suites offer you a better deal on bundles of essential products.

SimpleIndex Barcode Suite combines the best Simple Software products to create a complete Barcode OCR solution. It includes:

  • SimpleIndex Barcode Server license with built-in Accusoft barcode engine and server functionality.
  • SimpleSend solution enables automated sending of document files via secure FTP or email. SimpleSend enhances the functionality of SimpleIndex in several ways as well as functioning as a standalone application.
  • SimpleExport license is designed to convert any delimited text file into any XML or formatted text file format using XSLT. It automates the process of applying XSLTs, especially for document imaging applications where the data has matching files that must be moved or renamed along with the data.
  • 5 licenses of SimpleCoversheet which is designed to work with data sources like SQL databases, spreadsheets and text files to dynamically build lists of barcodes to print. This is especially useful in document scanning applications where barcodes are used to identify and file documents automatically.

ABBYY FineReader Server On-Premise

Innovative server-based OCR software for performing centralized enterprise-wide OCR processing. Allows anyone on the network to submit files for OCR. Complex XML job specifications can be submitted to control output. Support available for Arabic and Asian languages.

 

Available in CPU, Total Page Count and Pages Per Year licensing models.

 

SimpleView

Application for managing and viewing scanned documents, images and PDF files.

Unlike other freeware PDF viewers, SimpleView is designed to work with many files at once instead of one at a time. The free version also supports TWAIN scanning and the ability to move, rearrange and rotate pages.

SimpleIndex OCR Server 1M PPY

SimpleIndex OCR Server 1 million pages per year – ABBYY FineReader OCR Server

Document capture solution with a one-click interface that automates your scanning and document filing by creating easy-to-find electronic content, saving you time and money.  It’s highly customizable to meet even the most detailed needs, with top quality technicians to support your requirements.

SimpleIndex Professional

Document capture solution with a one-click interface that automates your scanning and document filing by creating easy-to-find electronic content, saving you time and money.  It’s highly customizable to meet even the most detailed needs, with top quality technicians to support your requirements.

SimpleIndex Pro version includes:

  • SimpleIndex Standard
  • ISIS scanning
  • FineReader OCR
  • Accusoft Barcode Upgrades

SimpleIndex Standard

Document capture solution with a one-click interface that automates your scanning and document filing by creating easy-to-find electronic content, saving you time and money.  It’s highly customizable to meet even the most detailed needs, with top quality technicians to support your requirements.

SimpleIndex Standard version includes:

  • Basic text and barcode recognition
  • TWAIN scanning

ABBYY FineReader PDF 15 Corporate, 1 Year Subscription

ABBYY FineReader PDF 15 Corporate, (1 Year Subscription) is an all-in-one business toolset for working with PDFs and document digitization. With FineReader PDF employees can work with both digitally created and scanned paper documents to fulfill various document-related tasks in the digital workplace effortlessly. ABBYY FineReader PDF 15 Corporate allows you to view, edit, search, comment and collaborate, sign and protect PDFs or compare document versions in different file formats to identify differences efficiently. Thanks to the seamlessly integrated AI-based OCR technology with FineReader you can also extract information from a PDF or convert the entire document to Word, Excel® for further editing. Document conversion can also be automated to prepare multiple documents for further processing.

ABBYY FineReader PDF 15 Standard, 1 Year Subscription

ABBYY FineReader PDF 15 Standard, (1 Year Subscription) is a PDF software application for working with PDF documents and scans. Powered by ABBYY’s AI-based OCR technology it allows you to convert and edit not only digital PDF documents, but also scanned paper documents with the same ease-of-use. With FineReader PDF you can view, edit, search, comment, sign, protect, extract text from PDFs and convert documents into Word, Excel® for further editing.

 

OCR Servers

Enterprise OCR servers let you perform Optical Character Recognition on thousands of documents at a time, scaling to meet the demands of the largest document conversions.

Traditional Desktop OCR applications require a person to load the scanned document, run the OCR process and save the output files. This makes sense when you are converting individual documents, but large organizations with thousands or millions of documents need something much more automated and scalable.

OCR Server processing workflow

Typical Enterprise OCR Applications

As the cost of OCR software and hardware goes down each year and the quality goes up, full-text search is included in more and more records management applications. Typical applications include:

  • Data mining
  • Litigation support
  • Full-text searching
  • Document management

Features of Enterprise OCR Servers

  • OCR is performed in the background without a user interface
  • Files are imported automatically from hotfolders
  • Ability to use multiple CPUs and servers for processing
  • Management tools for remote administration
  • Web service & API integration to submit OCR jobs
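
The hotfolder feature in the list above can be sketched as a simple polling loop. This is a generic illustration rather than the design of any particular OCR server: the ocr argument stands in for whatever engine does the recognition, and the function names and folder layout are hypothetical.

```python
import shutil
import time
from pathlib import Path


def process_hotfolder(inbox, outbox, done, ocr, suffixes=(".tif", ".tiff", ".pdf")):
    """Run `ocr` on every waiting file once; return the names that were processed."""
    inbox, outbox, done = Path(inbox), Path(outbox), Path(done)
    outbox.mkdir(parents=True, exist_ok=True)
    done.mkdir(parents=True, exist_ok=True)
    processed = []
    for source in sorted(inbox.iterdir()):
        if not source.is_file() or source.suffix.lower() not in suffixes:
            continue
        # `ocr` is a hypothetical callable: ocr(input_path, output_path).
        ocr(source, outbox / (source.stem + ".pdf"))
        # Move the original out of the inbox so it is not processed twice.
        shutil.move(str(source), str(done / source.name))
        processed.append(source.name)
    return processed


def watch(inbox, outbox, done, ocr, interval=5.0):
    """Poll the hotfolder forever, as a background service would."""
    while True:
        process_hotfolder(inbox, outbox, done, ocr)
        time.sleep(interval)
```

A production server would add error handling, checks that a file has finished copying before it is picked up, and distribution of the work across multiple CPUs or machines.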

What is the Best OCR Server?

The ABBYY FineReader Server offers the best combination of features, performance and pricing. It has flexible licensing, including an unlimited CPU-based license that does not limit the number of pages processed.

Foxit PDF Compressor has the lowest entry level pricing, OmniPage OCR and unique PDF compression technology that can dramatically reduce the size of searchable PDF documents, leading to faster viewing and lowered cloud storage and bandwidth costs.

The SimpleIndex Server offers affordable unattended OCR services coupled with advanced data extraction and indexing capabilities that organize documents automatically or save metadata to Excel or a SQL database. It doesn’t have the scalability, API interfaces or compression technology that other OCR servers have, but you can bundle the Standard Server version with them to add indexing, […]

Convert Scanned Image to Text Document

The primary purpose of Optical Character Recognition is to quickly and automatically convert scanned images of machine-printed (typed) text – which to a computer are no more meaningful a collection of pixels than any other image, such as a landscape photo – into actual text data that you can search through and modify.

OCR software comes in many different types, which vary in price based on their features, speed, and accuracy. One of the main ways OCR vendors differentiate their products is by the volume of documents the software allows you to process. That may seem counterintuitive, but the features needed to process hundreds, thousands or millions of pages a year are quite different.

If you need to scan several hundred pages for personal use (receipts, checks, medical, tax or legal forms, personal memorabilia), you need light, versatile, easy-to-use, inexpensive software that simply converts images to text. It may not have automation features, so any further processing of the data is done manually. That is not too hard, though, since the volume of documents is small and you can treat each one individually.

Small business users usually process thousands of pages a year and require some automation features. Images need to be converted not just to text, but also to spreadsheets for further processing. Once the system is set up, it is expected to run without much intervention, and the people in charge of document processing should be able to operate it with ease.

Larger companies processing millions of documents require much higher levels of automation, where each small, fine-tuned feature can save thousands of work hours in the long run. Multiple machines will be processing documents […]
