Zonal OCR automates data entry from documents by using optical character recognition to read the text at specific coordinates on the page. The extracted data is output to a database or to a structured text format such as CSV, XML, or JSON.
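The zonal step described above can be illustrated with a minimal Python sketch. It does not call any Nuance product or OCR engine: the zone names, the coordinates, and the sample word boxes are hypothetical stand-ins for what an OCR engine might return. The sketch only shows how recognized words could be assigned to named zones by position and then written out as JSON or CSV.

```python
import csv
import io
import json

# Hypothetical zones: field name -> (left, top, right, bottom) in pixels.
ZONES = {
    "invoice_number": (400, 40, 600, 80),
    "total": (400, 700, 600, 740),
}

# Hypothetical OCR result: each recognized word with the top-left corner
# and size of its bounding box. A real OCR engine would supply this data.
WORDS = [
    {"text": "INV-1001", "x": 410, "y": 50, "w": 90, "h": 20},
    {"text": "Total:", "x": 300, "y": 705, "w": 60, "h": 20},
    {"text": "129.50", "x": 420, "y": 705, "w": 70, "h": 20},
]


def words_in_zone(words, zone):
    """Return the words whose box center lies inside the zone, in reading order."""
    left, top, right, bottom = zone
    hits = []
    for word in words:
        center_x = word["x"] + word["w"] / 2
        center_y = word["y"] + word["h"] / 2
        if left <= center_x <= right and top <= center_y <= bottom:
            hits.append(word)
    # Sort top-to-bottom, then left-to-right.
    hits.sort(key=lambda w: (w["y"], w["x"]))
    return hits


def extract_fields(words, zones):
    """Build one record: the text found in each named zone."""
    return {
        name: " ".join(w["text"] for w in words_in_zone(words, zone))
        for name, zone in zones.items()
    }


def to_csv(record):
    """Serialize a single record as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
    writer.writeheader()
    writer.writerow(record)
    return buffer.getvalue()


record = extract_fields(WORDS, ZONES)
print(json.dumps(record))
print(to_csv(record), end="")
```

Testing the center of each word box, rather than requiring the whole box to fit, is one possible design choice: it tolerates words that slightly overlap a zone border, at the cost of occasionally picking up a neighboring word.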
Q: How can I get more control over the OCR process in Power PDF, for example to edit the text in the OCR layer and correct mistakes?
A: As designed, Nuance Power PDF does not offer this functionality.
Nuance Power PDF has a powerful built-in OCR engine, but it offers only limited control over the OCR process. To accomplish what the client is requesting, you would need a specialized Optical Character Recognition (OCR) program such as Nuance® OmniPage®.
Using Nuance® OmniPage® has several advantages if you want more control over the OCR process:
- Choose from four formatting levels instead of two (see below)
- Gain full control over the OCR process, including:
  - The ability to manually zone pages
  - Access to multilingual spell checking and proofing
  - Dynamic verifier image display to speed up editing
  - Voice readback facility
  - And much more
- Scan new pages into the converted document
- Add new pages from fax, image files, or digital cameras
- Save to other formats, including OmniPage’s internal format for sharing documents with other OmniPage users
The four formatting levels offered for saving in OmniPage are:
- Flowing Page: The pages retain the layout of the originals. Graphics and framed elements are placed in text boxes. Whenever possible, other text is transferred without using text boxes. Power PDF offers this under the name Flowing Column.
- True Page: The pages retain the layout of the originals, but all elements are placed in text boxes, including text in columns. Power PDF offers this formatting.
- Formatted Text: Text is decolumnized, but text attributes, graphics and tables are retained.
- Plain Text: Text is decolumnized and rendered as plain text. Graphics and tables are retained, but not in their original locations. This option is convenient for users who want to reformat the content.