Using ABBYY Vantage Document Skills

Processing Your First Documents with Vantage

Learn how easy it is to get started with Vantage – upload your documents and Vantage will take care of the rest.

 

How to Create and Train a Vantage Document Skill

Learn how to use the Vantage Skill Designer to create and train a new Document Skill with just a few sample documents.

 

How to Create and Train a Classification Skill in ABBYY Vantage

Learn how to use the Vantage Skill Designer to train a new Classification Skill. You need just a few samples of each document class.

 

 

How to Automate a Complete Workflow, by Creating a Vantage Process Skill

 

 

How to Edit a Document Skill

Learn how to adapt already existing skills to your specific documents and business requirements.

 

 

How to perform the first authentication in Vantage Swagger UI?

To get your first access token, perform the initial authentication using the default client. You do not need to enter a password or client ID, because the initial authentication is preconfigured. Open a Swagger page (EU link or US link) and click Authorize.

Select all scopes, then click Authorize again.

A password needs to be specified only for a custom client. A custom client can be created after the initial authentication.
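Once a token has been issued, calls made outside the browser usually carry it in an Authorization header. The sketch below assumes the token is a standard OAuth 2.0 bearer token; the URL is a placeholder rather than a real Vantage endpoint, and build_request is a hypothetical helper name.

```python
import urllib.request


def build_request(url: str, access_token: str) -> urllib.request.Request:
    """Attach an OAuth 2.0 bearer token to a GET request (nothing is sent here)."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Placeholder values for illustration only (assumptions, not real endpoints).
request = build_request("https://example.com/api/resource", "TOKEN_FROM_SWAGGER_UI")
```

The request object is only constructed, not sent, so the example can be run without network access or credentials.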

References

EU Help: Getting a Tenant Identifier or US Help: Getting a Tenant Identifier

EU Help: Creating a Client or US Help: Creating a Client

Learn more at ABBYY […]

OCR SDK

The SimpleOCR SDK is a fast, lightweight OCR engine designed to let developers add basic OCR functions to an application with minimal cost and none of the drawbacks of open source solutions.

The ABBYY FineReader SDK is a fully-featured OCR engine with advanced features like handprint recognition, barcode recognition, ID and business card recognition, and support for 200+ languages including Asian scripts, Arabic and Hebrew. FineReader SDK is available in both Cloud and On-Premise versions.

The ABBYY FlexiCapture SDK gives you advanced, AI-based OCR data capture capabilities like document classification, forms processing, invoice processing, and machine learning for training data extraction templates.

You can shop for all of these in our OCR store, and our expert staff will be here to advise and assist in your OCR development project. Contact us to see how we can help!

Atalasoft provides OCR SDKs that can be integrated into your desktop or web applications for manual or automated batch processing of images. These industry-proven document transformation engines are add-ons to the DotImage SDK that can save countless hours and significantly improve accuracy. One of the main advantages is that the SDK is mostly royalty free, with many different options and engines to choose from, allowing you to build OCR components into your software with a single up-front payment. Atalasoft OCR SDK has plenty of plugins to add more features like:

  • OmniPage OCR & ICR
  • Tesseract OCR
  • GlyphReader OCR
  • BarcodeReader 1D and 2D
  • Barcode Writer
  • DotTwain

Google Cloud Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Assign […]

OCR Guide

Optical Character Recognition

During your foray into the world of document scanning, you’ve likely encountered the term “OCR” and may even know that it stands for “Optical Character Recognition”. But what exactly is OCR and how can you make the best use of this sophisticated and valuable tool?

We’re here to give you a run-down of what you need to know about Optical Character Recognition, answer any questions you might have, and recommend the best OCR software solution for your scanning project.

Table of Contents:

What is OCR?

The primary purpose of Optical Character Recognition is to quickly and automatically convert scanned or photographed document images into machine-readable text that can be searched for keywords or edited in a word processor.

In general, an OCR engine analyzes the pixel data of scanned images and searches for patterns resembling letters, numbers, and other symbols to create a digitized record of characters.
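As a toy illustration of searching pixel data for patterns that resemble characters, the sketch below compares a small bitmap against stored templates and picks the closest one. The 3x5 templates and function names are invented for this example; production OCR engines use far more sophisticated models.

```python
# Hypothetical 3x5 glyph templates: "#" is a dark pixel, "." is a light one.
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def similarity(a, b):
    """Fraction of pixels that agree between two equally sized bitmaps."""
    pairs = [(pa, pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)]
    return sum(pa == pb for pa, pb in pairs) / len(pairs)


def recognize(bitmap, templates=TEMPLATES):
    """Return the label of the template that best matches the bitmap."""
    return max(templates, key=lambda label: similarity(bitmap, templates[label]))


noisy_l = ("#..", "#..", "#.#", "#..", "###")  # an "L" with one stray pixel
print(recognize(noisy_l))  # prints "L"
```

Even with one noisy pixel, the "L" template agrees on 14 of 15 pixels, which is why it wins over the other candidates.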

The biggest OCR engines employ huge Artificial Intelligence (AI) and Machine Learning (ML) models that have been trained on billions of documents collected over decades of development.

While the exact mechanics of this process can be complicated, OCR engines are a key automation tool for the digital age. They bridge the gap between knowledge stored on physical documents and digital data that can be edited, searched or parsed into structured data to automate data entry tasks.

OCR Output Types

Full Page OCR converts the entire document into one of the following formats:

    […]

Applications

When you scan a document that has text or numeric data on it, you are able to read and understand what is written in the scanned image. However, to a computer, the resulting image file is just as meaningless an assortment of pixels as a landscape photo. In order to transform this information into an editable format that you can search through, copy, and modify without retyping it manually, you will need Optical Character Recognition (OCR) software.

There is a wide variety of OCR software available. While they all share the ability to convert images of machine printed (not handwritten) text or numbers into an editable format, the various software often have different features, accuracy, prices, and language options.

You can find the various types of OCR software with a description of each below.

Users within a single department, working from home or running a small business can simply scan their documents to a folder that is shared with everyone. In this “ad-hoc” scenario you only need some basic document scanning software to simplify and bring consistency to your filing system.

If you want to move to the next level, there are Desktop Document Management options that provide an all-in-one means for capture, storage, search and retrieval of documents. Additionally, they provide security, advanced capabilities and ease of use above that of the ad-hoc methods.

And let’s not forget cloud-based options that alleviate the need to maintain storage servers or keep software up to date.

Need a simple, no frills OCR solution without spending hundreds of dollars on a professional software package? Look no further. There is a no cost, donation optional, OCR freeware solution for […]

2022-06-21T12:06:06-04:00

PDF OCR

Searchable PDF OCR

Creating searchable PDF files using optical character recognition is one of the most common PDF OCR applications.

The PDF format works great with scanned documents because it allows the OCR text to be hidden in an invisible layer behind the original document image. So you see a perfect replica of the original instead of OCR text that lacks formatting and may contain artifacts and errors.

OCR PDF to Other Formats

Batch OCR PDF to Text, Excel, Word

PDF OCR can also mean converting scanned PDF files to Word, Excel, text and other formats. This can be done with any desktop OCR or OCR server application. However, several OCR applications called PDF Converters are designed only to convert documents to searchable PDF files, not to convert PDF files to other formats. This is an important distinction to make when searching for PDF OCR software.

PDF Converters often cost less than their full-featured desktop OCR counterparts since they only offer document scanning and conversion of images to searchable PDF files. They can also include the ability to convert other file formats like Word, Excel, PowerPoint, HTML, etc. to PDF automatically. Enterprise site licensing options let you enable this capability for any user in the organization. Contact us for a quote on site licenses for any PDF OCR application.

PDF OCR Compression

PDF also offers advanced compression options like MRC, JPEG2000 and JBIG2 that can produce much smaller files than traditional TIFF images. Foxit PDF Compressor is even able to parse the document and apply different compression to images, text and backgrounds to reduce the size even further. This can produce huge savings in cloud storage and access […]

Simple Software

SimpleIndex can bring speed and efficiency to your scanning or doc filing no matter the process. Even if all you are doing is hand keying a few basic details about a document, breaking those details into individual indexes and adding tools like drop down choice lists, automatic orientation, and blank page deletion ensure a smoother, more consistent process.

Automation

Here’s where things start to get interesting. From basic tasks like splitting individual documents within a stack of pages by spotting a blank page, a specific mark, or a barcode separator to capturing index data directly from the page or looking up additional details about a document in a database, SimpleIndex has a host of powerful tools to tame your piles of paper or drives full of digital files. Let’s look at a few.
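The simplest of these, separating documents at blank pages, can be sketched in a few lines. This is an illustration only, not how SimpleIndex is implemented: it assumes each page has already been reduced to its recognized text, whereas a real scanner workflow would detect blank pages from the image itself. The function name is hypothetical.

```python
def split_on_blank_pages(pages):
    """Group page texts into documents, treating whitespace-only pages as separators."""
    documents, current = [], []
    for text in pages:
        if text.strip():
            current.append(text)       # a page with content joins the open document
        elif current:
            documents.append(current)  # a blank page closes the open document
            current = []
    if current:
        documents.append(current)      # flush the last document in the stack
    return documents


stack = ["Invoice 1001", "Terms", "", "Invoice 1002", "   ", "", "Invoice 1003"]
batches = split_on_blank_pages(stack)
```

Blank separator pages are dropped from the output, and consecutive blanks do not create empty documents.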

OCR

Optical Character Recognition is the ability to take a scan, which is merely a picture of a page, and turn it into words that the computer can understand and use to index your files. SimpleIndex leverages the power of ABBYY FineReader, recognized as one of the best OCR engines on the market, to accurately capture names, dates, important numbers, document types, and other details about your file. Some products have you set a box and capture whatever information happens to fall in that zone. SimpleIndex takes it further with Dynamic Zone OCR, which lets you set an oversized zone that allows for shifting of the pages between scans, but still captures just the data you need by matching against templates, lists, or even Regular Expressions (RegEx). You can also skip the zones entirely and use the full text of a page to find matches for your index data.
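To show the idea of pulling one value out of an oversized zone with a regular expression, here is a minimal sketch. The pattern, the MM/DD/YYYY date format, and the sample zone text are assumptions made for the example and are not taken from SimpleIndex.

```python
import re

# Assumed date format for this example: M/D/YYYY or MM/DD/YYYY.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def find_date(zone_text):
    """Return the first matching date in the zone text as YYYY-MM-DD, or None."""
    match = DATE_PATTERN.search(zone_text)
    if match is None:
        return None
    month, day, year = match.groups()
    return f"{year}-{int(month):02d}-{int(day):02d}"


# An oversized zone picks up neighboring text as well as the field we want.
zone_text = "ACME Corp   Invoice No. 48213\nDate: 6/21/2022   Page 1 of 2"
index_value = find_date(zone_text)
```

Because the match is driven by the pattern rather than the exact position, the date is still found if the page shifts within the zone between scans.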

Barcodes

Receipt Scanning

When you’re managing your small business’ finances, filing taxes or just dealing with the results of your shopping spree, it’s necessary to know and record where your money is being spent. But receipts from your purchases may get lost in a sea of other documents and miscellaneous papers.

And when it comes to record keeping, tracking a pile of receipts can become daunting, especially if you travel often for business and need to organize them for expense tracking, or you run your own company and want to write off all the expenses you can.

If you struggle to keep track of your receipts, a receipt scanner can become incredibly useful. Some receipt scanners include online tools and apps that allow you to keep and access your receipts from anywhere, so receipts will be consolidated and you can access them whenever you need them.

At the same time, a mobile phone with a scanner app offers a lighter and easier alternative. Receipt scanner apps make it easy to scan receipts with any mobile device. Having the ability to transcribe key data and record the information without manual data entry will save you time, and best of all, you can toss those paper receipts.



The Neat app transforms your device’s camera into a powerful mobile receipt scanner that’s always at your side, making it easy to stay organized. The Neat mobile app is especially helpful for tracking expenses while traveling for business or on the road. As soon as you have a receipt or document, just snap a pic and into Neat it goes. At the end of the day, you’ll be able to run an expense report with the click of a button.

Intuit offers a large accounting and record keeping ecosystem, QuickBooks. Luckily, […]

Handprint Recognition Guide

What is ICR, Handprint Recognition?

ICR stands for Intelligent Character Recognition and is the technology that allows software to interpret hand printed text on scanned images.

Forms Processing Software uses ICR technology to automate data entry tasks involving hand-filled surveys, applications and forms. It provides interfaces for scanning, recognition, data verification and export, as well as management and monitoring tools to track large volumes of documents and data through the workflow.

Forms Processing also includes OCR (Optical Character Recognition) technology to recognize machine printed text, and OMR (Optical Mark Recognition) for check boxes and multiple choice bubbles.
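For the OMR part, a common simplification is to measure how much of a check box region is dark. The sketch below is a toy version of that idea; the text bitmaps and the 0.3 threshold are assumptions chosen for illustration, not values from any particular product.

```python
def is_marked(box, threshold=0.3):
    """Treat a check box as marked when the share of dark pixels reaches the threshold."""
    dark = sum(row.count("#") for row in box)
    total = sum(len(row) for row in box)
    return dark / total >= threshold


# "#" is a dark pixel inside the box region, "." is a light one.
empty_box = ("....", "....", "....")
checked_box = ("#..#", ".##.", "#..#")
```

In practice the threshold has to be tuned so that stray specks are ignored while light pencil marks are still detected.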

Traditional forms processing relies on constrained handwriting, where boxes on the form force the filler to write with separated, printed block characters. Modern AI technology has dramatically improved the ability to recognize unconstrained handwriting and cursive script. Hand printed notes, free-form comment blocks, non-segmented fields, historic documents, and more can now be converted to text with acceptable accuracy where these were impossible just a few years ago.

Who can benefit from handwritten recognition software?

Any organization that collects data on paper-based forms, surveys or applications on a regular basis can get a very high return on investment by automating the data entry with forms processing software.

You do need to have a significant number of forms to justify the expense, at least a hundred forms per month or more depending on how much data is being captured. If the data entry task can be done in under 25 working hours then it is probably not a good candidate for automation with ICR software.
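The rule of thumb above can be turned into a quick estimate. In this sketch the 25-hour threshold comes from the text, while the minutes-per-form figures are made-up inputs you would replace with your own measurements; the function names are hypothetical.

```python
MIN_MANUAL_HOURS = 25  # threshold taken from the rule of thumb above


def manual_entry_hours(forms, minutes_per_form):
    """Estimate total manual data entry time in hours."""
    return forms * minutes_per_form / 60


def worth_automating(forms, minutes_per_form):
    """True when manual entry would take at least the threshold number of hours."""
    return manual_entry_hours(forms, minutes_per_form) >= MIN_MANUAL_HOURS


# Assumed workloads: 100 short forms versus 100 long forms.
short_forms = worth_automating(forms=100, minutes_per_form=5)   # about 8.3 hours
long_forms = worth_automating(forms=100, minutes_per_form=20)   # about 33.3 hours
```

The same form count lands on either side of the threshold depending on how much data each form carries, which is why the minimum volume depends on the amount of data being captured.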

Organizations that have many separate departments that collect data on forms can share the budget for forms processing software by re-using it for other projects. Your current project may not be big enough to justify […]

Enterprise OCR Applications

Enterprise OCR refers to applications designed with the features and scalability required for large businesses and service operations.

Speed and efficiency are the name of the game at the enterprise level so options like batch processing, multi-user and multi-server workflows, security and compliance auditing are found in these applications.

Enterprise OCR can also refer to Enterprise Site Licensing for desktop OCR applications that allow any user in your organization to install licensed OCR tools without incremental costs. Contact Us for a quote on any Site License.

Enterprise Data Capture Solutions

Enterprise Document Management

With the high volume of documents coming out of an enterprise OCR product, there is a need for robust Document Management applications with enhanced features that cover the stricter oversight needs of large organizations. Sorting through thousands or millions of pages can quickly turn digital documents into a quagmire without proper organization, tagging, search and workflow capabilities.

Enterprise Document Management features include:

  • Digital signatures
  • Document life cycle management
  • Version control
  • Advanced keyword searching & full-text indexing
  • Audit trails (HIPAA, Sarbanes-Oxley compliance)
  • Email archiving

  • Workflow routing
  • Enterprise Report Processing (ERP)
  • Document access control

Our document management solutions work with any of the enterprise OCR products below to provide a secure end-to-end solution. Contact Us to see how they work together in an online demo or get a quote.

OCR Consulting Services

OCR Experts for Any Project

Our unique team of OCR experts is equipped to help with OCR projects of any size or complexity. We have support specialists who can remotely configure desktop solutions in a matter of minutes and expert systems integrators with years of programming, database design, and robotic process automation experience.

Desktop OCR

Use our online store to order desktop OCR applications and our staff will be happy to answer your setup questions via email or web chat.

Remote configuration and training services using GotoMeeting are available for a low hourly rate.

Let Us OCR That For You

Got a one-time conversion and don’t want to hassle with software? Upload your scanned document to us and we’ll send back the converted files. An optional verification service corrects recognition errors and layout issues for a low hourly rate.

Data processing for forms, reports, directories, and other documents is also available with output to CSV, Excel, XML, JSON, SQL, etc.
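To make the output formats concrete, the sketch below writes the same extracted records as CSV and as JSON using only the Python standard library. The field names and sample records are invented for illustration.

```python
import csv
import io
import json


def to_csv(records):
    """Serialize a list of flat dicts to CSV text; the first record's keys form the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records as indented JSON text."""
    return json.dumps(records, indent=2)


# Hypothetical rows extracted from a scanned directory page.
records = [
    {"name": "Jane Doe", "phone": "555-0100"},
    {"name": "John Roe", "phone": "555-0101"},
]
csv_text = to_csv(records)
json_text = to_json(records)
```

Both outputs carry the same data, so the choice usually depends on what the receiving system can import.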

Contact us and if possible provide a sample, total pages, desired output and whether you want us to correct the results after OCR and we’ll reply back with a quote right away. Prices start at $50 for up to 1,000 pages.

Batch Scanning & OCR Servers

Automate document scanning and digital document archival processes using zone OCR, barcode recognition, database integration and other technologies.

Small business systems and single document workflows can be set up remotely via GotoMeeting, usually in just a few hours. Chat now if we’re online or leave a message to schedule a consultation.

Data Capture and Forms Processing

Advanced data extraction solutions that can turn the most complex documents into structured data ready […]

Robotic Process Automation

Introducing Robotic Process Automation

RPA stands for Robotic Process Automation and it represents a new approach to business automation that helps minimize the technical hurdles required for implementing new workflows.

Robotic Process Automation of Data Entry

Traditional business process automations rely on application programming interfaces (APIs) to allow systems to exchange data. This approach has two main drawbacks:

  1. The application vendor must make those APIs available
  2. A programmer needs to write custom code to interface with them

If your software vendor does not provide an interface for consuming the data you need to automate, then you’re out of luck. And even if they do, the development costs can eliminate the ROI if the transaction volume isn’t large enough.

RPA tools avoid the API problem by interfacing directly with the application user interface, just as a human would. They use artificial intelligence and machine learning to “watch” the operator perform a task within the application, then create their own program (called a “bot”) to mimic it. This means that:

  1. Bots can do anything a human can do within the application
  2. Users can create a bot without writing code

Practically speaking, an experienced robotic process automation consultant with programming experience is required to roll out an RPA solution enterprise-wide, and most users will only be able to automate small, routine tasks without assistance. Business-critical, high-volume automations will still involve coding. But RPA dramatically reduces the implementation time and avoids the need to retrofit APIs for software applications that were not designed to support them.

Using RPA with OCR Data Capture

OCR Data Capture is one of the most common business processes to automate with RPA. Taking data stored in paper or electronic documents and […]

ABBYY FlexiCapture Cloud

ABBYY FlexiCapture Cloud delivers ABBYY’s advanced data capture platform capabilities via REST API and web interfaces. ABBYY FlexiCapture Cloud customers can rapidly configure and deliver their Content IQ solution, taking advantage of our cloud services to automate and accelerate their document-driven processes. The advanced machine learning and AI in the platform improve classification and data extraction results, enabling core processes to support better, smarter, faster decisions.

FlexiCapture Cloud enables organizations to accelerate digital transformation by complementing their automation systems with new and advanced cognitive capabilities that liberate the intelligence locked in their documents.

ABBYY FlexiCapture for Invoices Cloud

ABBYY FlexiCapture Cloud delivers ABBYY’s advanced data capture platform capabilities via REST API and web interfaces. ABBYY FlexiCapture Cloud customers can rapidly configure and deliver their Content IQ solution, taking advantage of our cloud services to automate and accelerate their document-driven processes. The advanced machine learning and AI in the platform improve classification and data extraction results, enabling core processes to support better, smarter, faster decisions.

FlexiCapture Cloud enables organizations to accelerate digital transformation by complementing their automation systems with new and advanced cognitive capabilities that liberate the intelligence locked in their documents.

ABBYY Cloud OCR SDK

ABBYY® Cloud OCR SDK is a web-based document processing service that will enhance your enterprise software systems, SaaS platforms, or your mobile apps with the ability to convert documents and utilize textual information from scans, PDFs, document images, smartphone photos, or screenshots.

Combining ABBYY’s latest AI-based technologies for information extraction with the highly scalable processing power of the Microsoft® Azure® computing infrastructure, this secure and reliable ABBYY cloud service can be easily integrated into your application via a REST API—empowering it to precisely convert virtually any number of pages within the shortest amount of time.

ABBYY Vantage

ABBYY Vantage leverages AI machine learning and a huge library of document “skills” to provide out-of-the-box data capture for all kinds of documents.

Vantage provides a simple way to implement new data capture processes without the need for programmers.

It takes the FlexiCapture platform, hosts it in the cloud, and dramatically simplifies the interface. The thousands of settings you can use with FlexiCapture to build templates are managed by the AI, giving you a simple point and click interface to create new document capture workflows.

The “Skills” library gives you pre-configured capture workflows for hundreds of the most common documents. Simply connect them to your import and export destinations and you are ready to go, saving you hours or even days of development time.

PaperVision Direct

What if you could include the critical business information you’re currently storing in paper files in your PaperVision®.com cloud information management service? Scan, import, index, and organize paper documents using your existing scanners and multi-function devices (MFD) to create convenient digital files and securely upload them to the cloud.

Start scanning documents right at your desk! Turn any vulnerable paper document into a useful digital file that can be securely managed in your PaperVision.com cloud service.

ImageSilo Direct

What if you could include the critical business information you’re currently storing in paper files in your ImageSilo cloud information management service? Scan, import, index, and organize paper documents using your existing scanners and multi-function devices (MFD) to create convenient digital files and securely upload them to the cloud.

Start scanning documents right at your desk! Turn any vulnerable paper document into a useful digital file that can be securely managed in your ImageSilo cloud service.

Google Cloud Vision API

Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Assign labels to images and quickly classify them into millions of predefined categories. Detect objects and faces, read printed and handwritten text, and build valuable metadata into your image catalog.

Automatically extract handwriting, plain text or form data from any document using a huge machine learning model based on billions of sample documents.

Google Vision is a cloud OCR service that automatically detects and extracts text and data from scanned documents and PDF files. It goes beyond simple optical character recognition (OCR) to also identify the contents of fields in forms and information stored in tables.
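As a rough idea of what a request looks like, the sketch below builds the JSON body for the Vision REST method images:annotate with the DOCUMENT_TEXT_DETECTION feature. It only constructs the payload; authentication, sending the request, and reading the response are left out, and the helper name and sample bytes are assumptions for this example.

```python
import base64
import json


def build_annotate_request(image_bytes):
    """Build the JSON body for an images:annotate call that requests document OCR."""
    return {
        "requests": [
            {
                # The REST API expects image content as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }


# Placeholder bytes stand in for a real scanned image file.
body = build_annotate_request(b"fake image bytes for illustration")
payload = json.dumps(body)
```

In a real integration the payload would be posted with valid Google Cloud credentials, and the recognized text would be read from the response.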

Google Vision API also lets you implement OCR in your RPA workflows. UiPath and other bots offer connectors that let you include Vision OCR into your RPA process.

Google Vision is not a “ready-to-use” product. It requires programming skills, experience with Google cloud services, and a decent amount of coding to implement it into your systems, especially once you add user interfaces for scanning and data validation.

Simple Software developers have the necessary skills and experience to integrate Google Vision into your custom applications. Contact us or click the Request a Quote button to get a proposal for your custom application development project.

Amazon Textract API

Automatically extract handwriting, plain text or form data from any document using the world’s largest OCR machine learning model based on billions of sample documents.

Amazon Textract is a cloud OCR service that automatically detects and extracts text and data from scanned documents and PDF files. It goes beyond simple optical character recognition (OCR) to also identify the contents of fields in forms and information stored in tables.
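Form fields come back from Textract as linked blocks rather than ready-made key and value pairs, so some post-processing is needed. The sketch below walks a response shaped like the output of the AnalyzeDocument operation with the FORMS feature; the sample response is hand-built and heavily trimmed for illustration, not real service output, and the function names are hypothetical.

```python
def block_text(block, blocks_by_id):
    """Join the WORD children of a block into a single string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for child_id in rel["Ids"]:
                child = blocks_by_id[child_id]
                if child["BlockType"] == "WORD":
                    words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(response):
    """Map each KEY block's text to the text of the VALUE block it points to."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    fields = {}
    for block in response["Blocks"]:
        if block["BlockType"] == "KEY_VALUE_SET" and "KEY" in block.get("EntityTypes", []):
            value_text = ""
            for rel in block.get("Relationships", []):
                if rel["Type"] == "VALUE":
                    value_text = " ".join(
                        block_text(blocks_by_id[i], blocks_by_id) for i in rel["Ids"]
                    )
            fields[block_text(block, blocks_by_id)] = value_text
    return fields


# Hand-built sample for illustration only.
SAMPLE_RESPONSE = {"Blocks": [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Number:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "48213"},
]}
form_fields = extract_form_fields(SAMPLE_RESPONSE)
```

A real response also includes geometry, confidence scores, and selection elements such as check boxes, which this sketch ignores.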

Amazon Textract API also lets you implement OCR in your RPA workflows. UiPath and other bots offer connectors that let you include Textract OCR into your RPA process.

Textract is not a “ready-to-use” product. It requires programming skills, experience with AWS systems and a decent amount of coding to implement it into your systems, especially once you add user interfaces for scanning and data validation.

Simple Software developers have the necessary skills and experience to integrate Textract into your custom applications. Contact us or click the Request a Quote button to get a proposal for your custom application development project.

Simple Software also offers the ready-to-use SimpleIndex application that incorporates Textract into a fully-featured scanning, indexing and document processing application.
