Google Cloud Vision API

Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Assign labels to images and quickly classify them into millions of predefined categories. Detect objects and faces, read printed and handwritten text, and build valuable metadata into your image catalog.

Automatically extract handwriting, plain text or form data from any document using a huge machine learning model based on billions of sample documents.

Google Vision is a cloud OCR service that automatically detects and extracts text and data from scanned documents and PDF files. It goes beyond simple optical character recognition (OCR) to also identify the contents of fields in forms and information stored in tables.

The Vision API can also supply OCR for robotic process automation (RPA) workflows. UiPath and other RPA platforms offer connectors that let you include Vision OCR in your RPA process.

Google Vision is not a “ready-to-use” product. Integrating it into your systems requires programming skills, experience with Google Cloud services, and a fair amount of coding, especially once you add user interfaces for scanning and data validation.

Simple Software developers have the necessary skills and experience to integrate Google Vision into your custom applications. Contact us or click the Request a Quote button to get a proposal for your custom application development project.

Description

Cloud Vision API

Derive insights from your images in the cloud or at the edge with AutoML Vision or use pre-trained Vision API models to detect emotion, understand text, and more.

  • Use machine learning to understand your images with industry-leading prediction accuracy
  • Train machine learning models that classify images by your custom labels using AutoML Vision
  • Detect objects and faces, read handwriting, and build valuable image metadata with Vision API

BENEFITS:

  • Detect objects automatically

    The Vision API can detect and extract multiple objects in an image with Object Localization.

    Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each object in the image. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object.

    Object localization identifies both significant and less-prominent objects in an image.

    Object information is returned in English only. The Cloud Translation API can translate the English labels into a number of other languages.
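
    As a rough illustration of the response described above, the sketch below turns each LocalizedObjectAnnotation into a pixel bounding box. It assumes the JSON shape of the REST images:annotate response (a localizedObjectAnnotations list whose boundingPoly holds normalizedVertices in the 0 to 1 range). The sample response, its values, and the helper name pixel_boxes are made up for illustration and should be checked against the current API reference.

```python
import json

# Made-up sample in the shape of an images:annotate response for the
# OBJECT_LOCALIZATION feature. Names, scores, and coordinates are illustrative.
SAMPLE_RESPONSE = json.loads("""
{
  "responses": [{
    "localizedObjectAnnotations": [
      {"name": "Bicycle", "score": 0.94,
       "boundingPoly": {"normalizedVertices": [
         {"x": 0.1, "y": 0.5}, {"x": 0.6, "y": 0.5},
         {"x": 0.6, "y": 0.9}, {"x": 0.1, "y": 0.9}]}},
      {"name": "Door", "score": 0.71,
       "boundingPoly": {"normalizedVertices": [
         {"x": 0.7}, {"x": 0.9},
         {"x": 0.9, "y": 0.8}, {"x": 0.7, "y": 0.8}]}}
    ]
  }]
}
""")


def pixel_boxes(response, width, height):
    """Return (name, score, (left, top, right, bottom)) for each detected object.

    Hypothetical helper: width and height are the pixel dimensions of the
    image that was sent for annotation.
    """
    results = []
    annotations = response["responses"][0].get("localizedObjectAnnotations", [])
    for obj in annotations:
        vertices = obj["boundingPoly"]["normalizedVertices"]
        # Zero-valued coordinates may be omitted from the JSON, so default to 0.
        xs = [v.get("x", 0.0) * width for v in vertices]
        ys = [v.get("y", 0.0) * height for v in vertices]
        box = (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))
        results.append((obj["name"], obj["score"], box))
    return results


if __name__ == "__main__":
    for name, score, box in pixel_boxes(SAMPLE_RESPONSE, width=1000, height=500):
        print(f"{name} ({score:.0%}): {box}")
```

    Scaling the normalized vertices by the image size is what lets an application draw or crop the region for each object.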

  • Reduce purchase friction

    Vision API Product Search allows retailers to create products, each containing reference images that visually describe the product from a set of viewpoints. Retailers can then add these products to product sets. Currently, Vision API Product Search supports the following product categories: homegoods, apparel, toys, packaged goods, and general.

    When users query the product set with their own images, Vision API Product Search applies machine learning to compare the product in the user’s query image with the images in the retailer’s product set, and then returns a ranked list of visually and semantically similar results.
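
    A query like the one described above might be sent as a JSON body similar to the following sketch. It assumes the REST images:annotate request shape with a PRODUCT_SEARCH feature and an imageContext.productSearchParams object. The function name and the resource path are hypothetical placeholders, and the exact category strings the service accepts (they may carry a version suffix) should be taken from the current Product Search documentation.

```python
import base64
import json


def build_product_search_request(image_bytes, product_set, category, max_results=5):
    """Build a JSON-serializable body for a product search query.

    Hypothetical helper. product_set is the full resource name of an
    existing product set; category is one of the supported categories.
    """
    return {
        "requests": [{
            # Image bytes are sent base64-encoded in the "content" field.
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "PRODUCT_SEARCH", "maxResults": max_results}],
            "imageContext": {
                "productSearchParams": {
                    "productSet": product_set,
                    "productCategories": [category],
                }
            },
        }]
    }


if __name__ == "__main__":
    body = build_product_search_request(
        image_bytes=b"<raw image bytes>",  # placeholder, not a real image
        # Placeholder resource name; substitute your own project and set.
        product_set="projects/my-project/locations/us-west1/productSets/my-set",
        category="apparel",
    )
    print(json.dumps(body, indent=2))
```

    The body would then be posted to the images:annotate endpoint with valid credentials; the ranked matches come back in the response for that request.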

  • Detect handwriting in images

    Handwriting detection with Optical Character Recognition (OCR)

    The Vision API can detect and extract text from images:

    • DOCUMENT_TEXT_DETECTION extracts text from an image (or file); the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.
    • One specific use of DOCUMENT_TEXT_DETECTION is to detect handwriting in an image.

  • Optical Character Recognition (OCR)

    The Vision API can detect and extract text from images. There are two annotation features that support optical character recognition (OCR):

    • TEXT_DETECTION detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words, and their bounding boxes.
    • DOCUMENT_TEXT_DETECTION also extracts text from an image, but the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.
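
    To make the difference between the two features concrete, here is a minimal sketch that builds a request for either feature and reads back the text. It assumes the REST images:annotate JSON shape: textAnnotations, whose first entry holds the entire extracted string, and fullTextAnnotation, which nests pages, blocks, paragraphs, words, and symbols with optional detectedBreak information. The helper names and the sample response are made up for illustration.

```python
import base64

# How detected break types are rendered when rebuilding paragraph text.
BREAK_TEXT = {"SPACE": " ", "SURE_SPACE": " ", "EOL_SURE_SPACE": "\n", "LINE_BREAK": "\n"}


def build_ocr_request(image_bytes, dense=False):
    """Build a request body; dense=True selects DOCUMENT_TEXT_DETECTION."""
    feature = "DOCUMENT_TEXT_DETECTION" if dense else "TEXT_DETECTION"
    return {"requests": [{
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": feature}],
    }]}


def entire_text(response):
    """Return the whole extracted string (first textAnnotations entry)."""
    annotations = response["responses"][0].get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""


def paragraph_texts(response):
    """Rebuild each paragraph from its symbols and detected breaks."""
    paragraphs = []
    doc = response["responses"][0].get("fullTextAnnotation", {})
    for page in doc.get("pages", []):
        for block in page.get("blocks", []):
            for para in block.get("paragraphs", []):
                chars = []
                for word in para.get("words", []):
                    for sym in word.get("symbols", []):
                        chars.append(sym["text"])
                        brk = sym.get("property", {}).get("detectedBreak", {})
                        chars.append(BREAK_TEXT.get(brk.get("type"), ""))
                paragraphs.append("".join(chars).strip())
    return paragraphs


# Made-up response fragment for illustration only.
SAMPLE = {"responses": [{
    "textAnnotations": [
        {"description": "Hi there"},
        {"description": "Hi"},
        {"description": "there"},
    ],
    "fullTextAnnotation": {"text": "Hi there\n", "pages": [{"blocks": [{"paragraphs": [{
        "words": [
            {"symbols": [
                {"text": "H"},
                {"text": "i", "property": {"detectedBreak": {"type": "SPACE"}}},
            ]},
            {"symbols": [
                {"text": "t"}, {"text": "h"}, {"text": "e"}, {"text": "r"},
                {"text": "e", "property": {"detectedBreak": {"type": "LINE_BREAK"}}},
            ]},
        ],
    }]}]}]},
}]}

if __name__ == "__main__":
    print(build_ocr_request(b"<raw image bytes>", dense=True)["requests"][0]["features"])
    print(entire_text(SAMPLE))
    print(paragraph_texts(SAMPLE))
```

    For handwriting or scanned documents, the dense=True request is the one to use, since the paragraph and break structure is what makes the text readable in order.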

Pricing

Pricing includes pay-per-use Cloud Vision API, scaling monthly charges for Vision API Product Search, and flat rates per node hour with free trials for AutoML Vision and AutoML Vision Edge.
