Thank you for choosing the royalty-free SimpleOCR! These instructions cover the basics of integrating SimpleOCR into your application.

SimpleOCR contains several groups of functions, including image manipulation, image I/O with TIFF files, image acquisition with TWAIN-compliant scanners, and of course, OCR. Note that SimpleOCR can read and create TIFF files containing bi-level (i.e. black & white) images. SimpleOCR creates TIFF files using the CCITT Group IV compression scheme, but it can read most bi-level TIFF images.

The source code examples are given in VB and C++. The function headers are given in C++, since this is the language SimpleOCR was originally written in. To translate a header to VB, replace all pointer variables with long integers and all char * with strings. Also, the ActiveX functions all have an “X” appended to the name (OCR->OCRX, LoadImg->LoadImgX, etc.). In the documentation, SimpleOCR refers to general library functions, while SimpleOCX refers specifically to the ActiveX control.

SimpleOCX is an ActiveX dynamic link library (DLL) that allows developers to quickly integrate the SimpleOCR functions from any ActiveX-compatible programming environment. SimpleOCX acts as a “wrapper” for the core SimpleOCR libraries. Hence, SimpleOCX is not a native ActiveX control; it only provides an ActiveX interface to the SimpleOCR functions contained in ocrdll.dll and dlltwain.dll. Programmers who need more efficient execution may forgo SimpleOCX.dll and interface directly with the core libraries.

Adding SimpleOCR to your application

The following instructions are provided for Visual Basic, but the implementation of SimpleOCR is similar in any development environment that uses ActiveX. Consult your documentation for language-specific instructions on how to integrate ActiveX DLLs.

  1. Ensure that SimpleOCX.dll has been properly registered by running: regsvr32.exe "c:\Program Files\SimpleOCR\simpleocx.dll" (the quotation marks are needed because the path contains a space)

  2. Add a reference to “SimpleOCX” using the Project/References menu

  3. You can now declare variables of type “SimpleOCR” and access all of the SimpleOCR functions through this object

Constants Used By SimpleOCR (VB)

Copy these constant declarations into a VB module.


Using the SimpleOCR ActiveX Control (VB)

This is an example of how to perform OCR on a multi-page image file using SimpleOCX and VB.


OCR on a Multipage TIFF Image File (C++)

This function processes several images stored in a TIFF file. The OCR results are stored in a text file.


This function uses an OCR output handler, defined as follows:


Displaying an Image (C++)

This function displays an image at coordinates x, y in a display context:


Scanning Documents (C++)

This function scans “n” images and creates a set with the scanned images.
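The call order that such a function follows can be sketched without a scanner attached. In the sketch below, the functions in the stub namespace are stand-ins that only log their names; they are not the SimpleOCR functions, and their parameter lists are simplified (the real ScanInit takes a window handle and the real ScanAndAddImage takes a SETOFIMG *). ScanPagesSketch and g_calls are hypothetical names, and skipping ScanEnd after a failed ScanInit is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Log of the calls made, so the order can be inspected afterwards.
static std::vector<std::string> g_calls;

// Stand-in stubs (NOT the SimpleOCR functions): each one only records its name.
namespace stub {
    void ScanResolution(int) { g_calls.push_back("ScanResolution"); }
    void ScanShowUI(int)     { g_calls.push_back("ScanShowUI"); }
    int  ScanInit()          { g_calls.push_back("ScanInit"); return 0; }
    int  ScanAndAddImage()   { g_calls.push_back("ScanAndAddImage"); return 0; }
    void ScanEnd()           { g_calls.push_back("ScanEnd"); }
}

// Scans up to n pages and returns the number of pages acquired.
int ScanPagesSketch(int n)
{
    assert(n >= 0);
    stub::ScanResolution(300);   // documented: call before ScanInit
    stub::ScanShowUI(0);         // documented: call before ScanInit
    if (stub::ScanInit() != 0) {
        return 0;                // assumption: nothing to terminate on failure
    }
    int acquired = 0;
    for (int i = 0; i < n; ++i) {
        if (stub::ScanAndAddImage() != 0) {
            break;               // nonzero means the acquisition failed
        }
        ++acquired;
    }
    stub::ScanEnd();             // always close a session that was opened
    return acquired;
}
```

With the real library, the set passed to ScanAndAddImage would first be created with CreateMultipleImg and later released with FreeMultipleImg.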


IMG

All the images manipulated by SimpleOCR have the IMG type. SimpleOCR provides functions that allow you to handle an IMG object as a Device Independent Bitmap (DIB). See your Windows SDK documentation to become familiar with DIB concepts.

IMG objects are always manipulated through SimpleOCR functions, so even if IMG objects are implemented as structures, you don't have to concern yourself with their definition. When programming in languages other than C++, substitute Long Integer data types for IMG *.

Please remember that SimpleOCR can only work with bi-level (i.e. black & white) images or grayscale images with 256 shades of gray.

SETOFIMG

Several images that belong to the same document can be grouped in a SETOFIMG object. SETOFIMG objects are always manipulated through SimpleOCR functions, so even if SETOFIMG objects are implemented as structures, you don't have to concern yourself with their definition.

When programming in languages besides C++, substitute SETOFIMG * with Long Integer data types.

AddDIB

int AddDIB(SETOFIMG * set, HGLOBAL hDib)

This function is similar to AddImage, except that, instead of an IMG object, this function expects an image in the DIB format.

Return Value

If the function fails, a nonzero value is returned.

Parameters

set

A pointer to a set of images of type SETOFIMG.

hDib

An HGLOBAL handle referencing a global memory object containing a BITMAPINFO structure followed by the bitmap bits.

Example


Related Functions: AddImage, InsertImage

AddImage

int AddImage(SETOFIMG * set, IMG * image)

Adds an image to a set of images. The new image is added at the last position.

Return Value

If the function fails, a nonzero value is returned.

Parameters

set

A pointer to a set of images of type SETOFIMG.

image

A pointer to the added image, of type IMG.

Example


Related Functions: AddDIB, InsertImage

CreateMultipleImg

SETOFIMG * CreateMultipleImg(void)

Creates an empty set of images.

Return Value

An empty set of images or NULL if the function fails.

Example


Related Functions: FreeMultipleImg

DelImage

void DelImage(SETOFIMG * set, int index)

Deletes and frees an image in a set.

Return Value

None

Parameters

set

A pointer to a set of images of type SETOFIMG.

index

Position of the deleted image, counted from 0.

Example


CountPixelsImg

int CountPixelsImg(IMG * img)

Count the number of black pixels in an image.

Return Value

Number of black pixels in the image

Parameters

img

Pointer to an image of type IMG.

Example


DeskewImg

int DeskewImg(IMG * img)

When a document has not been properly scanned, the resulting image can be skewed. This function analyses a skewed image and rotates it in order to fix the problem.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

img

Pointer to an image of type IMG.

Example


Related Functions: RotateImg

DIBToIMG

IMG * DIBToIMG(HGLOBAL hDib)

Converts a memory block containing a Device Independent Bitmap (DIB) to an IMG object.

Return Value

A pointer to an IMG object. If the function fails, the return value is NULL.

Parameters

hDib

An HGLOBAL handle referencing a global memory object containing a BITMAPINFO structure followed by the bitmap bits.

Comments

See your Windows SDK documentation for information about Device Independent Bitmaps and how to use them.

Example

Retrieve a DIB from the clipboard and convert it to an IMG object


EraseBlackBordersImg

int EraseBlackBordersImg(IMG * img)

Sometimes a scanned image has black borders. It happens frequently when the scanned document is smaller than the scanning area. This function detects and removes these black borders.

Return Value

On success, the function returns 0. If the function fails, a nonzero value is returned.

Parameters

img

Pointer to an image of type IMG.

Example


ExtractImgArea

IMG * ExtractImgArea(IMG * img, int x, int y, int w, int h)

Extracts a rectangular area from an image.

Return Value

A pointer to a new IMG object. If the function fails, the return value is NULL.

Parameters

img

Pointer to an image of type IMG.

x

x coordinate of the upper left corner of the area.

y

y coordinate of the upper left corner of the area.

w

area width in pixels.

h

area height in pixels.

Comments

The original image is left unchanged. The extracted image should be freed with the FreeImg function.

Example
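This documentation does not say how ExtractImgArea treats an area that extends past the image edges, so it can be prudent to clip the rectangle against the size reported by GetImgSize before making the call. The following is a minimal sketch of that clipping step only; Area and ClipArea are hypothetical helper names and are not part of SimpleOCR.

```cpp
#include <algorithm>
#include <cassert>

// Rectangle in pixel coordinates: upper left corner (x, y), width w, height h.
struct Area { int x, y, w, h; };

// Clips the requested area to an image of imgW x imgH pixels.
// Returns false when no part of the area lies inside the image;
// in that case the area is left unmodified.
bool ClipArea(int imgW, int imgH, Area& a)
{
    assert(imgW > 0 && imgH > 0);
    const int x0 = std::max(a.x, 0);
    const int y0 = std::max(a.y, 0);
    const int x1 = std::min(a.x + a.w, imgW);   // one past the right edge
    const int y1 = std::min(a.y + a.h, imgH);   // one past the bottom edge
    if (x1 <= x0 || y1 <= y0) {
        return false;
    }
    a.x = x0;
    a.y = y0;
    a.w = x1 - x0;
    a.h = y1 - y0;
    return true;
}
```

When ClipArea returns true, the resulting x, y, w and h values can be passed to ExtractImgArea.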


FixOrientationImg

int FixOrientationImg(IMG * img)

When a document has not been properly scanned, the resulting image can be of the wrong orientation. This function analyses an image in the wrong orientation, and rotates it the necessary 90, 180, or 270 degrees.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

img

Pointer to an image of type IMG.

Example


FreeImg

void FreeImg(IMG * img)

This function frees an existing image.

Return Value

None

Parameters

img

Pointer to an image of type IMG.

Example

The following example frees an image


FreeMultipleImg

void FreeMultipleImg(SETOFIMG * set)

Frees a set of images. All the images in the set are also freed.

Return Value

None

Parameters

set

A pointer to a set of images of type SETOFIMG.

Example


Related Functions : CreateMultipleImg

GetImage

IMG * GetImage(const SETOFIMG * set, int index)

Gets a pointer to a given image in a set of images.

Return Value

A pointer to the image at position index in the set.

Parameters

set

A pointer to a set of images of type SETOFIMG

index

Order of the image you want to access. The first image is at index 0.

Example


GetImgBitmap

unsigned char * GetImgBitmap(const IMG * img)

This function gets a pointer to the bitmap corresponding to the image. The bitmap is organized like a Device Independent Bitmap (DIB).

Return Value

A pointer to the bitmap that encodes the image.

Parameters

img

Pointer to an image of type IMG.

Comments

See your Windows SDK documentation for information about Device Independent Bitmaps and how to use them.

Example
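Because the bitmap is organized like a DIB, each scan line is padded to a multiple of 4 bytes, and in a 1-bit-per-pixel bitmap the leftmost pixel of each byte is the most significant bit. The sketch below shows only that addressing arithmetic. It assumes a bottom-up DIB (positive biHeight in the BITMAPINFO header) and a pointer to the first stored scan line; whether the pointer returned by GetImgBitmap meets both assumptions should be checked against the BITMAPINFO returned by GetImgBitmapInfo. DibStride and DibBit1bpp are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per scan line of a DIB: rows are padded to a multiple of 4 bytes.
std::size_t DibStride(int widthPixels, int bitsPerPixel)
{
    assert(widthPixels > 0 && bitsPerPixel > 0);
    return ((static_cast<std::size_t>(widthPixels) * bitsPerPixel + 31) / 32) * 4;
}

// Reads one pixel of a 1-bit-per-pixel, bottom-up DIB.
// (x, y) is counted from the top left corner of the image.
// Returns the palette index (0 or 1); which index means "black" depends on
// the color table stored in the BITMAPINFO structure.
int DibBit1bpp(const unsigned char* bits, int width, int height, int x, int y)
{
    assert(x >= 0 && x < width && y >= 0 && y < height);
    const std::size_t stride = DibStride(width, 1);
    // Bottom-up storage: the last image row is stored first.
    const unsigned char* row = bits + static_cast<std::size_t>(height - 1 - y) * stride;
    return (row[x / 8] >> (7 - (x % 8))) & 1;
}
```

For example, a bi-level image 33 pixels wide occupies 8 bytes per scan line, not 5.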


Related Functions: GetImgBitmapInfo, GetImgBitmapSize

GetImgBitmapInfo

LPBITMAPINFO GetImgBitmapInfo(const IMG * img)

This function gets a pointer to the BITMAPINFO structure corresponding to the image.

Return Value

A pointer to a BITMAPINFO structure.

Parameters

img

Pointer to an image of type IMG.

Comments

See your Windows SDK documentation for obtaining information about the BITMAPINFO structure and how to use it.

Example


Related Functions: GetImgBitmap, GetImgBitmapSize

GetImgBitmapSize

int GetImgBitmapSize(const IMG * img)

This function returns the size in bytes of the bitmap corresponding to the image.

Return Value

Bitmap size in bytes.

Parameters

img

Pointer to an image of type IMG.

Comments

Example


Related Functions: GetImgBitmapInfo, GetImgBitmap

GetImgRes

void GetImgRes(const IMG * img, int * pw, int * ph)

This function allows you to get the horizontal and vertical resolution of an image.

Return Value

None

Parameters

img

Pointer to an image of type IMG.

pw

Pointer to an integer that will contain the image horizontal resolution, given in Dots Per Inch (DPI).

ph

Pointer to an integer that will contain the image vertical resolution, given in DPI.

Example


Related Functions: GetImgSize

GetImgSize

void GetImgSize(const IMG * img, int * pw, int * ph)

This function allows you to get the size of an image, given in pixels.

Return Value

None

Parameters

img

Pointer to an image of type IMG.

pw

Pointer to an integer that will contain the image width in pixels

ph

Pointer to an integer that will contain the image height in pixels

Example

The following example retrieves an image size


Related Functions: GetImgRes
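The pixel size from GetImgSize and the resolution from GetImgRes can be combined to obtain the physical size of the scanned page: inches = pixels / DPI. A minimal sketch of that arithmetic follows; PageSizeInches and PhysicalSize are hypothetical helper names, and returning 0 for a missing resolution is a choice made for this sketch only.

```cpp
// Physical page size in inches.
struct PageSizeInches { double width; double height; };

// Converts a size in pixels to inches using the horizontal and vertical
// resolution in DPI. A dimension whose resolution is not positive is
// reported as 0 rather than dividing by zero.
PageSizeInches PhysicalSize(int widthPx, int heightPx, int dpiX, int dpiY)
{
    PageSizeInches s;
    s.width  = (dpiX > 0) ? static_cast<double>(widthPx)  / dpiX : 0.0;
    s.height = (dpiY > 0) ? static_cast<double>(heightPx) / dpiY : 0.0;
    return s;
}
```

For instance, an image of 2550 x 3300 pixels at 300 DPI in both directions corresponds to a page of 8.5 x 11 inches.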

GetNbImages

int GetNbImages(const SETOFIMG * set)

Returns the number of images that a set of images contains.

Return Value

The number of images in the set

Parameters

set

A pointer to a set of images of type SETOFIMG.

Example


HalfSizeImg

IMG * HalfSizeImg(IMG * img)

Shrinks a bi-level image to 50% of the original size and returns the result in a grayscale image.

Return Value

A pointer to a new IMG object. If the function fails, the return value is NULL.

Parameters

img

Pointer to an image of type IMG.

Comments

The original image is left unchanged. If you don't need it anymore, you have to free it by calling the FreeImg function. This function is mainly useful when you want to display a reduced bi-level image with good display quality.

Example


Related Functions: ResizeImg, ShrinkImg

InsertImage

int InsertImage(SETOFIMG * set, int index, IMG * image)

Inserts an image into a set of images at a given position.

Return Value

If the function fails, a nonzero value is returned.

Parameters

set

A pointer to a set of images of type SETOFIMG.

index

Position of the inserted image, counted from 0.

image

Inserted image.

Example


Related Functions: AddImage, AddDIB

InvertImg

int InvertImg(IMG * img)

Inverts an image (black pixels become white and white pixels become black).

Return Value

If the function fails, a nonzero error code is returned.

Parameters

img

Pointer to an image of type IMG.

Example


LoadImg

IMG * LoadImg(const char * filename)

Loads an image from a TIFF file.

Return Value

A pointer to the loaded image or NULL if the function fails.

Parameters

filename

TIFF file name

Comments

When you don't need the loaded image anymore, you have to free it by calling the FreeImg function. If you load a multiple image TIFF file, only the first image stored in the file is loaded. You have to use LoadMultipleImg for handling multiple image files.

Example


LoadMultipleImg

SETOFIMG * LoadMultipleImg(const char * filename)

Loads a set of images from a TIFF file.

Return Value

A pointer to the loaded set or NULL if the function fails.

Parameters

filename

TIFF file name.

Comments

When you don't need the set anymore, you have to free it by calling the FreeMultipleImg function. If your TIFF files contain only one image, you should use LoadImg.

Example


Related Functions: SaveMultipleImg, LoadImg

OCR

int OCR(const IMG * img, int noisy)

Recognizes the text located in an image.

Return Value

A non zero error code if the function fails.

Parameters

img

A pointer to the image to process.

noisy

Non zero value if the image is noisy (i.e. contains a lot of speckles)

Related Functions: OCROnArea

OCROnArea

int OCROnArea(const IMG * img, int noisy)

Recognizes the text located in an image that contains a unique text area. This function doesn't do any layout analysis on the area. The image containing the area is usually extracted from a page with ExtractImgArea.

Return Value

A non zero error code if the function fails.

Parameters

img

A pointer to the image to process.

noisy

A non zero value if the image is noisy (i.e. contains a lot of speckles)

Related Functions: ExtractImgArea, OCROnArea2

OCROnArea2

int OCROnArea2(const IMG * img, int noisy, int startprogress, int endprogress)

This function is similar to OCROnArea but allows you to give starting and ending values for the progress percentage. It is useful when you want to display a progress bar while processing several areas.

Return Value

A non zero error code if the function fails.

Parameters

img

A pointer to the image to process.

noisy

A non zero value if the image is noisy (i.e. contains a lot of speckles)

startprogress

Starting value for the progress percentage.

endprogress

Ending value for the progress percentage.
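When several areas of one page are processed in sequence, the 0 to 100 range can be divided among them so that a single progress bar advances steadily. The following is a minimal sketch of that arithmetic, assuming every area is given equal weight; ProgressRange and RangeForArea are hypothetical helper names and are not part of SimpleOCR.

```cpp
#include <cassert>

// Start and end of the progress range assigned to one area.
struct ProgressRange { int start; int end; };

// Splits 0..100 evenly among `count` areas and returns the slice for the
// area at position `index` (counted from 0). Integer division is used, so
// the slices are contiguous and the last one always ends at 100.
ProgressRange RangeForArea(int index, int count)
{
    assert(count > 0 && index >= 0 && index < count);
    ProgressRange r;
    r.start = (index * 100) / count;
    r.end   = ((index + 1) * 100) / count;
    return r;
}
```

For three areas this gives 0 to 33, 33 to 66 and 66 to 100; each pair would be passed as startprogress and endprogress to OCROnArea2.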

Related Functions: OCROnArea, ExtractImgArea, OCRSetProgressHandler

OCRSetOutputHandler

OCROutputHandler OCRSetOutputHandler(OCROutputHandler handler)

When the output mode is OM_TEXT or OM_RICHTEXT, a user defined function of type OCROutputHandler will be called by the OCR engine for each “OCR event”.

Return Value

Previously selected output handler.

Parameters

handler

New OCR Output handler function.

Comments

If the output mode is OM_TEXT, the OCR events OT_PROP, OT_ITAL, OT_UNDS, OT_SIZE, OT_HILT and OT_BITM are not sent to the output handler.

An OCROutputHandler has the following form:

void AnOCRHandler(int event, int param);

where event is the code of the “OCR event” and param is a value associated with the event.

The OCR events are:

OT_TEXT

A character has been recognized. param contains the ASCII code of the recognized character.

OT_PROP

The font type has changed (proportional or non proportional font). param is nonzero if the font is proportional.

OT_ITAL

Switches italic mode on or off. param is nonzero if the following characters are italic.

OT_UNDS

Switches underscored mode on or off. param is nonzero if the following characters are underscored.

OT_SIZE

Changes the character size. param contains the font size for the following characters.

OT_HILT

Changes the character color.
If param contains 1, the following word is not in the dictionary. If param contains 2, the following word has not been well recognized.

OT_ENDL

An end of line has been reached.

OT_ENDZ

An end of text area has been reached.

OT_BITM

An image has been recognized. (IMG *) param is a pointer to the image.
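As an illustration of the handler form described above, the following sketch collects the recognized text into a string. The numeric values given to OT_TEXT, OT_ENDL and OT_ENDZ here are placeholders so that the sketch compiles on its own; the real values come from the SimpleOCR constant declarations, and this enum should be removed when those are available. g_ocrText and CollectTextHandler are hypothetical names, and writing a blank line at the end of a text area is a choice made for this sketch.

```cpp
#include <string>

// Placeholder event codes (NOT the real SimpleOCR values).
enum SketchOcrEvent {
    OT_TEXT = 0,
    OT_ENDL = 1,
    OT_ENDZ = 2
};

// Buffer that receives the recognized text.
static std::string g_ocrText;

// Follows the documented form: void AnOCRHandler(int event, int param).
void CollectTextHandler(int event, int param)
{
    switch (event) {
    case OT_TEXT:
        g_ocrText += static_cast<char>(param);  // param is the character code
        break;
    case OT_ENDL:
        g_ocrText += '\n';                      // end of line
        break;
    case OT_ENDZ:
        g_ocrText += '\n';                      // end of text area: blank line
        break;
    default:
        break;                                  // other events are ignored
    }
}
```

With the library linked, such a handler would be registered by calling OCRSetOutputHandler(CollectTextHandler) before OCR is called.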

Related Functions: SetOutputMode, OCRSetOutputCharHandler

OCRSetOutputCharHandler

OCROutputCharHandler OCRSetOutputCharHandler(OCROutputCharHandler handler)

When the output mode is OM_TEXT or OM_RICHTEXT, a user-defined function of type OCROutputCharHandler will be called by the OCR engine for each “real character” (i.e. not for EOLs and spaces). This function is called immediately after OCROutputHandler is called with the event OT_TEXT.

Return Value

Previously selected output handler.

An OCROutputCharHandler has the following form:

void AnOCRCharHandler(int ch, int conf, int left, int top, int width, int height);

ch

ASCII code of recognized character.

conf

Confidence level of the recognized character, from 0 to 100. The higher the value, the more certain the engine is about the recognized character.

left, top

Coordinates (in pixels) of the character in the original image.

width, height

width and height in pixels of recognized character.
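One use for this handler is to record the position of characters the engine was unsure about, so that they can be highlighted for review. The sketch below follows the handler form given above; CharBox, g_lowConfidence, FlagUncertainChars and the threshold of 60 are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

// One recognized character together with its position in the original image.
struct CharBox { int ch, conf, left, top, width, height; };

// Characters whose confidence fell below the threshold.
static std::vector<CharBox> g_lowConfidence;

// Hypothetical threshold; tune it for your documents.
static const int kMinConfidence = 60;

// Follows the documented form:
// void AnOCRCharHandler(int ch, int conf, int left, int top, int width, int height).
void FlagUncertainChars(int ch, int conf, int left, int top, int width, int height)
{
    assert(conf >= 0 && conf <= 100);   // documented range of conf
    if (conf < kMinConfidence) {
        CharBox box = { ch, conf, left, top, width, height };
        g_lowConfidence.push_back(box);
    }
}
```

With the library linked, the handler would be registered by calling OCRSetOutputCharHandler(FlagUncertainChars).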

Related Functions: OCRSetOutputHandler

OCRSetProgressHandler

OCRProgressHandler OCRSetProgressHandler(OCRProgressHandler handler)

When the OCR engine processes a document, a user-defined function of type OCRProgressHandler is called several times.

Return Value

Previously selected progress handler.

Parameters

handler

New OCR Progress handler function.

Comments

An OCRProgressHandler has the following form:

int AProgressHandler(int percent);

where percent is the percentage of the job completed at the time of the call. This value is between 0 and 100.

Defining such a function allows an application to display a progress bar. With this function, it's also possible to interrupt the OCR process. If the progress handler returns a non zero value, the OCR process is stopped.
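A progress handler that supports cancellation can be sketched as follows. The flag would typically be set from the user interface (for example, by a Cancel button); g_cancelRequested, g_lastPercent and CancellableProgress are hypothetical names, and clamping the value is a defensive choice made for this sketch.

```cpp
#include <atomic>

// Set to true (for example, from a Cancel button) to stop the OCR process.
static std::atomic<bool> g_cancelRequested(false);

// Last percentage reported, ready to be shown in a progress bar.
static int g_lastPercent = 0;

// Follows the documented form: int AProgressHandler(int percent).
// Returns nonzero to ask the engine to stop, zero to continue.
int CancellableProgress(int percent)
{
    if (percent < 0)   percent = 0;     // keep the value usable by a progress bar
    if (percent > 100) percent = 100;
    g_lastPercent = percent;
    return g_cancelRequested.load() ? 1 : 0;
}
```

With the library linked, the handler would be registered by calling OCRSetProgressHandler(CancellableProgress).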

OCRSetTemplate

void OCRSetTemplate(const char * theTemplate)

Sets the template for use in template matching during the OCR process.

Return Value

None

Parameters

theTemplate

String containing the template to use in OCR template matching.

Comments

The templates consist of the following:

# – Number
A – Letter
X – Any character
? – Optional character.
 – Use at end of template.
Other – Must match character

Providing a string of zero length, a NULL value, or the number zero will turn off template matching. Template recognition can be improved by limiting the character set to only those characters that will appear in the strings matched by the template.
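To make the template symbols more concrete, the sketch below checks a string against a template using one plausible reading of the #, A and X symbols and of literal characters. It is not the OCR engine's matching code and may differ from it; it handles fixed-length templates only, and it treats “?” and the end-of-template marker as ordinary literals because their exact behavior is not specified here. MatchesFixedTemplate is a hypothetical name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true when `text` has the same length as `tmpl` and every character
// satisfies the corresponding template symbol:
//   '#' : a digit        'A' : a letter        'X' : any character
//   anything else : the same character must appear in the text.
bool MatchesFixedTemplate(const std::string& tmpl, const std::string& text)
{
    if (tmpl.size() != text.size()) {
        return false;
    }
    for (std::size_t i = 0; i < tmpl.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        switch (tmpl[i]) {
        case '#':
            if (!std::isdigit(c)) return false;
            break;
        case 'A':
            if (!std::isalpha(c)) return false;
            break;
        case 'X':
            break;                              // any character is accepted
        default:
            if (tmpl[i] != text[i]) return false;
            break;
        }
    }
    return true;
}
```

Under this reading, the template "##/##/####" accepts "12/31/1999", and "AA-###" accepts "AB-123" but rejects "A1-123".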

OCRLimitCharsTo

void OCRLimitCharsTo(const char * charsToLimit)

Sets the characters that the OCR output will be limited to.

Return Value

None

Parameters

charsToLimit

string containing the characters that the OCR output will be limited to

Comments

There are default limited character sets that are defined as follows:

#define LC_NUMERIC       1  // only numbers
#define LC_ALPHABETIC    2  // only letters
#define LC_ALPHANUMERIC  3  // no punctuation
#define LC_UCASE         4  // all uppercase
#define LC_LCASE         5  // all lowercase
#define LC_NONNUMERIC    6  // no numbers

Passing the function a string of zero length, a NULL value, or a zero will turn off the limiting of characters.

ReplaceImage

int ReplaceImage(SETOFIMG * set, int index, IMG * image)

Replaces an image in a set of images at a given position.

Return Value

If the function fails, a nonzero value is returned.

Parameters

set

A pointer to a set of images of type SETOFIMG.

index

Position of the replaced image, counted from 0.

image

A pointer to the new image.

Comments

The replaced image is automatically freed.

Example


ResizeImg

IMG * ResizeImg(IMG * img, int nw, int nh)

Resizes an image.

Return Value

A pointer to a new IMG object. If the function fails, the return value is NULL.

Parameters

img

Pointer to an image of type IMG.

nw

Width of the new image.

nh

Height of the new image.

Comments

The original image is left unchanged. If you don't need it anymore, you have to free it by calling the FreeImg function.

Example


Related Functions: ShrinkImg, HalfSizeImg

RotateImg

int RotateImg(IMG * img, int angle)

Rotates an image.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

img

Pointer to an image of type IMG.

angle

Rotation angle, given in degrees.

Comments

Example


SaveImg

int SaveImg(const char * filename, const IMG * img)

Saves an image to a TIFF file.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

filename

Name of the TIFF file you want to create.

img

A pointer to the image to be saved.

Comments

Example


Related Functions: LoadImg

SaveMultipleImg

void SaveMultipleImg(const char * filename, SETOFIMG * set)

Saves a set of images to a TIFF file.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

filename

Name of the TIFF file you want to create.

set

A pointer to the set of images to be saved.

Comments

Example


Related Functions: LoadMultipleImg

ScanAndAddImage

int ScanAndAddImage(SETOFIMG * set)

Acquires a new image and adds it to a previously created image set. The scanning session should have been initialized with ScanInit.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

set

A pointer to a set of images of type SETOFIMG.

Related Functions: ScanImg

ScanAutoBright

void ScanAutoBright(int automode)

Selects “Autobright” mode and lets the scanner determine an optimal brightness level. (Recommended)

Return Value

None

Parameters

automode

Mode:

nonzero value

Select the “autobright” mode

zero value

Unselect the “autobright” mode. In this case, you may select the brightness level by using ScanBrightness.

Related Functions: ScanBrightness

ScanAvailable

int ScanAvailable(void)

Detects whether a scanner is connected to the computer.

Return Value

A nonzero value if a scanner is available.

Comments

The connected scanner must be TWAIN compliant, and the corresponding 32-bit TWAIN driver must be properly installed.

ScanBrightness

void ScanBrightness(int brightness)

Changes scanning brightness

Return Value

None

Parameters

brightness

A value between -1000 (dark) and 1000 (light).

Related Functions: ScanAutoBright

ScanEnd

void ScanEnd(void)

Terminates a scanning session.

Return Value

None

Related Functions: ScanInit

ScanImg

IMG * ScanImg(void)

Acquires a new image. The scanning session should have been initialized with ScanInit.

Return Value

A pointer to the scanned image or NULL if the function fails.

Related Functions: ScanAndAddImage

ScanInit

int ScanInit(HWND hWnd)

Initializes the image acquisition process.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

hWnd

Your application's main window handle.

Related Functions: ScanEnd

ScanResolution

void ScanResolution(int resolution)

Sets the scanning resolution. This function should be called before ScanInit.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

resolution

Scanning resolution in DPI. (default = 300 DPI)

Related Functions: ScanBrightness

ScanSelect

void ScanSelect(HWND hWnd)

Lets the user select a given scanner if several scanners are connected to the computer.

Parameters

hWnd

Your application's main window handle.

ScanShowUI

void ScanShowUI(int mode)

Indicates if SimpleOCR should use the scanner user interface or not. This function should be called before ScanInit.

Return Value

If the function fails, a nonzero error code is returned.

Parameters

mode

The selected mode.

nonzero

Use user interface (default).

zero

Don't use the user interface.

SetLanguage

void SetLanguage(int language, const char* dictDir)

Selects the language used in the text you want to process.

Return Value

None

Parameters

language

A value among:

ENGLISH for English language.
FRENCH for French language.
DUTCH for Dutch language.
ENGUK for UK English language.
CUSTOM for Custom dictionary.
NONE for No language selected.

dictDir

The directory where the dictionary files are stored.

Comments

If a language has been selected, the OCR process will use a dictionary in order to improve the OCR results.

SetOutputMode

void SetOutputMode(int mode)

Selects the output mode for the OCR engine.

Return Value

None

Parameters

mode

A value among:

OM_TEXT

The engine will output only the text.

OM_RICHTEXT

The engine will output the text and additional information such as character format, character size, and font type.

OM_WINDOW

The engine will output only the text, directly into a window, as if it were typed on the keyboard.

SetOutputWindow

void SetOutputWindow(HWND hWnd)

When the output mode OM_WINDOW has been selected with the SetOutputMode function, this function allows you to indicate the window to which the text will be sent.

Return Value

None

Parameters

hWnd

A window handle.

Related Functions: SetOutputMode

ShrinkImg

IMG * ShrinkImg(IMG * img, int nw, int nh)

Shrinks a bi-level image and returns the result in a grayscale image.

Return Value

A pointer to a new IMG object. If the function fails, the return value is NULL.

Parameters

img

Pointer to an image of type IMG.

nw

Width of the new image.

nh

Height of the new image.

Comments

The original image is left unchanged. If you don't need it anymore, you have to free it by calling the FreeImg function. This function is mainly useful when you want to display a reduced bi-level image with a good display quality.

Example


Related Functions: ResizeImg, HalfSizeImg

Information in this document is subject to change without notice and said changes may not be reflected herein. ScanStore.com and its parent company, Meta Enterprises, LLC, may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. The furnishing of this document does not grant you a license to these patents, trademarks, copyrights, or other intellectual property except as expressly provided in a written license agreement from Meta Enterprises, LLC.
