IMPACT ABBYY FineReader 10 Binarisation Service

Produced by: ABBYY



Scenario


Use this toolkit when you are building your own OCR workflow from tools supplied by different vendors. If you need an end-to-end solution, use the ABBYY FineReader Engine instead.

This part of the toolkit was developed and released to support the ABBYY FineReader Engine 10 for Windows.

Abstract


Binarisation is the transformation of a colour or greyscale image into a black-and-white image. It is applied before OCR and is intended to emphasise the difference between text and background content: the contrast between black and white lets an OCR engine distinguish significant text detail from the background more easily.
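
As a rough illustration of the idea only, the sketch below binarises a tiny greyscale image with a single global threshold. It is not the method used by the ABBYY service: the function name `binarise`, the default threshold of 128 and the sample pixel values are all assumptions made for this example.

```python
def binarise(gray, threshold=128):
    """Map each greyscale pixel (0-255) to black (0) or white (255).

    Hypothetical helper for illustration; a plain global threshold,
    not the algorithm used by the FineReader binarisation service.
    """
    return [[0 if px < threshold else 255 for px in row] for row in gray]


# A tiny 3x4 greyscale "page": dark ink (about 30) on light paper (about 220).
page = [
    [220, 30, 215, 225],
    [218, 35, 40, 222],
    [221, 219, 28, 224],
]

binary = binarise(page)

for row in binary:
    # Print ink as '#' and background as '.' to make the result readable.
    print("".join("#" if px == 0 else "." for px in row))
```

After this step every pixel is either 0 or 255, so the ink pixels stand out from the paper with maximum contrast.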

There are different types and levels of binarisation that can be applied, and not all of them will be appropriate for every image: careless binarisation can, for instance, effectively delete softly inked text from a digital image, making it unreadable.
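
The risk described above can be sketched with the same kind of global threshold. The example assumes a single row of pixels containing bold ink (value 40), faint ink (value 150) and paper (value 230); these values, the two thresholds and the helper names are hypothetical and chosen only to show the effect.

```python
def binarise(gray, threshold):
    """Global threshold: pixels darker than `threshold` become black (0)."""
    return [[0 if px < threshold else 255 for px in row] for row in gray]


def count_ink(binary):
    """Count the black pixels that survive binarisation."""
    return sum(px == 0 for row in binary for px in row)


# Assumed sample values: paper = 230, bold ink = 40, faint ink = 150.
page = [[230, 40, 230, 150, 230, 150, 40, 230]]

careless = binarise(page, threshold=100)  # too strict: faint ink turns white
careful = binarise(page, threshold=190)   # keeps both bold and faint ink

print("ink pixels kept (threshold 100):", count_ink(careless))
print("ink pixels kept (threshold 190):", count_ink(careful))
```

With the stricter threshold only the two bold pixels remain black and the two faint ones are lost. The more lenient threshold keeps all four. Real pages have uneven lighting and staining, so a single value chosen by hand rarely suits a whole collection.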

[Figure: two example pages, each shown in three panels: Original page, Old binarisation (FR9), New binarisation (FR10).]

A 24-bit colour image of a page with its binarised equivalent. Note the greater contrast between text and background in the binarised image: this makes it easier for an OCR engine to pick out and identify text content.

Publications

The following screencast explains the use of binarisation in the production of OCR documents, and introduces the IMPACT Project’s Binarisation Tool – a modular and adaptable toolkit that can be used to find the best type of binarisation for particular works, and apply it across a collection.

Willms, L.

An Introduction to the IMPACT Binarisation Tools developed by ABBYY from IMPACT Centre of Competence on Vimeo.

Availability

The IMPACT ABBYY FineReader 10 Binarisation Service is available under the ABBYY FineReader Engine 10 commercial licence. For further information on licensing, please contact ABBYY’s European Office.
