Tools for text digitisation

More than 250 state-of-the-art tools for text digitisation.

283 results

Tools

abbyy binarisation and colour reduction

  • Description: Use this toolkit when building your own OCR workflow out of various tools from various vendors.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: abbyy

abbyy block segmentation

  • Description: Before characters and words can be recognised by an OCR engine, the print space of the image has to be identified, and from there the paragraphs and lines. This tool can be used to identify blocks on a scanned document.
  • Group: image processing
  • Type: image segmentation
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: abbyy

abbyy finereader engine 10

  • Description: State-of-the-art OCR engine
  • Group: text recognition
  • Type: core text recognition
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: abbyy

blacklab

  • Description: BlackLab is a corpus retrieval engine built on top of Apache Lucene. It allows fast, complex searches with accurate hit highlighting on large, tagged and annotated bodies of text. It was developed at the Institute of Dutch Lexicology (INL) to provide a fast and feature-rich search interface on our historical and contemporary text corpora.
  • Group: text processing
  • Type: nlp tools
  • Subtype: language identification
  • License:
  • Language: n/a
  • Developer: ivdnt

character segmentation

  • Description: The developed methodology takes as input isolated words and separates them into characters.
  • Group: image processing
  • Type: image segmentation
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: national center for scientific research (ncsr) "demokritos"

collaborative correction platform (concert)

  • Description: A web-based platform for validating and correcting OCR results, suitable for massive volunteer participation.
  • Group: text recognition
  • Type: postcorrection
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: ibm israel - science and technology ltd

cs499ocr

  • Description: Performs OCR with image processing and statistical pattern recognition.
  • Group: Text Recognition
  • Type: Core Text Recognition
  • Subtype: -
  • License: GPL
  • Language: -
  • Developer: -

cue.language

  • Description: cue.language is a small library of Java code and resources that provides the following basic natural-language processing capabilities
  • Group: Text Processing
  • Type: -
  • Subtype: NLP toolset and resources
  • License: Apache License
  • Language: Arabic, Catalan, Croatian, Czech, Dutch, Danish, English, Esperanto, Farsi, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Latin, Norwegian, Polish, Portuguese, Romanian, Russian, Slovenian, Slovak, Spanish, Swedish, Turkish
  • Developer: Jonathan Feinberg

cuneiform

  • Description: Cuneiform is an OCR system. In addition to text recognition it also does layout analysis and text format recognition. Cuneiform supports several languages.
  • Group: text recognition
  • Type: core text recognition
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: -

cutouts

  • Description: Cutouts is a web application for crowdsourcing the preparation of training data for the Tesseract OCR engine.
  • Group: text recognition
  • Type: postcorrection
  • Subtype: utilities for training and customization
  • License:
  • Language: n/a
  • Developer: poznań supercomputing and networking center

digilib

  • Description: Digilib is a web-based client/server image viewing environment for the internet.
  • Group: Miscellaneous Utilities
  • Type: -
  • Subtype: creating presentation version
  • License: GNU GPL
  • Language: -
  • Developer: Max-Planck-Institute for the History of Science, the Bibliotheca Hertziana, the University of Bern

document deskewer

  • Description: Generic skew detection and correction (for the full range of 0-360 degrees) for documents printed using Roman scripts.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: fraunhofer iais

