Tools for text digitisation

More than 250 state-of-the-art tools for text digitisation.

283 results

Tools

fraunhofer iais mydec color binarize

  • Description: Color binarize separates letters from the background. Grayscale images are converted to binary. The optimal contrast for the separation can be calculated either for the entire image or for each pixel.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: image enhancement
  • License:
  • Language: n/a
  • Developer:

fraunhofer iais mydec color binarize

  • Description: Color binarize separates letters from the background. Grayscale images are converted to binary. The optimal contrast for the separation can be calculated either for the entire image or for each pixel.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: image enhancement
  • License:
  • Language: n/a
  • Developer: the université françois-rabelais in tours
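The difference between a threshold computed for the entire image and one computed per pixel can be illustrated with a minimal sketch. This is not the Fraunhofer implementation: the function names, the mean-based thresholds, and the `radius` and `offset` parameters are assumptions chosen for illustration only.

```python
def binarize_global(img, threshold=None):
    """Binarize with one threshold for the whole image.

    Assumption: when no threshold is given, the mean gray value is used.
    Returns 1 for background (bright) and 0 for ink (dark).
    """
    pixels = [p for row in img for p in row]
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [[1 if p > threshold else 0 for p in row] for row in img]


def binarize_local(img, radius=1, offset=10):
    """Binarize each pixel against the mean of its own neighbourhood.

    Assumption: a square window of side 2*radius+1, clipped at the image
    border; `offset` keeps uniform regions from being marked as ink.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            row.append(1 if img[y][x] > local_mean - offset else 0)
        out.append(row)
    return out


# A single scan line with uneven lighting: dark half on the left,
# bright half on the right, one darker "ink" pixel in each half.
line = [[100, 60, 100, 220, 180, 220]]
global_result = binarize_global(line)
local_result = binarize_local(line)
```

On unevenly lit scans the global variant tends to mark the whole dark half as ink, while the per-pixel variant can still pick out the darker pixel in each half, at the cost of some errors where the lighting changes abruptly.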

fraunhofer newspaper segmenter

  • Description: The Korrektor is a manual post-correction tool for automatically processed newspaper scans. By loading the result XML files into the software, it is possible to correct automatically detected layout elements, texts and other properties. The scanned documents are displayed in two separate windows to allow for a detailed inspection. Results can be edited using context menus, drag and drop and keyboard shortcuts.
  • Group: layout analysis
  • Type: nlp tools
  • Subtype: 0
  • License:
  • Language: n/a
  • Developer:
  • Wiki

functional extension parser

  • Description: The Functional Extension Parser (FEP) is a Document Understanding Software tool capable of decoding layout elements of books. Based on the output of Optical Character Recognition, layout elements such as page numbers, running titles, headings and footnotes are detected and annotated.
  • Group: layout analysis
  • Type: nlp tools
  • Subtype:
  • License:
  • Language: n/a
  • Developer: university of innsbruck

gamera ocr

  • Description: OCR toolkit for Gamera: This is a Gamera toolkit for building standard text recognition applications. It is based on the Gamera framework and requires a working Gamera installation.
  • Group: text recognition
  • Type: core text recognition
  • Subtype: framework
  • License:
  • Language: n/a
  • Developer: -

gimp

  • Description: GIMP is the GNU Image Manipulation Program. It is a freely distributed piece of software for such tasks as photo retouching, image composition and image authoring.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: -

gocr

  • Description: GOCR is an OCR (Optical Character Recognition) program developed under the GNU Public License. It converts scanned images of text back to text files.
  • Group: text recognition
  • Type: core text recognition
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: -

hOCR

  • Description: HOCR is a Hebrew optical character recognition library.
  • Group: Text Recognition
  • Type: Core Text Recognition
  • Subtype: -
  • License: GPLv3
  • Language: -
  • Developer: -

hOCR tools

  • Description: hOCR is a format for representing OCR output, including layout information, character confidences, bounding boxes and style information. It embeds this information invisibly in standard HTML. By building on standard HTML, it automatically inherits well-defined support for most scripts, languages and common layout options. Furthermore, unlike previous OCR formats, the recognized text and OCR-related information co-exist in the same file and survive editing and manipulation. hOCR markup is independent of the presentation.
  • Group: Miscellaneous Utilities
  • Type: -
  • Subtype:
  • License: ASL 2.0
  • Language: -
  • Developer: -
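As a rough illustration of how OCR information is embedded in HTML, the sketch below reads word text, bounding boxes and confidences from a small hOCR fragment using only the Python standard library. It is not part of hOCR tools: the sample markup and the names `parse_title`, `HocrWordParser` and `extract_words` are made up for this example, and the parser assumes that word elements are `span` elements that contain no nested `span`.

```python
from html.parser import HTMLParser


def parse_title(title):
    """Split an hOCR title attribute such as 'bbox 10 20 90 50; x_wconf 96'."""
    props = {}
    for part in title.split(";"):
        fields = part.split()
        if not fields:
            continue
        name, values = fields[0], fields[1:]
        if name == "bbox":
            props[name] = tuple(int(v) for v in values)
        elif name == "x_wconf" and values:
            props[name] = float(values[0])
    return props


class HocrWordParser(HTMLParser):
    """Collect text, bounding box and confidence of every ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # word being read, or None outside a word

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            props = parse_title(attrs.get("title") or "")
            self._current = {
                "text": "",
                "bbox": props.get("bbox"),
                "conf": props.get("x_wconf"),
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        # Assumption: a word span holds no nested span elements.
        if tag == "span" and self._current is not None:
            self.words.append(self._current)
            self._current = None


def extract_words(hocr):
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical hOCR fragment, for illustration only.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 800 600'>
<span class='ocr_line' title='bbox 10 20 300 50'>
<span class='ocrx_word' title='bbox 10 20 90 50; x_wconf 96'>Hello</span>
<span class='ocrx_word' title='bbox 100 20 200 50; x_wconf 88'>world</span>
</span></div>"""

words = extract_words(SAMPLE)
```

Because the layout data lives in `class` and `title` attributes, the same file still renders as ordinary HTML in a browser, which is the property the description refers to.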

imagemagick / graphicsmagick

  • Description: ImageMagick is a software suite to create, edit, compose or convert bitmap images. GraphicsMagick is the Swiss army knife of image processing. It has been derived from ImageMagick 5.5.2.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: ner
  • License:
  • Language: n/a
  • Developer: imagemagick studio / graphicsmagick group

jmet2ont

  • Description: A tool that makes it possible to transform metadata from a traditional XML-based schema to RDF/OWL. Mappings are described with XML. Existing mappings used in SYNAT transform traditional library/museum formats to the CIDOC CRM/FRBRoo ontology.
  • Group: Metadata Processing
  • Type: -
  • Subtype: Format transformation (XML)
  • License: GPL
  • Language: -
  • Developer: Poznań Supercomputing and Networking Center

kakadu

  • Description: Kakadu is a complete implementation of the JPEG2000 standard, Part 1 (i.e., ISO/IEC 15444-1), plus a great deal of Parts 2 and 3. The Kakadu software framework provides a solid foundation for a range of commercial and non-commercial applications.
  • Group: image processing
  • Type: image processing and enhancement
  • Subtype: image enhancement
  • License:
  • Language: n/a
  • Developer:

