State-of-the-art Tools for Text Digitisation: Tutorial @ TPDL 2013

Sebastian Kirch, Succeed

On the 22nd of September, the tutorial State-of-the-art Tools for Text Digitisation was held in Valletta (Malta) as part of the TPDL conference. The tutorial, organised by the Succeed project, attracted 15 participants (a significant number for this type of event), mainly from libraries and archives in Europe, America and Africa.

Most of the participants did not have a technical background; they were librarians and archivists interested in the possibilities of digitisation and in the elements and challenges to consider when starting a digitisation project. The session was organised in six sections:

  1. Introduction to text digitisation process
  2. Image enhancement
  3. OCR and Post-correction
  4. Logical structure analysis
  5. Lexicon-building, Deployment and Enrichment
  6. Hands-on session

For additional information, the tutorial slides can be consulted here.

The feedback from the participants can be summarised as follows:

  • Even though most of the participants were not directly involved in digitisation projects, they found it very interesting to get an overview of the possibilities that digitisation technology offers.
  • Each digitisation project is different, and given the extensive set of available tools, it is important to evaluate very carefully which tools can improve the digitisation workflow.
  • The participants agreed that it is always advisable to have technical experts at hand who know the field very well and are able to select the right tools for each step of the digitisation workflow.