Tools & Resources

The IMPACT Centre of Competence provides access to a remarkable collection of tools and resources for the digitisation of historical texts. Some of the tools can be tested online in our Demonstrator Platform. The resources consist of historical lexica and an image and ground-truth dataset for 10 different languages.

Tools for text digitisation

An overview of more than 250 state-of-the-art tools for text digitisation. These tools can be filtered according to their purpose. The main groups into which these tools are classified are:

Search tool

How can I add more tools?

Registered users can add new tools through a simple form.

Language resources

The various language institutes in the IMPACT project built lexica for historical languages. The aim is to improve OCR results for historical text, and also to ensure that users find the historical variants of a word when searching for its modern-day form.

The IMPACT project built lexica for ten historical languages. It also built special lexica for named entities (proper names of, for example, places and people) in three languages.

Language Resources

How can I download these resources?

To access these lexica, you only need to register at the Impact Centre.

Image and Ground Truth resources

The Impact Centre of Competence dataset contains more than half a million representative text-based images compiled by a number of major European libraries. Covering texts from as early as 1500, and containing material from newspapers, books, pamphlets and typewritten notes, the dataset is an invaluable resource for future research into imaging technology, OCR and language enrichment.

Impact Dataset

How can I access the Impact Dataset?

To access this dataset, you only need to register at the Impact Centre.

Stay informed of all our news through our Newsletter.