Stefan Pletschacher from the University of Salford presented methods for evaluating OCR results. A proper evaluation needs ‘ground truth’, that is, a reference text that is almost 100% correct. A big challenge, however, lies in deciding how much weight the different kinds of errors should carry. Should accuracy be measured per character or per word? Do errors in headings count more than errors in footnotes? How are errors in a page’s structure measured?
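
To make the difference between character and word accuracy concrete, here is a minimal Python sketch. It is not the method presented in the talk; it assumes one common definition, accuracy = 1 - (edit distance / length of the ground truth), and the function names and the sample strings are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - distance / len(ref), clamped at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def char_accuracy(ground_truth, ocr_text):
    # Compare the texts character by character.
    return accuracy(ground_truth, ocr_text)


def word_accuracy(ground_truth, ocr_text):
    # Compare the texts word by word (split on whitespace).
    return accuracy(ground_truth.split(), ocr_text.split())


# Hypothetical sample: two single-character OCR errors.
ground_truth = "The quick brown fox"
ocr_text = "The qu1ck brown f0x"

print(f"character accuracy: {char_accuracy(ground_truth, ocr_text):.3f}")
print(f"word accuracy:      {word_accuracy(ground_truth, ocr_text):.3f}")
```

With this sample, two wrong characters out of 19 give a character accuracy of about 0.895, but because they fall in two different words, word accuracy drops to 0.5. The same OCR output can therefore look good or poor depending on the unit chosen. Weighting errors by region (headings versus footnotes) or scoring page structure would need further assumptions that this sketch does not cover.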
