Use of digitised and OCRed text collections by end users

Impact CoC, 7 May 2010. Discussions

Geneviève Cron of the Bibliothèque nationale de France (BnF) begins by discussing the BnF’s digital library, Gallica. A million documents have been digitised since 1992, with OCR as standard since 2005. OCR accuracy for newspapers is 98% at word level, but results vary widely, from 60% upwards. For books, average accuracy is 90%.