ICDAR2019 Competition on Post-OCR Text Correction: Call for Participation

Impact Centre of Competence ICDAR2019

The ICDAR2019 Competition on Post-OCR Text Correction (POCR) invites researchers from any field applicable to document analysis (e.g. natural language processing, data analysis, text data mining) to test their method(s) for improving/denoising OCR-ed texts on a testbed of more than 20 million characters. Given the noisy OCR of printed text in 10 languages (English, French, German, Finnish, Spanish, Dutch, Czech, Bulgarian, Slovak and Polish), participants may take part in two tasks: detecting and/or correcting OCR errors.

How to participate

All participants are invited to:

  1. Register on the website
  2. Train method(s) on the training dataset
  3. Test method(s) on the testing dataset
  4. Submit results and method descriptions

Challenge tracks

  1. Detection of OCR errors: given the raw OCR-ed text, the participants are asked to provide the position and the length of the suspected errors.
  2. Correction of OCR errors: given the OCR errors in their context, the participants are asked to provide either a single candidate or a ranked list of candidates for correction.
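
To make the two tasks concrete, here is a minimal Python sketch of what their outputs could look like. It is an illustration only, not the competition's baseline or official submission format: the toy lexicon, the function names `detect_errors` and `correct_errors`, and the use of whitespace tokenization with `difflib` similarity are all assumptions made for this example.

```python
import re
from difflib import get_close_matches

# Hypothetical toy lexicon; a real system would rely on far larger resources.
LEXICON = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]

def detect_errors(text, lexicon=LEXICON):
    """Task 1 sketch: return (position, length) of each token missing from the lexicon."""
    known = set(lexicon)
    return [
        (m.start(), len(m.group()))
        for m in re.finditer(r"\S+", text)  # naive whitespace tokenization (assumption)
        if m.group().lower() not in known
    ]

def correct_errors(text, spans, lexicon=LEXICON, n=3):
    """Task 2 sketch: map each flagged span to a ranked list of correction candidates."""
    return {
        (pos, length): get_close_matches(
            text[pos:pos + length].lower(), lexicon, n=n, cutoff=0.5
        )
        for pos, length in spans
    }

ocr_text = "The qu1ck brovvn fox"
spans = detect_errors(ocr_text)          # spans == [(4, 5), (10, 6)]
candidates = correct_errors(ocr_text, spans)
for (pos, length), ranked in candidates.items():
    print(pos, length, ocr_text[pos:pos + length], ranked)
```

Each detected error is reported as a character offset plus a length, and each correction is a list ordered from most to least similar, mirroring the "position and length" and "ranked list of candidates" wording of the two tasks.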

Important dates

  • 1st February 2019: Registration open
  • mid-February March 2019: Training set sent to participants
  • 30 March 2019: Registration deadline
  • 24 Apr. 2019: Testing set sent to participants
  • 26 Apr. 2019: Result submission to the organizers
  • 21 Sept. 2019: Results notification

For further information, please visit https://sites.google.com/view/icdar2019-postcorrectionocr/home?authuser=0
