What’s in a word?

Rafael Carrasco
Discussions, OCR evaluation/quality control

Word error rate is often used as a measure of OCR accuracy. Although words are the relevant unit in information retrieval, the definition of a word in the context of OCR is not as simple as it might appear at first glance. This post compiles some of my ideas on the relation between words and characters.
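To make the ambiguity concrete, here is a minimal sketch of how the same OCR output can score very differently depending on what counts as a word. It assumes word error rate is the token-level edit distance divided by the number of reference words, which is one common definition and not necessarily the one used by any particular evaluation tool. The function names, the two tokenizers, and the sample strings are all hypothetical and chosen only for illustration.

```python
import re


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, ocr_output, tokenize):
    """Token edit distance divided by the number of reference tokens."""
    ref_words = tokenize(reference)
    hyp_words = tokenize(ocr_output)
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def split_on_whitespace(text):
    # Assumption: a word is any run of non-space characters,
    # so punctuation and hyphens stay attached to the word.
    return text.split()


def split_on_letters(text):
    # Assumption: a word is a case-folded run of letters or digits,
    # so punctuation and hyphens act as separators.
    return re.findall(r"\w+", text.lower())


reference = "The well-known word, again."
ocr_output = "The well known word again"

print(word_error_rate(reference, ocr_output, split_on_whitespace))  # 1.0
print(word_error_rate(reference, ocr_output, split_on_letters))     # 0.0
```

The OCR output here only lost punctuation and a hyphen, yet under the whitespace definition every word after the first counts as an error, while under the letters-only definition the text is perfect.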