[cs_content][cs_section bg_color=”hsla(168, 76%, 42%, 0.06)” parallax=”false” separator_top_type=”none” separator_top_height=”50px” separator_top_angle_point=”50″ separator_bottom_type=”none” separator_bottom_height=”50px” separator_bottom_angle_point=”50″ style=”margin: 0px;padding: 10px 0px;”][cs_row inner_container=”true” marginless_columns=”false” style=”margin: 0px auto;padding: 0px;”][cs_column fade=”false” fade_animation=”in” fade_animation_offset=”45px” fade_duration=”750″ type=”2/3″ style=”padding: 0px;”][x_custom_headline level=”h1″ looks_like=”h2″ accent=”false” style=”margin-top:1em;”]Word Spotting[/x_custom_headline][cs_text]

Produced by: National Center for Scientific Research (NCSR) “Demokritos”

[/cs_text][/cs_column][cs_column fade=”false” fade_animation=”in” fade_animation_offset=”45px” fade_duration=”750″ type=”1/3″ class=”search-other-tool” style=”padding: 0px;”][x_gap size=”10px”][x_custom_headline level=”h2″ looks_like=”h4″ accent=”false” class=”quitar-margen”]Compare with similar tools:[/x_custom_headline][cs_text]

  • Group: [x_button shape=”rounded” size=”small” float=”none” info=”none” info_place=”top” info_trigger=”hover” href=”/tools-resources/tools-for-text-digitisation/?query=&search-filter-group=text+recognition&search-filter-type=&search-filter-subtype=”][icon type=”search”] Text Recognition[/x_button]

[/cs_text][x_gap size=”90px”][/cs_column][/cs_row][/cs_section][cs_section bg_color=”hsl(0, 0%, 100%)” parallax=”false” separator_top_type=”none” separator_top_height=”50px” separator_top_angle_point=”50″ separator_bottom_type=”none” separator_bottom_height=”50px” separator_bottom_angle_point=”50″ style=”margin: 0px;padding: 0 0px 25px;”][cs_row inner_container=”true” marginless_columns=”false” style=”margin: 0 auto 0px;padding: 0px;”][cs_column fade=”false” fade_animation=”in” fade_animation_offset=”45px” fade_duration=”750″ type=”2/3″ style=”padding: 0px;”][cs_text][tabby title=”Scenario”]
This tool provides an integrated GUI for indexing historical documents without an OCR engine. It allows searching the database for instances of a query keyword using three different methods:

  1. Select the query from a predefined list of keywords.
  2. Define the query by an example.
  3. Type the query as text.

[/cs_text][cs_text][tabby title=”Abstract”]
Historical printed documents contain a vast amount of valuable information. Robust indexing of these documents is essential for quick and efficient use of historical collections. Traditionally, this indexing has been done by means of OCR.

OCR produces its best results on well-printed, modern documents. Historical documents, however, exhibit a range of defects that can reduce recognition accuracy: poor paper quality, poor typesetting, damage or degradation of the original paper, and text skew or warping due to age or humidity.

The IMPACT Word Spotting tool represents a new approach to overcoming these difficulties. It works by segmenting documents into individual words and compiling a list of the most common words (keywords) in the text. Users can then search the indexed collection for a keyword using one of three methods:

  • by using a predefined keywords list
  • by providing an image example as a query
  • by typing the query as plain text
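
As a rough illustration of the query-by-example idea, the sketch below ranks segmented word images by their similarity to a query image. It is not the tool's actual method: the feature (a column-wise ink profile), the Euclidean distance, and the names `projection_profile` and `rank_by_example` are all assumptions chosen for brevity, and word images are assumed to be small binary grids (1 = ink).

```python
import math


def projection_profile(word_image, bins=8):
    """Hypothetical feature: fraction of ink per column, resampled to `bins` values."""
    height = len(word_image)
    width = len(word_image[0])
    # Count ink pixels in every column of the binary image.
    columns = [sum(row[x] for row in word_image) for x in range(width)]
    profile = []
    for b in range(bins):
        start = b * width // bins
        end = max(start + 1, (b + 1) * width // bins)
        chunk = columns[start:end]
        profile.append(sum(chunk) / (len(chunk) * height))
    return profile


def rank_by_example(query_image, indexed_words, top_k=3):
    """Return the ids of the `top_k` indexed word images closest to the query."""
    query = projection_profile(query_image)
    scored = [
        (math.dist(query, projection_profile(image)), word_id)
        for word_id, image in indexed_words
    ]
    scored.sort()  # smallest distance first
    return [word_id for _, word_id in scored[:top_k]]


# Toy index: three segmented "word images" with made-up ids.
index = [
    ("page1_word1", [[1, 1, 0, 0], [1, 1, 0, 0]]),
    ("page1_word2", [[0, 0, 1, 1], [0, 0, 1, 1]]),
    ("page2_word7", [[1, 1, 0, 0], [1, 0, 0, 0]]),
]
query = [[1, 1, 0, 0], [1, 1, 0, 0]]
nearest = rank_by_example(query, index, top_k=2)
```

No transcription is involved at any point: matching happens purely on image features, which is what lets this style of search work where OCR output is unreliable.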

The application provides full functionality for the organisation, management and visualisation of a complete document collection.

Figure 1 - Nearest estimated words from specified keyword

Figure 2 - Query by example

[/cs_text][cs_text][tabby title=”Publications”]

[tabbyending][/cs_text][/cs_column][cs_column fade=”false” fade_animation=”in” fade_animation_offset=”45px” fade_duration=”750″ type=”1/3″ style=”padding: 10px;border-style: solid;border-width: 0;border-color: hsl(0, 0%, 100%);”][x_custom_headline level=”h2″ looks_like=”h4″ accent=”false”]Availability[/x_custom_headline][cs_text]For information on licensing, please contact the NCSR IMPACT group.[/cs_text][x_creative_cta padding=”25px 25px 25px 25px” text=”OCR Post-correction and Enrichment” font_size=”26px” icon=”cog” icon_size=”48px” animation=”slide-top” link=”/tools-resources/tools-for-text-digitisation/ocr-post-correction-and-enrichment/” color=”” bg_color=”hsl(168, 76%, 42%)” bg_color_hover=”hsl(0, 0%, 20%)” style=”margin-top:4em;”][/cs_column][/cs_row][/cs_section][/cs_content]