This tool detects and removes noisy black borders as well as noisy text regions. Moreover, it detects the optimal page frames of double page document images.
A scanned document will tend to contain graphical information that a researcher will not need, in particular the blank areas around the text body. The border removal process aims at enhancing document images by automatically detecting and cutting out noisy black borders as well as noisy text regions from neighbouring pages.
The method is based on projection profiles combined with a connected component labeling process; signal cross-correlation is then used to verify the detected text areas.
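To make the projection-profile idea concrete, here is a minimal sketch in Python with NumPy. It is not the NCSR implementation: it covers only the projection-profile step and leaves out connected component labeling and cross-correlation. The function names (`projection_profiles`, `detect_content_box`) and the 0.9 density threshold are assumptions chosen for illustration. The page is assumed to be already binarised, with `True` marking black pixels.

```python
import numpy as np


def projection_profiles(binary):
    """Return (row_profile, col_profile): fraction of black pixels per row / column."""
    binary = np.asarray(binary, dtype=bool)
    return binary.mean(axis=1), binary.mean(axis=0)


def _border_width(profile, threshold):
    """Count the leading entries of `profile` that are at or above `threshold`."""
    width = 0
    for value in profile:
        if value < threshold:
            break
        width += 1
    return width


def detect_content_box(binary, threshold=0.9):
    """Estimate (top, bottom, left, right) of the area inside dense black borders.

    `bottom` and `right` are exclusive. The threshold is a hypothetical value:
    a row or column counts as border when at least 90% of its pixels are black.
    """
    rows, cols = projection_profiles(binary)
    top = _border_width(rows, threshold)
    bottom = len(rows) - _border_width(rows[::-1], threshold)
    left = _border_width(cols, threshold)
    right = len(cols) - _border_width(cols[::-1], threshold)
    if top >= bottom or left >= right:
        # The whole image looks like border; report an empty box.
        return 0, 0, 0, 0
    return top, bottom, left, right


# Synthetic 20x30 page: a 3-pixel black border and one short "text line".
page = np.zeros((20, 30), dtype=bool)
page[:3, :] = True
page[-3:, :] = True
page[:, :3] = True
page[:, -3:] = True
page[8, 10:20] = True

top, bottom, left, right = detect_content_box(page)
cropped = page[top:bottom, left:right]
print((top, bottom, left, right))  # (3, 17, 3, 27)
```

Real scans have ragged, skewed, or broken borders, which is why a single profile threshold is not enough on its own and why the additional verification steps matter.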
Removing the unneeded border areas (a step commonly known as border removal) makes scanned texts easier to read on a screen and also reduces the overall file size, so the images are easier for an institution to store and quicker to transfer remotely to a researcher.
The following screencast introduces the theory behind border removal, as well as the IMPACT border removal toolkit: a modular and adaptable programme.
- IMPACT deliverable D-TR1: Image Enhancement Toolkit (December 2011)
- Stamatopoulos, N., B. Gatos and T. Georgiou, “Page Frame Detection for Double Page Document Images”. DAS2010 Conference (9-11 June, Cambridge, USA)
- Gatos, B. IMPACT Tools Developed by NCSR. IMPACT Final Conference 2011, 24-25 October, London, UK
- Willms, L. An Introduction to the IMPACT Border Removal Tools developed by NCSR
For information on licensing, please contact the NCSR IMPACT group.
- NCSR Border Detection and Removal at the IMPACT Demonstrator Platform