IMPACT Polish GT Corpora
Produced by: University of Warsaw
Abstract
The search engine, made available by the Formal Linguistics Department of the University of Warsaw, supports searching digitized texts in the DjVu format. It is a modification of the Poliqarp system, which was developed at the Institute of Computer Science of the Polish Academy of Sciences and is used to support the National Corpus of Polish, so it has the same query syntax. The modification was implemented by Jakub Wilk, who also converted most of the texts to a suitable format. The idea of using Poliqarp for DjVu texts was developed by Janusz S. Bień. It was presented in the paper “Facilitating access to digitalized dictionaries” and later in other publications, including “Efficient search in hidden text of large DjVu documents”.
Publications
- Bień, Janusz S. (2012) Delivering the IMPACT project Polish Ground-Truth texts with Poliqarp for DjVu. Technical Report. Katedra Lingwistyki Formalnej UW. (Unpublished)
Availability
The search engine can be found at: