Fourteen works of Spanish literature and a dictionary (in six volumes) were selected for the IMPACT Demonstrator dataset. Most of the books date from the sixteenth or seventeenth century, the period known as the Spanish Golden Age. They are mostly literary works, such as religious writings, plays, novels and poetry. Only one book belongs to the eighteenth century, as does the Diccionario de Autoridades. Two of the books are from America: Cartha Athenagorica by Sor Juana Inés de la Cruz and Commentarios reales by Inca Garcilaso de la Vega. They were selected in order to capture the vocabulary of Spanish in Latin America.
In addition to these books, 86 works dating from the late fifteenth century to the seventeenth century were selected from the Biblioteca Virtual Miguel de Cervantes. This selection contains almost 2 million tokens and 90,000 word forms.
The current lexicon consists of 11,846 lemmata, 31,584 word forms and 36,857 lemma/word form combinations.
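The three figures count different things, which is why the number of combinations exceeds the number of word forms: a single word form can be attached to more than one lemma. The following is a minimal sketch of how such counts could be derived, assuming the lexicon is available as a list of (lemma, word form) pairs. The function name `lexicon_counts` and the sample entries are illustrative assumptions and are not taken from the IMPACT lexicon itself.

```python
from typing import Iterable, Tuple


def lexicon_counts(pairs: Iterable[Tuple[str, str]]) -> Tuple[int, int, int]:
    """Return (distinct lemmata, distinct word forms, distinct lemma/word form pairs)."""
    unique_pairs = set(pairs)                       # drop repeated entries
    lemmata = {lemma for lemma, _ in unique_pairs}  # distinct lemmata
    word_forms = {form for _, form in unique_pairs} # distinct word forms
    return len(lemmata), len(word_forms), len(unique_pairs)


# Hypothetical sample entries, for illustration only.
entries = [
    ("venir", "vino"),   # verb form
    ("vino", "vino"),    # noun; same word form, different lemma
    ("hacer", "hazer"),  # historical spelling variant
    ("hacer", "fazer"),  # historical spelling variant
    ("hacer", "hacer"),  # modern spelling
]

n_lemmata, n_word_forms, n_combinations = lexicon_counts(entries)
print(n_lemmata, n_word_forms, n_combinations)
```

In this sample the word form "vino" is listed under two lemmata, so it contributes one word form but two combinations.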