CLARIN ERIC Exchange Venice-Leuven 2025

Written by Tatiana Tommasi, PhD student

I applied for a CLARIN Mobility Grant to strengthen my PhD work, which is dedicated to the study of the epigraphic tradition through digital technologies. My research focuses in particular on the possibilities offered by layout analysis and Handwritten Text Recognition (HTR) tools for the digitisation of epigraphic prints and manuscripts containing transcriptions of ancient inscriptions. Following the suggestion of one of my supervisors, Prof. Federico Boschetti, I decided to get in touch with the IMPACT Centre of Competence, the CLARIN K-centre in digitisation: an international network, directed by Sally Chambers and headquartered at the University of Alicante, that brings together several member institutions and offers tools for the digitisation of textual documents. Thanks to the CLARIN Mobility Grant funding, I had the opportunity to visit KU Leuven Libraries and the Faculty of Arts in Leuven (Belgium) for one week, from 5 to 9 May 2025. There, Dr. Nele Gabriëls from the library’s Digitisation Department introduced me to professors and researchers connected to Digital Humanities with whom I could exchange ideas (Fig. 1).


Fig. 1: University Library in Leuven

On the first day of my visit, I met with Nele and Prof. Gustavo Candela (University of Alicante and IMPACT) and explained the subject and objectives of my PhD project to them. The meeting helped me to identify two main research areas on which to focus during my stay at KU Leuven: (1) layout analysis and HTR technologies applied to epigraphic documents, both printed and handwritten (Fig. 2); (2) data modelling and analysis of epigraphic metadata, and their enrichment and publication as Linked Open Data (LOD).


Fig. 2: Layout analysis and HTR applied to an epigraphic print (L. A. Muratori, Novus thesaurus veterum inscriptionum, 1739) through eScriptorium

Regarding the first topic, the meetings with Prof. Margherita Fantoli, PhD student Laura Soffiantini and Dr. Maria Mihaela Truşcǎ, who is collaborating on the HTR section of the STUDIUM.AI project, were particularly useful. First of all, I had the opportunity to learn more about STUDIUM.AI, a project dedicated to the study of handwritten notes by early modern students of the Old University of Leuven, which applies HTR tools as a first step towards understanding and reconstructing the knowledge network behind these historical documents. We also compared HTR tools and resources and discussed which strategies would be most useful for my PhD project. The discussion confirmed my initial idea of using the application eScriptorium for HTR and layout recognition tasks, because it is compliant with the principles of Open Science (Fig. 2). After these productive exchanges, I also decided to try YOLO models, which are based on an object detection algorithm, for the layout analysis of printed and handwritten epigraphic documents. The meeting with Dr. Tom Gheldof, an expert in Digital Epigraphy, was also of great importance to me. His suggestions helped me to define the steps for creating a digital edition of epigraphic manuscripts, starting from the possibilities offered by the TEI (Text Encoding Initiative) XML standards. Moreover, on the second day of my visit, thanks to the staff of the KU Leuven Libraries’ Digitisation Department, I had the opportunity to visit the Imaging Lab, which is dedicated to the digitisation of heritage documents with advanced technologies (Fig. 3).


Fig. 3: Imaging Lab

The time spent in Leuven was also useful for learning more about data modelling and about the analysis, publication and reuse of metadata. On the data modelling side, the meeting with Professor Mark Depauw was of particular interest, as it allowed me to understand the complex relational database structure behind the research platform Trismegistos.

Furthermore, I had the opportunity to attend a presentation and a practical workshop, organised by Gustavo Candela and Nele Gabriëls thanks to the IMPACT collaboration (Fig. 4). These events, based on Gustavo’s experience in managing digital library metadata at the Biblioteca Virtual Miguel de Cervantes and at the National Library of Scotland, were dedicated to defining a workflow for publishing datasets from cultural heritage and GLAM (Galleries, Libraries, Archives, Museums) institutions as machine-readable data. Specific attention was given to the enrichment of data through external resources (such as Wikidata), to their visualisation and to their publication as LOD. The practical part of the workshop was particularly useful. It was supported by a GitHub repository (GitHub Leuven Collections as data) with resources, bibliographical references and exercises written as Jupyter notebooks, using datasets created and maintained by KU Leuven Libraries. These events introduced me to new technologies, which I will be able to apply to the epigraphic datasets I am creating for my PhD research. Adopting this method will allow me to enrich the data collected, for example through external resources such as Pleiades, the gazetteer of ancient places.
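To give an idea of what such an enrichment step might look like, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the records, the field names (`findspot`, `place_id`) and the gazetteer identifiers are invented placeholders, not real Pleiades data, and the lookup is a simple local dictionary rather than a call to any actual service.

```python
# Illustrative sketch only: records, field names and identifiers are invented.
from typing import Optional

# Hypothetical local extract of a gazetteer: ancient place name -> identifier.
# These are placeholder identifiers, NOT real Pleiades IDs.
GAZETTEER = {
    "aquileia": "place:0001",
    "altinum": "place:0002",
}


def normalise(name: str) -> str:
    """Trim and lower-case a place name so that lookups tolerate small variations."""
    return name.strip().lower()


def enrich(record: dict) -> dict:
    """Return a copy of the record with a 'place_id' field (None if no match)."""
    place_id: Optional[str] = GAZETTEER.get(normalise(record.get("findspot", "")))
    return {**record, "place_id": place_id}


# Invented sample records standing in for an epigraphic dataset.
records = [
    {"id": "insc-1", "findspot": " Aquileia "},
    {"id": "insc-2", "findspot": "Unknown"},
]

enriched = [enrich(r) for r in records]
for row in enriched:
    print(row)
```

Records without a match keep `place_id` set to `None`, so they can be reviewed by hand later instead of being silently dropped.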

Together with the host institution and the IMPACT centre, we are planning to write an article on the application of Jupyter notebooks to the data of my PhD project, following the workflow defined during the workshop, which is built on four main steps: exploration, extraction, transformation and reuse of data.
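The four steps can be sketched in a few lines of Python, of the kind that could sit in a notebook cell. This is only a rough illustration, not the workflow as it was implemented in the workshop: the tiny CSV sample, its column names and the helper functions `explore`, `extract`, `transform` and `reuse` are all assumptions made for the example.

```python
import csv
import io
import json
from collections import Counter

# Invented sample data standing in for an epigraphic dataset.
RAW_CSV = """id,source,year,text
1,Muratori 1739,1739,D M S
2,Muratori 1739,1739,
3,Manuscript A,,IMP CAESAR
"""


def explore(rows):
    """Exploration: count the rows and the empty values per column."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if not value:
                missing[field] += 1
    return {"rows": len(rows), "missing": dict(missing)}


def extract(rows):
    """Extraction: keep only the records that have a transcription."""
    return [row for row in rows if row["text"]]


def transform(rows):
    """Transformation: type the year and split the text into tokens."""
    return [
        {
            "id": row["id"],
            "source": row["source"],
            "year": int(row["year"]) if row["year"] else None,
            "tokens": row["text"].split(),
        }
        for row in rows
    ]


def reuse(rows):
    """Reuse: serialise the result as JSON so other tools can read it."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
summary = explore(rows)
selected = extract(rows)
transformed = transform(selected)
exported = reuse(transformed)

print(summary)
print(exported)
```

Keeping each step as a separate function mirrors the structure of the workflow and makes it easy to replace one step, for example the export format, without touching the others.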


Fig. 4: Conference on “Collections as Data” by Gustavo Candela

In conclusion, the visit to KU Leuven was a significant experience for me, both personally and professionally. I had the opportunity to present my PhD project to several researchers and professors, and to receive valuable suggestions from them. Furthermore, during this week I learned to use new resources and technologies (for example, Jupyter notebooks) and was able to look at the data I am working on from different points of view. I am very grateful for this experience, and I am sure that it will have a positive impact on the next steps of my PhD research.