On the 28th and 29th of January, the CASCADE Network came together once again for what would be our 4th official reunion and first convention. This time the venue was the charming town of Leuven, Belgium. Unfortunately, since we arrived the evening before the start of the convention after a full day of travel from Sheffield, our wanderlust would have to wait.
One and a half years in, the network seems to have developed a tacit routine during these meet-ups, with some of us choosing to use breakfast time to socialize and talk about the upcoming activities while others prefer to catch up on sleep. However, this time, the interactions had an underlying nervousness as ESRs prepared to present their research projects to both peers and supervisors.
Last November was a busy month for me. I had the chance to attend two major events back-to-back: the FCAI AI Day in Espoo and the Nordic AI Meet (NAIM) in Sweden. I gave the same talk at both events: “Detecting Latin in Historical Books with Large Language Models.” This represents the core of my PhD work during my first year. I also presented a poster at NAIM.
It was Tuesday, August 19th, 2025, and we were travelling to the third CASCADE MSCA Doctoral Network training camp. This time around, it was taking us to Helsinki, and the trip to the airport reminded me of how far we would be going (until now, the KU Leuven team had travelled by train for every meeting). What would we learn? What sights would we see? What connections would we strengthen?
Camp Day One: Wednesday, August 20th
Training camp opening session, led by Helsinki’s Mikko Tolonen. Photo by Emer Yip
As everyone slowly started coming down from their hotel rooms, ready for breakfast and our first day of training sessions, I started seeing familiar faces. Over breakfast with a few of my fellow ESRs, we excitedly discussed travel mishaps, current and past projects, and our expectations for the days to come. After eating, we all met in the hotel lobby (ESRs and supervisors alike) for what has now become a tradition: walking together to where the camp would take place. On this occasion, it would be the main building of the University of Helsinki.
There’s a special kind of buzz that comes with a big international conference: the hum of conversations in a dozen different accents, the shuffle of people moving between sessions, and the lively exchanges over coffee in crowded corners. Until this year, I’d never experienced it. I found it for the first time in Lisbon, at the Digital Humanities 2025 conference.
This was no ordinary trip for me. It was my first in-person international conference and my first time presenting at one. On Thursday, July 17, I’d be co-presenting a long paper with my colleague Rachel McCarthy: “Rewriting Tradition: Quantifying Change in Lady Gregory’s Irish Legends.”
The Journey to Lisbon
I set off from Cork on Sunday, July 13, catching an early flight through Amsterdam’s bustling Schiphol Airport. By mid-afternoon, Lisbon was unfolding beneath me, golden in the summer light. The city met me with two kinds of warmth: the sunshine, of course, but also the easy friendliness in the way people smiled and greeted me. The Mercure Lisboa Hotel became my base for the week, a comfortable spot to return to after the conference bustle.
Earlier this June, I had the pleasure of attending the 2025 DARIAH Annual Event in the historic city of Göttingen, Germany. Set in the impressive Göttingen State and University Library (SUB) and Alte Mensa, the conference brought together a vibrant community from the fields of digital humanities, cultural heritage, and data science under the unifying theme: “The Past”. Spanning four dynamic days, the event provided an inspiring forum for exploring how the digital humanities are reshaping our relationship with historical knowledge – from immersive archives and machine learning to educational games and inclusive metadata.
Of course, the primary debate focused on the place and role of artificial intelligence in digital humanities today. Dr Andrea Scharnhorst’s presentation, “Archiving for the Future Past – Multimodality and AI – Challenges and Opportunities”, explored how to enhance digital archival infrastructures with AI-driven solutions. Drawing on open-source infrastructures like the Dataverse project, the team demonstrated how innovative tools like Ghostwriter AI can support more inclusive annotation, metadata enrichment, and access across sensory modalities. Beyond the technical dimension, the presentation emphasized the importance of collaborative knowledge organization and the need to support smaller or vulnerable data communities, ensuring that the diversity of cultural expression is not lost to time – or technology. As Dr Scharnhorst explained, while AI will most certainly change human behaviour, it does not have to erode our humanity.
About two months ago, I had the privilege to attend the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), which took place in Albuquerque, New Mexico, on April 29–May 4. This was my third big international in-person conference and second ACL experience, as I previously went to EAMT 2023 (the 24th Annual Conference of the European Association for Machine Translation) and EACL 2024 (the 18th Conference of the European Chapter of the Association for Computational Linguistics) during my Master’s studies. But it was the first time I would present a paper at such an event, making NAACL 2025 a crucial step in my academic career.
What made it even more meaningful was the fact that my paper was accepted at the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025), co-located with NAACL. Last year, when I attended EACL in Malta, I felt somewhat overwhelmed at the beginning of the conference. As someone coming from a translation background and only starting to learn programming, I found most of the talks too technical and difficult to understand. At the LaTeCH Workshop, however, which took place on the last day of EACL, I felt much more in my element: the topics discussed there focused on the humanities rather than the computational aspect, which aligned more closely with my interests. Little did I know that the workshop organizer, Stefania Degaetano-Ortlieb, would later become my supervisor during my PhD.
Lidia Pivovarova (left), Ke Shu (middle), and Yu Wu (right) at DHH 25 at the University of Helsinki, Finland.
I am a member of the Computational History (COMHIS) research group at the University of Helsinki, and one of CASCADE’s early researchers. Last week, COMHIS successfully organised the Digital Humanities Hackathon 2025 (DHH 25) in Helsinki. Over 10 intense days, 36 participants from across disciplines came together to explore how digital methods can shed new light on humanities questions. As this milestone year marked a decade of DHH, the energy and enthusiasm were especially high.
Four Interdisciplinary Tracks
This year’s hackathon featured four thematic tracks, each tackling a different facet of digital humanities:
ParliaNets: Parliaments beyond Borders. Investigating how debates in one country draw on foreign influences, participants mapped networks of parliamentary speeches and foreign-policy discussions.
Oral History: Digital Presence in Physical Absence. Teams worked with Holocaust survivor testimonies, exploring how digital tools can preserve and analyze stories when the speakers themselves are no longer present.
Rare Earth: Rare Earth & Web Discourses. Focusing on parallel mining approaches, this track combined environmental history with online discourse analysis to trace how “rare earth” minerals enter public conversation.
Early Modern: Economic Bubbles, Consumerism, and the Colonies. This group used the Burney and Nichols newspaper collections to track consumer trends and indicators of economic change in emerging colonial markets.
I had the honour of serving as Team Leader for the Early Modern track—my first time in this role. I was responsible for distributing our datasets and providing all the technical support our team needed.
The CASCADE project has launched a Substack: ‘Language and Technology’!
This Substack offers a window into the research and reflections of CASCADE’s doctoral projects across the five partner universities: University College Cork, University of Helsinki, KU Leuven, Universität des Saarlandes, and the University of Sheffield.
Posts include insights into computational linguistics and text analytics; perspectives on the social, ethical, and cultural implications of AI and natural language processing; commentary on the role of language in the data economy; and interviews, explainers, and research highlights for both scholarly and general audiences.
The first post, written by CASCADE Early Stage Researcher Sofía Aguilar based at USAAR, is now live at:
Pop on over and hit the subscribe button to learn more about the work of our research team!
CASCADE is a collaboration between University College Cork, University of Sheffield, University of Helsinki, Katholieke Universiteit Leuven, and Universität des Saarlandes. It is funded by Horizon Europe under the Marie Skłodowska-Curie Actions (MSCA) Doctoral Networks and the UKRO.
The CASCADE team including guest researchers Haim Dubossarsky and Tanja Säily at USAAR
The second CASCADE Training and Research Camp took place on 12-13 March at the University of Saarland in Saarbrücken, where the full group of 10 ESRs, their PhD supervisors, and a roster of guest researchers came together to learn from, and with, each other at the USAAR Campus Innovation Center. Two packed days of activities included ESR group work and brainstorming sessions, poster presentations, and even a guided tour of 500 years of Saarbrücken history at the SAAR Historical Museum.
The programme included several guest researcher presentations by experts in the fields of Historical Sociolinguistics, Computational Lexicography and Natural Language Processing, as well as an insightful presentation by a former Marie Curie Early Stage Researcher on the highs and lows of the ESR journey. All in all, it was a very well spent two days in Saarbrücken, and the team returned to their respective institutions with heads filled with new information, new ideas and further plans for future collaborations.
Penelope Gia Bao Huu Nguyen (left) and fellow CASCADE doctoral student Maria Jimena Flores Alejo (right).
When everyone at high school was busy choosing majors for their university applications, I was daydreaming about becoming a lexicographer. I loved the idea of working with dictionaries, especially English dictionaries. As the lingua franca and the language of the academic world, it was English that opened so many doors for me, liberated me, and above all, quenched my thirst for knowledge. But mastering a tool is completely different from having it as the subject of your research. By then, I had grown so fond of the language that I wished to study it. But as a non-native speaker, how could I make it happen? That is not to mention that linguistics and lexicography were not taught as majors at the best university in my region. Time went by, and I forgot about the crazy little dream I had.
I have a bachelor’s degree in English Studies from Can Tho University in Vietnam and a master’s degree in Linguistics from Purdue University in the USA, and now I am pursuing a PhD in Digital Humanities at the University of Sheffield in the UK. The name of the degree may fail to capture what I really study, but what I can say is that I’m inching towards what I aspire to explore. Last spring, I even interviewed for the position of linguistic data manager at Cambridge Dictionary. Somehow, I managed to complete almost all the data challenges. The once baseless dream has now become less baseless.