All We Need Is Data: Humanities, Datasets, and Corpora in the Era of Artificial Intelligence

The recent surge in AI development reflects a broader diffusion of disciplines, technologies, and tools made possible by ever more powerful CPUs and, especially, GPUs. These advances have enabled the effective implementation of machine learning models, particularly artificial neural networks, across diverse problem domains. As a result, AI and data science have become increasingly relevant to the humanities.

This talk offers an overview of notable humanities projects that incorporate these technologies, highlighting the critical role of large, diverse datasets and language corpora. It will explore key questions around processing, storing, sharing, and reusing such data. A case study from Italian sociolinguistics and corpus linguistics will serve as a springboard for reflecting on the evolving role of data in contemporary humanities research.

by Francesco BIANCO, Senior Fellow GATES – Assistant Professor at Palacký University Olomouc

June 10 at 5:30 p.m. – Université Grenoble Alpes (France) – Maison de la Création et de l’Innovation (MaCI), Room 208 (2nd floor).