Thesis defense

Language acquisition in brains and deep learning algorithms

Speaker(s)
Linnea Evanson
Practical information

Date: 06 December 2024, 4pm
Place: Amphithéâtre Louis Liard, 17 rue de la Sorbonne, 75006 Paris

LSCP

Jury:

Laura Gwilliams – Stanford University, reviewer
Adina Williams – Meta AI New York, reviewer
Ghislaine Dehaene-Lambertz – Paris-Saclay University, examiner
Alejandrina Cristia – École Normale Supérieure, examiner

Jean-Rémi King – École Normale Supérieure, thesis co-supervisor
Pierre Bourdillon – HFAR, thesis supervisor


Abstract:


Children are able to acquire language with seemingly no effort. They transform from incoherent babblers at birth to understanding their native language, largely without being taught. What neural representations underlie this fundamentally human ability during development? In this thesis, I investigate the neural representations of language in children from two years old up to adulthood, using deep language algorithms as models.

To this end, I collect the first intracranial dataset of children listening to natural language. The dataset includes over 5,000 electrodes from 19 children (2-11 years old) and 13 adolescents and adults (12-33 years old). These signals, with high spatial and temporal resolution, offer a unique window into the developing brain and can be effectively modelled by deep language algorithms, which I train on developmentally plausible quantities of data.

First, our modelling results show that the behavioural trajectories of text models are similar to those of children. Second, encoding and decoding analyses of stereo-EEG (sEEG) signals reveal a remarkable similarity in the neural hierarchy of language between children and adults. Beyond formal linguistic properties, we show that a deep neural network trained with self-supervised learning on natural speech sounds effectively accounts for these spatio-temporal dynamics. Third, we introduce a method to track the stages of neural language development by modelling sEEG data across development with 371 training steps of a self-supervised speech model.
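As a concrete illustration of what an encoding analysis involves, the sketch below fits a cross-validated ridge regression from the activations of a deep language model to sEEG responses, and scores each electrode by the correlation between predicted and held-out signal. This is not the exact pipeline used in the thesis; the data, shapes, and variable names are hypothetical placeholders.

```python
# Minimal sketch of an sEEG encoding analysis (illustrative, not the thesis pipeline).
# Assumes model activations and neural responses are already aligned on one time base.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Hypothetical data: per-sample activations of a language model (n_samples x n_features)
# and simultaneously recorded sEEG activity (n_samples x n_electrodes).
n_samples, n_features, n_electrodes = 2000, 128, 64
X = rng.standard_normal((n_samples, n_features))
Y = X @ rng.standard_normal((n_features, n_electrodes)) * 0.1 \
    + rng.standard_normal((n_samples, n_electrodes))

# Cross-validated ridge regression: learn a linear map from model activations to each
# electrode, then score held-out predictions with Pearson correlation per electrode.
scores = np.zeros(n_electrodes)
cv = KFold(n_splits=5, shuffle=False)
for train, test in cv.split(X):
    model = RidgeCV(alphas=np.logspace(-2, 4, 7))
    model.fit(X[train], Y[train])
    pred = model.predict(X[test])
    for e in range(n_electrodes):
        scores[e] += np.corrcoef(pred[:, e], Y[test][:, e])[0, 1] / cv.get_n_splits()

print(f"Mean encoding score across electrodes: {scores.mean():.3f}")
```

In such analyses, electrodes with higher held-out correlations are taken to carry representations better captured by the model; comparing these scores across ages, models, or training steps is one way to relate the developing brain to a learning algorithm.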

Overall, these findings show how modern AI algorithms help model the hierarchy of natural language processing, not only in adults, but also in the developing brain.