Automatic Assessment of Child Language and Adult L2 Acquisition with Neural Language Models - 2025-26 Distinguished Computational Linguistics Lecture
Kenji Sagae, Ph.D.
Professor and Chair, Department of Linguistics, University of California, Davis
Abstract: When assessing language development, one typically faces a choice between easily computable but coarse-grained metrics, which focus on superficial characteristics and are broadly applicable to a variety of languages, and more expressive metrics tailored specifically to the grammar of a target language. In the first part of this talk, I will discuss recent work on automatic assessment of language development that uses small, lightweight neural language models and produces results comparable to those achieved with established language assessment metrics based on language-specific information carefully designed by experts. Unlike existing sophisticated metrics, this approach is fully data-driven and can be applied in the same way to different languages without the need for linguistic expertise. I will present an evaluation scheme that makes it possible to compare this approach directly to previously proposed metrics. Training and evaluation of the language models used in this approach are made possible by the availability of longitudinal child language data in the CHILDES database. In the second part of this talk, I will discuss the application of this general assessment approach to adults learning a new language, including a long-term project on the development of a dataset suitable for training and evaluating models related to language learning in a university setting.
Bio: Kenji Sagae is a professor of linguistics at UC Davis, with additional affiliations with the computer science graduate program and the cognitive science program. Before joining UC Davis, he co-founded KITT.AI, a language technology startup acquired by Baidu in 2017. He was previously a research scientist at the Institute for Creative Technologies at the University of Southern California, where he held a research faculty appointment in computer science, and a research associate at the University of Tokyo. His PhD at Carnegie Mellon University focused on the application of data-driven natural language parsing to analysis of child language, and his current research involves computational analysis of language structure and its use in linguistic inquiry and in natural language processing applications.
Please submit interpreting requests to myAccess.rit.edu.
Event Snapshot
Who: Open to the Public
Interpreter Requested? No