I am an Associate Professor in the Computational Linguistics Group
at the University of Groningen.
I am passionate about the statistical modeling of languages, particularly in a multilingual context,
and my long-term goal is to design robust NLP algorithms that can adapt
to the large variety of linguistic phenomena observed around the world.
Among other things, I work on improving the quality of Machine Translation for challenging language pairs
and on making state-of-the-art NLP models more interpretable.
As a cross-disciplinary research enthusiast, I'm interested in enhancing research on human language processing or language evolution with computational modeling tools.
Last but not least, I enjoy observing, interacting with, and finding daily inspiration in my two daughters and their trilingual minds in the making.
My research was funded by a Veni grant from the Dutch Research Council (NWO) from 2016 to 2021.
Currently, I am involved in two national-consortium projects, both funded by NWO's NWA-ORC initiative:
Interpreting deep learning models for language, speech & music (InDeep)
and
Low Resource Chat-based Conversational Intelligence (LESSEN).
I also supervise two China Scholarship Council (CSC)-funded PhD students working on the simulation of human patterns of language learning and change.
Interested in my work? Also see my Research page.
News
[Jun 2023] Paper accepted at TACL:
"Communication Drives the Emergence of Language Universals in Neural Agents: Evidence from the Word-order/Case-marking Trade-off",
with Yuchen Lian and Tessa Verhoef.
Super proud of my collaboration with Tessa, which started many years ago with the goal of bringing actual language evolution expertise together with actual NLP expertise. It took us years to understand what we wanted to do (and how!), but we're finally in full swing with two well-defined PhD projects.
In this TACL work, we introduce an artificial learning framework (NeLLCom) that can be used to simulate human patterns of language learning and change with neural network learners (also to be presented at ESSLLI's workshop on Internal and External Pressures Shaping Language).
[Jun 2023] First-year PhD student Yuqing Zhang will be presenting her preliminary results on the question "Do neural networks display a dependency length minimization (DLM) principle like humans?" at TABU Dag.
[May 2023]
Demo accepted at ACL:
"Inseq: An interpretability toolkit for sequence generation models",
with Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, and Malvina Nissim.
Inseq is the ultimate library for applying feature attribution to seq-to-seq and other text generation models, brought to you by the awesome InDeep consortium.
[May 2023] Paper accepted at Interspeech:
"Wave to Syntax: Probing spoken language models for syntax",
with Gaofei Shen, Afra Alishahi, and Grzegorz Chrupala.
A fruitful collaboration within the InDeep consortium!
[Mar 2023] Paper accepted at NoDaLiDa:
"Slaapte or Sliep? Extending Neural-Network Simulations of English Past Tense Learning to Dutch and German",
with Xiulin Yang, Jingyan Chen, Arjan van Eerden, and Ahnaf Samin.
This work originated from an NLP course project; very proud of my co-authors!