
This project explores the parallels between human learning and machine learning. We leverage passive human data to improve and evaluate machine learning methods. Specifically, we use human language processing data, such as eye-tracking and electroencephalography (EEG) recordings made during text comprehension, to augment natural language processing (NLP) applications.


One of the main pillars of this work is a large dataset of human cognitive data that enables machine learning studies on (neuro-)psychologically sound data.

We present the Zurich Cognitive Language Processing Corpus (ZuCo), a simultaneous EEG and eye-tracking resource for natural language reading.

This dataset was recorded in two parts, which are publicly available:

ZuCo 1.0
ZuCo 2.0

A collection of other cognitive data sources of human language processing recordings can be found here.


Improving NLP

How can passive human signals be used most effectively in machine learning? We work on systematic studies of using cognitive data to improve a wide range of NLP applications. For example, we enhance a named entity recognition model with eye-tracking features.
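As a rough illustration of what enhancing a model with eye-tracking features can look like at the input level, the sketch below appends normalized per-token gaze features to word vectors before they would be passed to a tagger. It is a minimal sketch under stated assumptions, not the setup used in the project's papers: the feature names, the toy embeddings, the gaze values, and the helper functions are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical per-token gaze features; names and values are illustrative only.
GAZE_FEATURES = ("first_fixation_ms", "total_reading_ms", "n_fixations")


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    # Fall back to a divisor of 1.0 when a column is constant.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def augment_with_gaze(embeddings, gaze_rows):
    """Append normalized gaze features to each token's embedding vector."""
    if len(embeddings) != len(gaze_rows):
        raise ValueError("need exactly one gaze row per token")
    normalized = zscore_columns(gaze_rows)
    return [list(emb) + gaze for emb, gaze in zip(embeddings, normalized)]


tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
# Toy 2-dimensional word vectors standing in for real embeddings.
embeddings = [[0.1 * i, -0.2 * i] for i in range(len(tokens))]
# Made-up gaze measurements, one row per token, ordered as GAZE_FEATURES.
gaze = [
    [210.0, 420.0, 2.0],
    [230.0, 390.0, 2.0],
    [120.0, 120.0, 1.0],
    [180.0, 200.0, 1.0],
    [90.0, 90.0, 1.0],
    [250.0, 510.0, 3.0],
]

features = augment_with_gaze(embeddings, gaze)
print(len(features), "tokens,", len(features[0]), "features per token")
```

Simple concatenation keeps the downstream model unchanged apart from its input dimension, which is why it is a common first baseline for this kind of feature augmentation.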


CogniVal is a framework for cognitive word embedding evaluation. How well can state-of-the-art word representations predict brain activation?

Try CogniVal
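The evaluation idea, asking how well word representations predict a cognitive signal, can be sketched with a small regression experiment. The example below is an assumption-laden toy, not CogniVal's implementation or API: it uses synthetic data, a plain linear model trained by gradient descent, and a random-vector baseline chosen here for illustration. All function and variable names are hypothetical.

```python
import random


def predict(w, b, x):
    """Linear prediction w.x + b for one embedding vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fit_linear(xs, ys, lr=0.05, epochs=500):
    """Fit a linear model by full-batch gradient descent on squared error."""
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = predict(w, b, x) - y
            grad_b += 2 * err / n
            for j in range(dim):
                grad_w[j] += 2 * err * x[j] / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def heldout_mse(vectors, signal, n_train):
    """Train on the first n_train words, report error on the remaining ones."""
    w, b = fit_linear(vectors[:n_train], signal[:n_train])
    test = list(zip(vectors[n_train:], signal[n_train:]))
    return sum((predict(w, b, x) - y) ** 2 for x, y in test) / len(test)


rng = random.Random(0)
dim, n_words, n_train = 5, 200, 150

# Synthetic "embeddings" and a synthetic cognitive signal that depends on them.
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_words)]
true_w = [1.0, -2.0, 0.5, 0.0, 1.5]
signal = [predict(true_w, 0.0, e) + rng.gauss(0, 0.1) for e in embeddings]

# Baseline (an assumption of this sketch): random vectors of the same shape.
random_vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_words)]

mse_embeddings = heldout_mse(embeddings, signal, n_train)
mse_random = heldout_mse(random_vectors, signal, n_train)
print(f"embeddings: {mse_embeddings:.3f}  random baseline: {mse_random:.3f}")
```

If the embeddings carry information about the signal, their held-out error should fall clearly below that of the random baseline; with real recordings the gap would be far smaller than in this synthetic setup.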




Dr. Nora Hollenstein (former member of DS3Lab), Institute for Computer Linguistics, University of Zurich


This project has been possible only with the great help of these international collaborators:

Prof. Nicolas Langer, University of Zurich

Dr. Maria Barrett, University of Copenhagen

Prof. Lisa Beinborn, Vrije Universiteit Amsterdam






Publications

Nora Hollenstein, Marius Troendle, Ce Zhang, Nicolas Langer. ZuCo 2.0: A Dataset of Physiological Recordings During Natural Reading and Annotation. LREC 2020.

Nora Hollenstein, Maria Barrett, Lisa Beinborn. Towards best practices for leveraging human language processing signals for natural language processing. LiNCR 2020.

Christian Pfeiffer, Nora Hollenstein, Ce Zhang, Nicolas Langer. Neural dynamics of sentiment processing during naturalistic sentence reading. NeuroImage 2020.

Nora Hollenstein, Antonio de la Torre, Nicolas Langer, Ce Zhang. CogniVal: A Framework for Cognitive Word Embedding Evaluation. CoNLL 2019.

Nora Hollenstein, Maria Barrett, Marius Troendle, Francesco Bigiolli, Nicolas Langer, Ce Zhang. Advancing NLP with Cognitive Language Processing Signals. preprint arXiv:1904.02682 2019.

Nora Hollenstein, Ce Zhang. Entity Recognition at First Sight: Improving NER with Eye Movement Information. NAACL 2019.

Nora Hollenstein, Jonathan Rotsztejn, Marius Troendle, Andrea Pedroni, Ce Zhang, Nicolas Langer. ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading. Scientific Data 2019.

Maria Barrett, Joachim Bingel, Nora Hollenstein, Marek Rei, Anders Søgaard. Sequence classification with human attention. CoNLL 2018.
