The LEGOv2 database is a parameterized and annotated version of the CMU Let’s Go database from 2006 and 2007.
This spoken dialogue corpus contains interactions captured from the CMU Let’s Go (LG) System by Carnegie Mellon University in 2006 and 2007. It is based on raw log-files from the LG system.
The corpus has been parameterized and annotated by the Dialogue Systems Group at Ulm University, Germany.
The corpus comes with both MySQL database dumps and CSV files.
The MySQL dump can be imported directly into any MySQL database. The CSV files can be imported into, e.g., Excel, Matlab, R, Weka, SPSS, and SQL databases other than MySQL.
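As an illustration of importing the CSV files into an SQL database other than MySQL, the following minimal sketch loads a CSV file into an in-memory SQLite database using only the Python standard library. It is not part of the corpus distribution. It assumes comma-separated files with a header row; apart from callid, the column name and all values in the sample data are hypothetical stand-ins, and the helper load_csv_into_sqlite is introduced here for illustration only.

```python
import csv
import io
import sqlite3

# Stand-in for calls.csv. Only `callid` is named in this readme;
# the `outcome` column and all values are hypothetical.
SAMPLE_CALLS_CSV = "callid,outcome\n1001,completed\n1002,not_completed\n"


def load_csv_into_sqlite(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row as text."""
    reader = csv.reader(csv_file)  # assumes comma-separated values
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_csv_into_sqlite(conn, "calls", io.StringIO(SAMPLE_CALLS_CSV))
row_count = conn.execute('SELECT COUNT(*) FROM "calls"').fetchone()[0]
```

For the real files, the io.StringIO object would be replaced by an opened CSV file, e.g. open("calls.csv", newline=""), and the delimiter adjusted if the files turn out not to be comma-separated.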
interactions.csv/interactions-Table
: each line contains a system-user exchange, parameterized with 53 interaction parameters.
Use callid to join with the calls-Table.
Furthermore, the file contains EmotionalState annotations and Interaction Quality annotations, see below. For Interaction Quality, please refer to [Schmitt2011].
calls.csv/calls-Table
: each line contains information affecting the entire call. Primary key: callid
The file contains gender, age and dialogue outcome annotations that can be used as target variables, e.g. to predict task completion.
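The join on callid between the exchange-level and call-level data can be sketched as follows, using only the Python standard library. This is a minimal sketch, not the authors' tooling. Only callid is taken from this readme; the other column names (exchange_no, iq, gender, outcome), the sample values, and the helper join_on_callid are hypothetical and serve illustration only.

```python
import csv
import io

# Stand-ins for interactions.csv and calls.csv. Only `callid` is named in
# this readme; every other column name and all values are hypothetical.
SAMPLE_INTERACTIONS_CSV = (
    "callid,exchange_no,iq\n"
    "1001,1,5\n"
    "1001,2,4\n"
    "1002,1,3\n"
)
SAMPLE_CALLS_CSV = (
    "callid,gender,outcome\n"
    "1001,female,completed\n"
    "1002,male,not_completed\n"
)


def join_on_callid(interactions_file, calls_file):
    """Attach the call-level columns to every system-user exchange."""
    # callid is the primary key of the calls table, so one row per id.
    calls_by_id = {row["callid"]: row for row in csv.DictReader(calls_file)}
    joined = []
    for exchange in csv.DictReader(interactions_file):
        call = calls_by_id.get(exchange["callid"])
        if call is None:
            continue  # skip exchanges without a matching call (inner join)
        joined.append({**call, **exchange})
    return joined


joined_rows = join_on_callid(
    io.StringIO(SAMPLE_INTERACTIONS_CSV), io.StringIO(SAMPLE_CALLS_CSV)
)
```

Each element of joined_rows is one exchange with the call-level annotations repeated on it, which is the shape needed when a call-level annotation serves as the target variable for exchange-level features.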
acoustics.csv/acoustics-Table
: each line contains basic acoustic and prosodic features extracted on the full utterance. Extraction has been done with the Praat software; see [Schmitt2009] for details.

The LEGO Spoken Dialogue Corpus has the following directory structure:
license.txt -> license file
readme.txt -> this file
interaction_parameters.pdf -> Description of interaction parameters
|
|---- audio -> wav files with user utterances and full recordings
|
|---- corpus
|
|---- csv -> CSV-files with interactions.csv, acoustics.csv and calls.csv
|---- mysql -> mysql dump
Interaction Quality [Schmitt2011]:
Annotation Scheme:
Emotional State:
Annotation Scheme: friendly, neutral, slightly angry, angry, very angry
To cite the corpus, please use the following two publications:
[Schmitt2012]
A. Schmitt, S. Ultes and W. Minker
A Parameterized and Annotated Spoken Dialog Corpus of the CMU Let's Go Bus Information System
International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, pp. 3369--3373, May 2012
[Ultes2015]
S. Ultes, A. Schmitt, M. J. Platero Sánchez and W. Minker
Analysis of an Extended Interaction Quality Corpus
International Workshop On Spoken Dialogue Systems (IWSDS), Busan, Korea, January 2015
submitted
References:
[Eskenazi2008] Maxine Eskenazi, Alan W Black, Antoine Raux, and Brian Langner
Let’s Go Lab: a platform for evaluation of spoken dialog systems with real world users
in: Proceedings of Interspeech 2008 Conference, Brisbane, Australia
[Schmitt2011] Alexander Schmitt, Benjamin Schatz and Wolfgang Minker,
Modeling and Predicting Quality in Spoken Human-Computer Interaction,
in: Proceedings of the SIGDIAL 2011 Conference,
Association for Computational Linguistics, 2011
[Schmitt2009]
A. Schmitt, T. Heinroth and J. Liscombe
On NoMatchs, NoInputs and BargeIns: Do Non-Acoustic Features Support Anger Detection?
Proceedings of the SIGDIAL 2009 Conference, Association for Computational Linguistics, London, UK, pp. 128--131, 2009