September 25, 2006
Comparing Real-Real, Simulated-Simulated, and Simulated-Real Spoken Dialogue Corpora
Speaker: Hua Ai
Purpose: Prelim Exam
Abstract: User simulation is used to generate large corpora so that reinforcement learning can automatically learn the best policy for spoken dialogue systems. Although this approach is becoming increasingly popular, the differences between simulated and real corpora are not well studied. We build two simulation models to interact with an intelligent tutoring system. Each model is trained separately on two different real corpora. We use several evaluation measures proposed in previous research to compare our two simulated corpora with each other, the two original real corpora with each other, and the simulated corpora with the real ones. We then examine the differentiating power of these measures. Our results show that although these simple statistical measures can distinguish real corpora from simulated ones, they cannot support a conclusion about the “reality” of the simulated corpora, since even two real corpora can be very different when evaluated on the same measures.
Posted by nlplab at
11:49 AM
September 22, 2006
[TALK] Learning to Show You're Listening: A Trainer for Back-Channeling in Arabic
Nigel G. Ward, Yaffa Al Bayyari, Rafael Escalante, Thamar Solorio
University of Texas at El Paso
12 noon, 5317 Sennott Square (ISP Forum)
Abstract: Good listeners generally produce back-channel feedback, and do so in a
language-appropriate way. Second language learners often lack this
skill. We present a training sequence which enables learners to
acquire a basic Arabic back-channel skill, namely, that of producing
feedback immediately after the speaker produces a sharp pitch
downslope. This training sequence includes an explanation, audio
examples, the use of visual signals to highlight occurrences of the
pitch downslope, auditory and visual feedback on learners' attempts to
produce the cue themselves, and feedback on the learners' performance
as they play the role of an attentive listener in response to one side
of a pre-recorded dialog. Preliminary experiments suggest that this
allows some learners to acquire this behavior.
The talk will also touch on the role of back-channels in various types
of dialog, methods for the discovery and quantification of
dialog-relevant prosodic cues, potential cross-cultural
misunderstandings of prosodic signals, the interplay between
meta-communication and the communication of content, and ways to
quantify the value of good turn-taking relative to other dialog skills.
Posted by nlplab at
12:20 PM
September 15, 2006
[TALK] Cognitive Load and Spoken Interface Design: Comparing Natural and Standardized Approaches to the Generation of Referring Expressions
Speaker: Ellen Campana, Arizona State / University of Rochester
12 noon, 5317 Sennott Square (ISP Forum)
Human language capabilities are both context-dependent and flexible. On the one hand, psycholinguistic evidence suggests that listeners naturally and rapidly
integrate elements of the visual, discourse-level, and social context with
incoming speech, using these elements to improve the speed at which they
identify the intended referents of referring expressions. On the other hand,
research in human-human interaction has also shown that listeners are flexible
in their use of language in that they are able to adapt to speaker-dependent
patterns, and that they are capable of establishing and using new referring
expressions and sub-languages for specific domains. In the spoken language
interface literature, these two sets of findings have been used to support two
different approaches to interface design, which I call the "natural" approach
and the "standardized" approach. The natural approach argues that in order to be
easy to use, such interfaces should approximate human-human interaction as
closely as possible, including context-dependent generation and understanding of
referring expressions. The standardized approach argues that instead systems
should take advantage of human abilities to learn and adapt while minimizing
computational complexity. Thus, users should be exposed to and use consistent,
non-context-dependent referring expressions so that the systems will be easier
to learn. There is little direct empirical evidence examining which of these
design approaches results in less cognitive load on the part of system users. In
this talk I will describe the results from my research applying a classic tool
of cognitive psychology, the dual-task paradigm, to spoken interface evaluation
with the goal of comparing the two approaches directly. Specifically, I examine
natural and standardized design approaches with respect to the role of discourse
context in user comprehension / system generation of referring expressions.
Speaker Bio
Ellen Campana is a Lecturer ABD at Arizona State University, in the Arts, Media,
and Engineering Program and the Psychology Department. She is also currently a
candidate for a joint Ph.D. in Brain & Cognitive Sciences and Computer Science
at the University of Rochester. She holds a B.S. in Computer Science and a B.S.
in Psychology from the University of Wisconsin-Madison, and an M.A. in Brain
and Cognitive Sciences from the University of Rochester.
Posted by nlplab at
05:01 PM
September 11, 2006
Interspeech Practice Talk
Building an English-Iraqi Arabic Machine Translation System for Spoken Utterances with Limited Resources
By: Behrang Mohit
This is joint work with Jason Riesa, Kevin Knight, and Daniel Marcu.
The paper can be found here.
Posted by behrang at
01:32 PM