September 25, 2006

Comparing Real-Real, Simulated-Simulated, and Simulated-Real Spoken Dialogue Corpora

Speaker: Hua Ai

Purpose: Prelim Exam

Abstract: User simulation is used to generate large corpora so that reinforcement learning can automatically learn the best policy for spoken dialogue systems. Although this approach is becoming increasingly popular, the differences between simulated and real corpora are not well studied. We build two simulation models to interact with an intelligent tutoring system. Each model is trained separately on two different real corpora. We use several evaluation measures proposed in previous research to compare our two simulated corpora with each other, the two original real corpora with each other, and the simulated corpora with the real ones. We then examine the differentiating power of these measures. Our results show that although these simple statistical measures can distinguish real corpora from simulated ones, they do not allow us to draw a conclusion about the “reality” of the simulated corpora, since even two real corpora can be very different when evaluated on the same measures.

Posted by nlplab at 11:49 AM

September 22, 2006

[TALK] Learning to Show You're Listening: A Trainer for Back-Channeling in Arabic

Nigel G. Ward, Yaffa Al Bayyari, Rafael Escalante, Thamar Solorio

University of Texas at El Paso

12 noon, 5317 Sennott Square (ISP Forum)

Abstract: Good listeners generally produce back-channel feedback, and do so in a
language-appropriate way. Second language learners often lack this
skill. We present a training sequence which enables learners to
acquire a basic Arabic back-channel skill, namely, that of producing
feedback immediately after the speaker produces a sharp pitch
downslope. This training sequence includes an explanation, audio
examples, the use of visual signals to highlight occurrences of the
pitch downslope, auditory and visual feedback on learners' attempts to
produce the cue themselves, and feedback on the learners' performance
as they play the role of an attentive listener in response to one side
of a pre-recorded dialog. Preliminary experiments suggest that this
allows some learners to acquire this behavior.

The talk will also touch on the role of back-channels in various types
of dialog, methods for the discovery and quantification of
dialog-relevant prosodic cues, potential cross-cultural
misunderstandings of prosodic signals, the interplay between
meta-communication and the communication of content, and ways to
quantify the value of good turn-taking relative to other dialog skills.

Posted by nlplab at 12:20 PM

September 15, 2006

[TALK] Cognitive Load and Spoken Interface Design: Comparing Natural and Standardized Approaches to the Generation of Referring Expressions

Speaker: Ellen Campana, Arizona State / University of Rochester

12 noon, 5317 Sennott Square (ISP Forum)

Abstract: Human language capabilities are both context-dependent and flexible. On the one hand, psycholinguistic evidence suggests that listeners naturally and rapidly integrate elements of the visual, discourse-level, and social context with incoming speech, using these elements to improve the speed at which they identify the intended referents of referring expressions. On the other hand, research in human-human interaction has also shown that listeners are flexible in their use of language in that they are able to adapt to speaker-dependent patterns, and that they are capable of establishing and using new referring expressions and sub-languages for specific domains.

In the spoken language interface literature, these two sets of findings have been used to support two different approaches to interface design, which I call the "natural" approach and the "standardized" approach. The natural approach argues that in order to be easy to use, such interfaces should approximate human-human interaction as closely as possible, including context-dependent generation and understanding of referring expressions. The standardized approach argues that systems should instead take advantage of human abilities to learn and adapt while minimizing computational complexity. Thus, users should be exposed to and use consistent, non-context-dependent referring expressions so that the systems will be easier to learn.

There is little direct empirical evidence examining which of these design approaches results in less cognitive load on the part of system users. In this talk I will describe the results from my research applying a classic tool of cognitive psychology, the dual-task paradigm, to spoken interface evaluation with the goal of comparing the two approaches directly. Specifically, I examine natural and standardized design approaches with respect to the role of discourse context in user comprehension / system generation of referring expressions.

Speaker Bio: Ellen Campana is a Lecturer ABD at Arizona State University, in the Arts, Media, and Engineering Program and the Psychology Department. She is also currently a candidate for a joint Ph.D. in Brain & Cognitive Sciences and Computer Science at the University of Rochester. She holds a B.S. in Computer Science and a B.S. in Psychology from the University of Wisconsin-Madison, and an M.A. in Brain and Cognitive Sciences from the University of Rochester.

Posted by nlplab at 05:01 PM

September 11, 2006

Interspeech Practice Talk

Building an English-Iraqi Arabic Machine Translation System for Spoken Utterances with Limited Resources.

By: Behrang Mohit
This is joint work with Jason Riesa, Kevin Knight, and Daniel Marcu.
The paper can be found here.

Posted by behrang at 01:32 PM