January 25, 2006

Syntax-Based Named Entity Extraction for English and Arabic

By Behrang Mohit

Abstract: I present a framework to train a named entity (NE) tagger from a limited amount of annotated lexical resources. My approach leverages other available resources, such as syntactic and shallow semantic analyses. These resources help locate potential named entities that can be used to train a tagger with unsupervised approaches. My final goal was to develop the system for Arabic or other languages with limited resources, so I first performed a proof-of-concept study on English. I report experimental results showing a steady boost in classification accuracy when we use the extracted unlabeled data together with a small set of labeled training data. I also report the results of our effort to port the system to Arabic. While the accuracy of the Arabic system is lower than that of the English system, our findings about the effects of different syntactic features hold for both languages.

Posted by behrang at 02:46 PM

January 18, 2006

Jan. 18: Modelling User Satisfaction and Student Learning in a Spoken Dialogue Tutoring System with Generic, Tutoring, and User Affect Parameters

(by Kate Forbes-Riley, Talk for Wed., Jan 18, 12:30) This talk summarizes a paper submission (with Diane Litman). Here is the abstract for that paper: We investigate the use of the PARADISE framework for developing predictive models of system performance in our spoken dialogue tutoring system. We represent system performance using two metrics: user satisfaction, and student learning. We train and test predictive models of these metrics in our tutoring system corpora. We predict user satisfaction with 2 types of parameters: 1) generic system parameters, and 2) tutoring-specific parameters. To predict student learning, we also use a third type: 3) user affect parameters. Though generic parameters are useful predictors of user satisfaction in other PARADISE applications, overall our parameters produce less useful user satisfaction models in our system. However, generic and tutoring-specific parameters do produce useful models of student learning in our system. User affect parameters can increase the usefulness of these models.
Posted by nlplab at 03:54 PM

January 13, 2006

Language Modeling and Its Applications

Eugene Charniak

Brown University

Friday, January 13, 2006
10:30am - SENSQ 5317

Refreshments at 10:00am

Hosted by Jan Wiebe

Abstract

Parsing is the problem of mapping a sentence (in, say, English) to a phrase structure. It is important because it gives us a first rough cut at meaning. During the 1990s there was a flurry of new results using statistical techniques that gave us our first robust parsers ready for everyday use. While results have continued to appear since then, the practical parsers at the start of 2005 were no better than what was available in 2000. The first part of the talk will recap this ancient history.

The last 12 months, however, have seen a dramatic turn-around, with error rates decreasing by 25%. The second and third parts of the talk describe the two techniques responsible for this state of affairs: discriminative reranking and self-training. We also show that the latest results seem to be less corpus-specific than the previous results. (That is, they carry over to text corpora reasonably different from those upon which they were trained.)

Finally, we discuss a new parsing paradigm, coarse-to-fine parsing, and present some starry-eyed proposals for radically different views of parsing.

Posted by behrang at 02:26 AM