September 28, 2005

Swapna

Subjectivity, Emotions and Prosody - Literature review

Posted by nlplab at 03:33 PM

[TALK] Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis

Theresa Wilson

In this talk, I will present a new approach to phrase-level sentiment analysis that first determines whether an expression is neutral or polar and then disambiguates the polarity of the polar expressions. With this approach, our system is able to automatically identify the contextual polarity for a large subset of sentiment expressions, achieving results that are significantly better than baseline.
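
The two-step decision described in the abstract (neutral vs. polar first, then polarity) can be sketched roughly as below. This is only an illustration of the control flow: the tiny lexicon, the negation window, and the names `is_polar`, `contextual_polarity`, and `classify` are all hypothetical stand-ins, not the features or classifiers used in the actual system.

```python
from typing import List

# Hypothetical prior-polarity lexicon; entries are illustrative only.
PRIOR_POLARITY = {
    "good": "positive",
    "great": "positive",
    "bad": "negative",
    "terrible": "negative",
}
# Hypothetical negation cues (assumption, not from the talk).
NEGATORS = {"not", "never", "no"}


def is_polar(tokens: List[str], index: int) -> bool:
    """Step 1 (toy stand-in): decide whether the word is polar at all."""
    return tokens[index].lower() in PRIOR_POLARITY


def contextual_polarity(tokens: List[str], index: int) -> str:
    """Step 2 (toy stand-in): flip the prior polarity after a nearby negator."""
    prior = PRIOR_POLARITY[tokens[index].lower()]
    window = tokens[max(0, index - 2):index]  # two preceding tokens
    negated = any(t.lower() in NEGATORS for t in window)
    if not negated:
        return prior
    return "negative" if prior == "positive" else "positive"


def classify(tokens: List[str], index: int) -> str:
    """Run step 1, and only run step 2 on expressions judged polar."""
    if not is_polar(tokens, index):
        return "neutral"
    return contextual_polarity(tokens, index)
```

For instance, `classify("this is not good".split(), 3)` goes through both steps, while a word outside the lexicon stops at step 1 and is labeled neutral.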

Practice talk for EMNLP

Posted by nlplab at 11:05 AM

September 21, 2005

A Backoff Model for Bootstrapping Resources for Non-English Languages

Chenhai Xi

The lack of annotated data is an obstacle to the development of many natural language processing applications; the problem is especially severe when the data is non-English. Previous studies suggested the possibility of acquiring resources for non-English languages by bootstrapping from high-quality English NLP tools and parallel corpora; however, the success of these approaches seems limited for dissimilar language pairs. In this paper, we propose a novel approach that combines bootstrapped resources with a small amount of manually annotated data. We compare the proposed approach with other bootstrapping methods in the context of training a Chinese part-of-speech tagger. Experimental results show that our approach achieves a significant improvement over EM, self-training, and systems trained only on manual annotations.

This is a practice talk for EMNLP 2005.

Posted by nlplab at 02:37 PM

September 14, 2005

[TALK] Noah Smith

Noah Smith
Johns Hopkins University

Title: Contrastive Estimation for Unsupervised Sequence Modeling

Abstract:

Conditional random fields (Lafferty, McCallum, and Pereira, 2001) are
quite effective at sequence labeling tasks like shallow parsing (Sha
and Pereira, 2003) and named-entity extraction (McCallum and Li,
2003). CRFs are *log-linear*, allowing the incorporation of arbitrary
features into the model. Clever new features are one way to improve
performance; clever objective functions are another (see, for
instance, recent work on max-margin parsing by Taskar, Klein, et al.,
2004).

We have developed a method to do both, in the unlabeled data
framework. That is, we use log-linear models capable of exploiting
new features, and a new class of objective functions: contrastive
estimation (CE). CE can be intuitively understood as exploiting
implicit negative evidence and is computationally efficient (unlike
log-linear EM). In fact, CE generalizes EM and a variety of other
objective functions. By engineering classes of implicit negative
evidence, CE can be adapted for specific applications.
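
For readers who want the objective spelled out, one common way to write it is given below. The notation is not part of the announcement and is assumed here: x_i is the i-th unlabeled example, y ranges over hidden structures, f is the feature vector, theta the weights, and N(x_i) is the neighborhood of x_i, that is, the set of perturbed inputs serving as implicit negative evidence.

```latex
% Contrastive estimation objective for a log-linear model
% (symbols are assumptions; see the lead-in above).
\max_{\theta} \;
\sum_{i} \log
\frac{\sum_{y \in \mathcal{Y}} \exp\bigl(\theta \cdot f(x_i, y)\bigr)}
     {\sum_{(x, y) \in N(x_i) \times \mathcal{Y}} \exp\bigl(\theta \cdot f(x, y)\bigr)}
```

If N(x_i) is taken to be the set of all possible inputs, the denominator becomes the global normalizer and the expression reduces to the marginal log-likelihood that EM optimizes, which is the sense in which CE generalizes EM. Smaller neighborhoods keep the denominator cheap to compute.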

We describe applications to two natural language learning
problems---POS tagging of unlabeled text with a dictionary (Merialdo,
1994) and dependency grammar induction (Klein and Manning, 2004)---and
show how contrastive estimation outperforms EM (with the same feature
sets), is more robust to loss of domain knowledge (dictionary
degradation or uninformative initialization), and can recover by
modeling additional, nonorthogonal features.

This is joint work with Jason Eisner and was presented at ACL 2005 and
the IJCAI 2005 Workshop on Grammatical Inference Applications.

Schedule:

10:15 -- 10:30 Rebecca SENSQ 5421
10:30 -- 11:30 Behrang, Carol, Chenhai
11:30 -- 12:30 Lunch (Rebecca, Diane, Jan, Mihai)
12:30 -- 2:00 Talk
2:00 -- 2:30 Mihai
2:30 -- 3:00 Theresa
3:00 -- 3:30 Amruta, Hua
3:30 -- 4:00 Swapna, Paul
4:00 -- 4:30 Rebecca

Posted by hwa at 12:30 PM