November 29, 2004

[talk] Bo Pang

Bo Pang
Cornell University

Title: A sentimental education: Sentiment analysis using
subjectivity summarization based on minimum cuts.

Abstract

Sentiment analysis, which seeks to identify the viewpoint(s)
underlying a text span, has recently attracted a great deal
of attention. Automatic analysis of such information can be
helpful for business intelligence applications, recommender systems,
and editorial sites. One example application is to determine
a review's sentiment polarity ("thumbs up" or "thumbs down").
In particular, we consider the domain of movie reviews, which was
shown to be difficult for the polarity classification task in
previous work. We propose a novel machine-learning method to
first extract the subjective portions of the documents and then
apply text-categorization techniques to the resulting extracts
rather than to the entire reviews. Discarding the objective
portions of the review helps prevent the polarity classifier
from considering irrelevant or even potentially misleading text;
in addition, subjective extracts created in this process can be
presented to users as summaries of subjective content.

Our results show that the subjective extracts we create compactly
and accurately represent sentiment information: they are as informative
as the original documents while at the same time being 40% shorter.
Depending on the choice of downstream polarity classifier, using these
extracts can even lead to highly statistically significant improvement
for the polarity classification task. Also, we explore extraction
methods based on a minimum-cut formulation, which provides an efficient
and effective means for integrating inter-sentence-level contextual
information with traditional bag-of-words features.
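As a rough illustration of the minimum-cut idea, the sketch below labels sentences as subjective or objective by cutting a small graph. The abstract does not spell out the construction, so everything here is an assumption made for illustration: the per-sentence scores `ind_subj`, the pairwise `assoc` weights, the function names, and the choice of a plain Edmonds-Karp max-flow routine. Each sentence is tied to a source ("subjective") by its subjectivity score and to a sink ("objective") by one minus that score; association edges penalize splitting related sentences.

```python
from collections import deque

def min_cut_source_side(capacity, source, sink):
    """Return the nodes on the source side of a minimum s-t cut (Edmonds-Karp)."""
    residual, neighbors = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0.0) + c
        residual.setdefault((v, u), 0.0)  # reverse edge for the residual graph
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    def reachable():
        # Breadth-first search over edges that still have spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable()
        if sink not in parent:
            # No augmenting path left: the reachable set is the source side.
            return set(parent)
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u

def extract_subjective(ind_subj, assoc):
    """Hypothetical helper: indices of sentences placed on the subjective side.

    ind_subj[i] is an assumed per-sentence subjectivity score in [0, 1];
    assoc[(i, j)] is an assumed non-negative association weight.
    """
    edges = {}
    for i, p in enumerate(ind_subj):
        edges[("s", i)] = p        # cut when sentence i is labeled objective
        edges[(i, "t")] = 1.0 - p  # cut when sentence i is labeled subjective
    for (i, j), w in assoc.items():
        # Symmetric penalty for putting i and j on different sides.
        edges[(i, j)] = edges.get((i, j), 0.0) + w
        edges[(j, i)] = edges.get((j, i), 0.0) + w
    side = min_cut_source_side(edges, "s", "t")
    return sorted(n for n in side if n != "s")
```

With scores `[0.8, 0.4, 0.1]` and no association edges, only sentence 0 lands on the subjective side. Adding a strong association between sentences 0 and 1 pulls the borderline sentence 1 across as well, which is the kind of contextual effect the formulation is meant to capture.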

This is joint work with Lillian Lee.

Posted by hwa at 12:30 PM

November 22, 2004

[TALK] Tessa Warren

Tessa Warren (Psychology and LRDC), syntactic complexity and reference, details TBA

Posted by litman at 04:47 PM

November 15, 2004

[Talk] Polarity

Paul Hoffmann

Title: Polarity in Context

Abstract: This talk describes an annotation scheme for marking the polarity of ons and expressive subjective elements in context and presents results of an annotation study.

Posted by nlplab at 11:01 AM

November 08, 2004

[talk] Oren Kurland

Oren Kurland
Cornell University

Title: Corpus structure, language models, and ad hoc information retrieval

Abstract:
The fundamental principle of the language-modeling approach to ad hoc
information retrieval is that given a query, documents will be ranked
according to their estimated language models' similarity to that of the
query.
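One common way to realize this principle is query-likelihood ranking with a smoothed unigram model, sketched below. This is a minimal sketch and not necessarily the scoring used in the talk: the function names, the Jelinek-Mercer interpolation weight `lam`, and the decision to skip query terms that never occur in the corpus are all assumptions made for illustration. Documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, corpus_counts, corpus_len, lam=0.5):
    """log p(query | doc) under a unigram model interpolated with corpus statistics."""
    doc_counts = Counter(doc)
    score = 0.0
    for w in query:
        if corpus_counts[w] == 0:
            continue  # assumption: ignore terms that appear nowhere in the corpus
        p_doc = doc_counts[w] / len(doc)
        p_corpus = corpus_counts[w] / corpus_len
        score += math.log(lam * p_doc + (1 - lam) * p_corpus)
    return score

def rank(query, docs, lam=0.5):
    """Return document indices ordered from best to worst match for the query."""
    corpus_counts = Counter(w for d in docs for w in d)
    corpus_len = sum(corpus_counts.values())
    scores = [
        query_log_likelihood(query, d, corpus_counts, corpus_len, lam)
        for d in docs
    ]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

For example, with the documents `[["cats", "purr"], ["dogs", "bark"], ["cats", "and", "dogs"]]` and the query `["cats", "purr"]`, the first document ranks highest because it matches both query terms.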

Most previous work on the language-modeling approach to ad hoc information
retrieval, however, focuses on document-specific characteristics, and
therefore doesn't take into account the structure of the surrounding corpus.
We propose a novel algorithmic framework in which information provided by
document-based language models is enhanced by the incorporation of
information drawn from clusters of similar documents.
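To make the idea concrete, here is one simple way cluster information could be folded into a document language model: interpolate the document's unigram estimate with an estimate pooled over its cluster. This is only a sketch under assumptions; the abstract does not specify the algorithms, and the function names, the linear interpolation, the weight `lam`, and the premise that the cluster is already given are all illustrative choices.

```python
from collections import Counter

def unigram(tokens):
    """Maximum-likelihood unigram model over a non-empty token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return lambda w: counts[w] / total

def cluster_smoothed_likelihood(query, doc, cluster, lam=0.7):
    """p(query | doc), with the document model interpolated with its cluster's model.

    `cluster` is an assumed, precomputed list of token lists for documents
    similar to `doc` (including `doc` itself).
    """
    p_doc = unigram(doc)
    p_cluster = unigram([w for d in cluster for w in d])
    likelihood = 1.0
    for w in query:
        likelihood *= lam * p_doc(w) + (1 - lam) * p_cluster(w)
    return likelihood
```

In this sketch, a document `["feline", "purr"]` clustered with `["cats", "purr"]` receives a non-zero score for the query `["cats"]` even though it never contains that word, whereas setting `lam=1.0` (document model only) scores it zero.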

In this talk, we will first present the framework and describe a suite of new
algorithms that are natural instantiations of it. Even the simplest typically
outperforms the standard language-modeling approach. We will then discuss
connections to other work, such as latent-variable models, and present
experimental results showing that our best-performing algorithms
improve on state-of-the-art language-modeling-based algorithms
across several corpora.

This is joint work with Lillian Lee.

Posted by hwa at 12:30 PM

November 01, 2004

[TALK] Jan Wiebe

Title: Opinions In Question Answering: Current Research Directions.

Abstract: This talk will describe current research directions in the ARDA AQUAINT project "Opinions in Question Answering". We will focus on our current research in extracting "opinion frames" to represent subjective expressions in text.

Posted by nlplab at 12:30 PM