Ling 571 - Deep Processing Techniques for NLP
Winter 2011
Homework #7: Due 11:59 March 15, 2011
Goals
Through this assignment you will:
- Explore issues in pronominal anaphora resolution.
- Gain familiarity with syntax-based resolution techniques.
- Analyze the effectiveness of the Hobbs algorithm by applying it
to pairs of parsed sentences.
- Optionally: Implement the Hobbs algorithm for anaphora resolution on a set of
sentences.
Background
Please review the class slides and readings in the textbook on pronominal
anaphora resolution and especially the Hobbs algorithm.
Analyzing Coreference Resolution with the Hobbs Algorithm
The Hobbs algorithm takes as input a pronoun and a sequence of
sentence parse trees in the context, and returns the proposed
antecedent. The data file contains a list of sentence pairs
separated by blank lines. In each pair, the second sentence
contains one or more pronouns to be resolved. Parse the
sentences, almost all of which are drawn from the first homework assignment,
using the same techniques as in HW#1.
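As a rough illustration of the first step after parsing, the sketch below reads a bracketed parse into nested tuples and locates the pronouns to be resolved. It assumes Penn Treebank-style bracketing and the tags PRP and PRP$ for pronouns; the grammar you used in HW#1 may label pronouns differently. The names parse_bracketed and find_pronouns are hypothetical helpers, not part of any library.

```python
def parse_bracketed(text):
    """Parse a bracketed parse string into (label, children) tuples.

    Internal nodes are (label, [children]); leaves are plain strings.
    Assumes well-formed, Penn Treebank-style bracketing.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read_node()


def find_pronouns(tree, path=()):
    """Yield (path, word) for each leaf whose tag is PRP or PRP$.

    `path` is the tuple of child indices leading from the root to the
    pronoun's preterminal node. The tag names are an assumption.
    """
    label, children = tree
    for i, child in enumerate(children):
        if isinstance(child, str):
            if label in ("PRP", "PRP$"):
                yield path, child
        else:
            yield from find_pronouns(child, path + (i,))


tree = parse_bracketed(
    "(S (NP (PRP He)) (VP (VBD saw) (NP (PRP$ his) (NN dog))))"
)
pronouns = list(find_pronouns(tree))
```

The path to each pronoun gives a starting point for walking up the tree to the NP and S nodes that the algorithm inspects.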
For each pronoun in each sentence pair, trace the Hobbs algorithm to
identify its antecedent.
Specifically, you should:
- i) Print out the pronoun and the corresponding parses.
- A) Identify each parse tree node corresponding to 'X' in the
algorithm, writing out the corresponding NP or S (or SBAR) constituent.
- B) Identify each node proposed as an antecedent.
- C) Reject any proposed node ruled out by agreement.
- D) Identify the accepted antecedent.
- E) Indicate whether the accepted antecedent is correct.
- F1) If the accepted antecedent is correct, do nothing more.
- F2) If the accepted antecedent is NOT correct, explain why and identify which of the syntactic and semantic preferences listed in the text would be required to correct this error.
This information should be written to a file called results.
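Step C above can be pictured as a simple feature check. The following is a minimal sketch, assuming each proposed NP has already been annotated with number and gender by hand; the feature table, the feature values, and the function names are all hypothetical and cover only a few pronouns.

```python
# Hypothetical feature table: pronoun -> (number, gender).
# None means the feature is unspecified and matches anything.
PRONOUN_FEATURES = {
    "he": ("sg", "masc"), "him": ("sg", "masc"), "his": ("sg", "masc"),
    "she": ("sg", "fem"), "her": ("sg", "fem"),
    "it": ("sg", "neut"), "its": ("sg", "neut"),
    "they": ("pl", None), "them": ("pl", None), "their": ("pl", None),
}


def agrees(pronoun, candidate_features):
    """Return True if the candidate is compatible in number and gender."""
    p_num, p_gen = PRONOUN_FEATURES[pronoun.lower()]
    c_num, c_gen = candidate_features
    number_ok = None in (p_num, c_num) or p_num == c_num
    gender_ok = None in (p_gen, c_gen) or p_gen == c_gen
    return number_ok and gender_ok


def filter_candidates(pronoun, candidates):
    """Keep candidates that agree with the pronoun, preserving proposal order.

    `candidates` is a list of (np_text, (number, gender)) pairs, assumed
    to be listed in the order in which the algorithm proposed them.
    """
    return [text for text, feats in candidates if agrees(pronoun, feats)]


proposed = [
    ("the dogs", ("pl", None)),
    ("John", ("sg", "masc")),
    ("Mary", ("sg", "fem")),
]
kept_for_she = filter_candidates("she", proposed)
kept_for_they = filter_candidates("They", proposed)
```

Because the proposal order is preserved, the first surviving candidate is the one a trace of the algorithm would accept in step D.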
"Implementation"
You may do steps A) and B) either:
- by manually stepping through the algorithm, or
- by implementing this simplified portion of the algorithm.
If you take this coding route, you should
output all the proposed antecedents, unless you want the added challenge
of filtering for agreement. You may use any supporting software you wish,
such as NLTK's components for manipulating parse trees, provided
it does not implement the full Hobbs algorithm for you.
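For those taking the coding route, here is a partial sketch of one piece only: the search of a preceding sentence, which visits nodes left-to-right and breadth-first and proposes every NP it meets. It does not cover the search within the pronoun's own sentence. It uses plain nested tuples rather than NLTK's tree classes so that it runs without dependencies; the function names and the NP label are assumptions about your parse format.

```python
from collections import deque


def propose_nps_bfs(tree, np_labels=("NP",)):
    """Return the NP nodes of `tree` in left-to-right, breadth-first order.

    Trees are (label, [children]) tuples with plain strings as leaves.
    """
    proposals = []
    queue = deque([tree])
    while queue:
        label, children = queue.popleft()
        if label in np_labels:
            proposals.append((label, children))
        for child in children:
            if not isinstance(child, str):
                queue.append(child)
    return proposals


def leaves(tree):
    """Return the words under a node, in surface order."""
    _, children = tree
    words = []
    for child in children:
        words.extend([child] if isinstance(child, str) else leaves(child))
    return words


# Previous sentence: "the owner of the dog slept" (hand-built toy parse).
previous = ("S", [
    ("NP", [
        ("NP", [("DT", ["the"]), ("NN", ["owner"])]),
        ("PP", [("IN", ["of"]),
                ("NP", [("DT", ["the"]), ("NN", ["dog"])])]),
    ]),
    ("VP", [("VBD", ["slept"])]),
])
candidates = [" ".join(leaves(np)) for np in propose_nps_bfs(previous)]
```

Breadth-first order means that shallower NPs are proposed before NPs nested inside them, which is why the full subject NP comes first here.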
Data
The words and contexts to analyze are found in
this file.
An application to a simplified parse of a textbook example appears in this file.
Output Files
Please name your output file results.
Please remember to include your name in a comment at the
top of each file.
Handing in your work
All homework should be handed in using the class CollectIt.
Use the tar command to build a single hand-in file, named
hw#.tar, where # is the number of the homework assignment,
containing all the material necessary to test your assignment. Your
hw7.cmd, if you coded, should be at the top level of whatever directory structure
you are using.
For example, in your top-level directory, run:
$ tar cvf hw7.tar *