Ling 571 - Deep Processing Techniques for NLP
Winter 2015
Homework #9: Due 11:59 March 18, 2015


Goals

Through this assignment you will:

Background

Please review the class slides (esp. Class 16, #54) and readings in the textbook on pronominal anaphora resolution and especially the Hobbs algorithm.

Analyzing Coreference Resolution with the Hobbs Algorithm

The Hobbs algorithm takes as input a pronoun and a sequence of sentence parse trees from the context, and returns a proposed antecedent. The data file contains a list of sentence pairs separated by blank lines. In each pair, the second sentence contains one or more pronouns to be resolved. Parse the sentences, almost all of which are drawn from the first homework assignment, using the same techniques as in HW#1 (or HW#5 if you want to handle number agreement).
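As a rough illustration of the input format described above, the following sketch groups the lines of a data file into sentence pairs by splitting on blank lines. It assumes one sentence per line; the function name read_sentence_pairs and the sample sentences are made up for this example and are not taken from the actual data file.

```python
def read_sentence_pairs(lines):
    """Group non-blank lines into tuples, using blank lines as separators.

    Assumption: each sentence occupies exactly one line of the file.
    """
    pairs, block = [], []
    for line in lines:
        line = line.strip()
        if line:
            block.append(line)
        elif block:
            # A blank line closes the current group of sentences.
            pairs.append(tuple(block))
            block = []
    if block:
        # The file may not end with a blank line.
        pairs.append(tuple(block))
    return pairs


# Hypothetical sample input, not the real assignment data.
sample = """John ate the cake.
He liked it.

Mary saw the dogs.
They barked.
"""

pairs = read_sentence_pairs(sample.splitlines())
for first, second in pairs:
    print(first, "|", second)
```

Each resulting pair can then be passed to whichever parser you used in HW#1 or HW#5.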

For each pronoun in each sentence pair, trace the Hobbs algorithm to identify its antecedent.
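One part of the Hobbs algorithm searches the parse tree of a preceding sentence left-to-right, breadth-first, and proposes each NP node it encounters as a candidate antecedent. The sketch below illustrates only that traversal, under simplifying assumptions: trees are written as nested tuples of the form (label, child, ...) with plain strings as leaves rather than as NLTK tree objects, the example tree and the names np_candidates and leaves are hypothetical, and no agreement checks are applied. It does not cover the steps that climb from the pronoun within its own sentence.

```python
from collections import deque

# Hypothetical parse of a preceding sentence: (label, child, child, ...).
# Leaves are plain strings. This is a stand-in for a real parser's output.
prev_tree = (
    "S",
    ("NP", ("NNP", "John")),
    ("VP",
     ("VBD", "ate"),
     ("NP", ("DT", "the"), ("NN", "cake"))),
)


def leaves(node):
    """Return the words under a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def np_candidates(tree):
    """Yield NP subtrees in left-to-right, breadth-first order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue  # a word, not a constituent
        if node[0] == "NP":
            yield node
        # Children are appended in order, so each level is visited
        # left to right before the next level down.
        queue.extend(node[1:])


candidates = [" ".join(leaves(np)) for np in np_candidates(prev_tree)]
print(candidates)  # ['John', 'the cake']
```

The subject NP is proposed before the object NP because it sits higher and further left in the tree, which is the ordering a hand trace of this part of the algorithm should also produce.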

Specifically, you should:

"Implementation"

You should implement step i) using NLTK and a suitable parser. You may do steps A-D either:

Files

Test Data, Example, and Resource Files

The files for this assignment may be found on patas in /dropbox/14-15/571/hw9/:

Parsing (and optional anaphora resolution)

Create a file hw9.py with the following parameters:

Condor file

Please name your condor file hw9.cmd.
  • Your program must run on patas using:
    $ condor_submit hw9.cmd
  • Please see the CLMS wiki pages on the basics of using the condor cluster. All files created by the condor run should appear in the top level of the directory.

Output Files

Write-up

Describe and discuss your work in a write-up file. Include problems you came across and how (or if) you were able to solve them, any insights, special features, and what you learned. Give examples if possible. If you were not able to complete parts of the project, discuss what you tried and/or what did not work. This will allow you to receive maximum credit for partial work.

Please name the file readme.{txt|pdf} with a suitable extension.

Handing in your work

All homework should be handed in using the class CollectIt. Use the tar command to build a single hand-in file named hw#.tar, where # is the number of the homework assignment. The file should contain all the material necessary to test your assignment.