Ling 571 - Deep Processing Techniques for NLP
Winter 2016
Homework #5: Due February 9, 2016, 23:45


Goals

Through this assignment you will:

Background

Please review the class slides and readings in the textbook on feature-based grammars and parsing. Also, review Chapter 9 of the NLTK book for additional detail on feature structures and feature-based parsing in NLTK. A discussion of aspect, relevant to the last few test sentences, can be found in J&M 17.4.2.

NOTE: The NLTK book contains a discussion of HPSG-style handling of subcategorization. However, this framework is *NOT* implemented in NLTK as it stands. An analogous list structure using [FIRST=?a,REST=?b] pseudo-lists can achieve the same effect, but this should be considered an extra-credit option to be explored if you have spare time. It is not required for this assignment.

Building a Feature-based Grammar

Based on the materials above, create a set of context-free grammar rules augmented with features in the NLTK .fcfg format that are adequate to analyze a small set of English natural language sentences. Sample grammars may be found in the NLTK Book Chapter 9 text, in the mini example file referenced below, and in some of the NLTK grammars under /corpora/nltk/nltk-data/grammars. The grammar should be loadable with nltk.data.load().

Your grammar should be able to parse all well-formed sentences in the test sentence file and reject all ill-formed sentences in the list.
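As a rough illustration of the accept/reject behavior described above, the sketch below defines a toy number-agreement grammar in .fcfg syntax and checks a few strings against it. The grammar, the names TOY_FCFG, build_parser, and accepts, and the whitespace tokenization are all assumptions made for this example. The toy grammar is not the grammar required for the assignment, and it is built from a string with FeatureGrammar.fromstring() rather than loaded from a file.

```python
# Toy grammar for illustration only; it is NOT the assignment grammar.
# The NUM feature forces determiner, noun, and verb to agree in number.
TOY_FCFG = """
% start S
S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
VP[NUM=?n] -> V[NUM=?n]
Det[NUM=sg] -> 'this'
Det[NUM=pl] -> 'these'
N[NUM=sg] -> 'dog'
N[NUM=pl] -> 'dogs'
V[NUM=sg] -> 'barks'
V[NUM=pl] -> 'bark'
"""


def build_parser(fcfg_text):
    """Build a feature chart parser from grammar text (hypothetical helper)."""
    # Imported inside the function so the grammar text can be read
    # and reused without NLTK being importable.
    from nltk.grammar import FeatureGrammar
    from nltk.parse import FeatureChartParser

    return FeatureChartParser(FeatureGrammar.fromstring(fcfg_text))


def accepts(parser, sentence):
    """Return True if the grammar yields at least one parse for the sentence."""
    tokens = sentence.split()  # assumption: whitespace tokenization
    try:
        return next(iter(parser.parse(tokens)), None) is not None
    except ValueError:
        # NLTK raises ValueError when a token is not covered by the lexicon.
        return False
```

With NLTK installed, accepts(build_parser(TOY_FCFG), "this dog barks") should hold, while "these dog barks" should be rejected because the NUM values cannot unify.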

Parsing

Create a program to parse the example sentences based on your grammar and analyze the results. Specifically, your program should:

Note: If the sentence is ambiguous, you only need to print a single parse.
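One possible shape for such a program is sketched below in Python. It is a sketch under stated assumptions, not a reference solution. The helper names (load_parser, first_parse, format_result, parse_lines, main), the whitespace tokenization, and the output convention (one line per sentence, holding a single bracketed parse or an empty line when no parse is found) are all assumptions made for this example. Only nltk.data.load() comes from the text above; FeatureChartParser is one of several NLTK feature parsers that could be used.

```python
import os
import sys


def load_parser(grammar_path):
    """Load an .fcfg grammar with nltk.data.load() and wrap it in a parser."""
    # Imported here so the parsing helpers below can be used with any
    # object that exposes a parse(tokens) method.
    import nltk
    from nltk.parse import FeatureChartParser

    grammar = nltk.data.load("file:" + os.path.abspath(grammar_path))
    return FeatureChartParser(grammar)


def first_parse(parser, tokens):
    """Return the first parse for the tokens, or None if there is none."""
    try:
        return next(iter(parser.parse(tokens)), None)
    except ValueError:
        # Raised by NLTK when a token is missing from the grammar's lexicon.
        return None


def format_result(tree):
    """Flatten a parse onto one line; an empty string marks 'no parse'."""
    # Assumption: this output convention is illustrative only.
    return "" if tree is None else " ".join(str(tree).split())


def parse_lines(parser, lines):
    """Parse each non-empty line and return one output string per sentence."""
    results = []
    for line in lines:
        tokens = line.split()  # assumption: whitespace tokenization
        if tokens:
            results.append(format_result(first_parse(parser, tokens)))
    return results


def main(grammar_path, sentence_path, output_path):
    """Parse every sentence in sentence_path and write results to output_path."""
    parser = load_parser(grammar_path)
    with open(sentence_path, encoding="utf-8") as infile:
        results = parse_lines(parser, infile)
    with open(output_path, "w", encoding="utf-8") as outfile:
        outfile.write("\n".join(results) + "\n")


# Only run when called with exactly the three expected arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    main(*sys.argv[1:])
```

Only the first tree from the parser is kept, which matches the note that a single parse is enough for an ambiguous sentence.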

Programming

Create a program called hw5_parser.{py|pl|etc} that performs the feature-based parsing and grammar check described above, invoked as:
hw5_parser.{py|pl|etc} <input_grammar_filename> <input_sentence_filename> <output_filename> where,

Files

Please adhere to the naming conventions below:

Example and Test Data Files

All data and example files may be found in /dropbox/15-16/571/hw5/.

Submission Files

  • hw5_parser.{py|pl|etc}: Primary program file with language-appropriate extension.
  • hw5_feature_grammar.fcfg: This file should contain the grammar rules with feature augmentations required to parse the acceptable sentences in the test set and reject the ungrammatical ones. The file should be consistent with the NLTK .fcfg format.
  • hw5_output.txt: The output file with the results of parsing each of the input sentences in sentences.txt with your hw5_feature_grammar.fcfg.
  • hw5.cmd: Condor file which drives your parsing program (hw5_parser.{py|pl|etc}) with the relevant grammar, test sentences, and output file.
  • readme.{txt|pdf}: Write-up file
  • hw5.tar: Your hand-in file

Handing in your work

All homework should be handed in using the class CollectIt.