Ling 571 - Deep Processing Techniques for NLP
Winter 2016
Homework #2: Due January 19, 2016


Goals

Through this assignment you will:

Background

Please review the class slides and readings in the textbook on Chomsky Normal Form conversion.

Converting a Grammar to Chomsky Normal Form

As noted in the text, the CKY algorithm requires a grammar in Chomsky Normal Form (CNF). While it is not always intuitively clear how to write a grammar from scratch in CNF, it is fairly straightforward to convert a context-free grammar into a weakly equivalent grammar in CNF.
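To make the target form concrete, here is a minimal sketch of a check for whether a single rule is already in CNF, that is, whether its right-hand side is either exactly two nonterminals or exactly one terminal. The function names are hypothetical, and the convention that terminals are written in quotes is an assumption about the grammar format, not something this assignment specifies.

```python
def is_terminal(symbol):
    """Assumption: terminals are written in matching single or double quotes."""
    return len(symbol) >= 2 and symbol[0] == symbol[-1] and symbol[0] in "'\""


def is_cnf_rule(lhs, rhs):
    """Return True if lhs -> rhs has the form A -> B C or A -> a."""
    if len(rhs) == 1:
        # A single symbol on the right must be a terminal.
        return is_terminal(rhs[0])
    if len(rhs) == 2:
        # Two symbols on the right must both be nonterminals.
        return not is_terminal(rhs[0]) and not is_terminal(rhs[1])
    # Empty or longer right-hand sides are not in CNF.
    return False
```

A check like this can be useful for verifying the output grammar: every rule written by the converter should pass it.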

Following the approach outlined in class, implement a procedure to convert an input grammar of the form used for the first assignment to a new weakly equivalent grammar in CNF.
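As an illustration of one step that standard presentations of CNF conversion include, the sketch below splits a rule with more than two right-hand-side symbols into a chain of binary rules. It is not necessarily the exact procedure given in class, and it does not handle unit productions or rules that mix terminals with nonterminals. The names `binarize` and `fresh_name_generator` are hypothetical, and the sketch assumes the generated symbols (`X1`, `X2`, ...) do not collide with nonterminals already in the grammar.

```python
def fresh_name_generator(prefix="X"):
    """Yield new nonterminal names: X1, X2, ... (assumed unused in the grammar)."""
    n = 1
    while True:
        yield f"{prefix}{n}"
        n += 1


def binarize(lhs, rhs, fresh_names):
    """Split lhs -> X1 X2 ... Xn (n > 2) into a list of binary rules.

    Rules are returned as (lhs, rhs_tuple) pairs. Rules with two or fewer
    right-hand-side symbols are returned unchanged.
    """
    rules = []
    rhs = list(rhs)
    while len(rhs) > 2:
        # Replace the two leftmost symbols with a new nonterminal.
        new_symbol = next(fresh_names)
        rules.append((new_symbol, (rhs[0], rhs[1])))
        rhs = [new_symbol] + rhs[2:]
    rules.append((lhs, tuple(rhs)))
    return rules


if __name__ == "__main__":
    names = fresh_name_generator()
    for rule_lhs, rule_rhs in binarize("VP", ["V", "NP", "PP"], names):
        print(rule_lhs, "->", " ".join(rule_rhs))
```

The same generator is passed to every call so that new symbol names stay unique across the whole grammar.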

You will want to create data structures corresponding to RULE, RHS, LHS, etc. You may use whatever programming language you like, provided that it can be run on the CLMS cluster using condor. You may use existing implementations of these data structures in NLTK or other NLP toolkits (e.g. the Stanford parser), but you must implement the conversion algorithm yourself.
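If you write these data structures yourself rather than using a toolkit, a rule representation might look like the sketch below. The class name `Rule`, the helper `parse_rule_line`, and the assumed line format (`LHS -> sym sym | sym`, with whitespace-separated symbols and `|` separating alternatives) are illustrative assumptions; check them against the actual grammar files. The sketch would not handle terminals that contain spaces or a `|` character.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    """A context-free rule: one LHS nonterminal and a sequence of RHS symbols."""
    lhs: str
    rhs: Tuple[str, ...]

    def __str__(self):
        return f"{self.lhs} -> {' '.join(self.rhs)}"


def parse_rule_line(line: str) -> List[Rule]:
    """Parse one grammar line into rules, one per '|'-separated alternative.

    Assumed format (hypothetical): LHS -> sym sym | sym
    """
    lhs, _, rhs_text = line.partition("->")
    return [Rule(lhs.strip(), tuple(alt.split())) for alt in rhs_text.split("|")]
```

Making `Rule` frozen keeps instances hashable, so rules can be stored in sets or used as dictionary keys while the grammar is being rewritten.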

Converting a general context-free grammar to Chomsky Normal Form

The program you submit should do the following:

Programming

Create a program named hw2_tocnf.{py|pl|java|...} that performs the conversion described above, invoked as:
hw2_tocnf.<ext> <input_grammar_file> <output_grammar_file>
where

Verification & Parse Comparison

Using your system from HW#1, you will parse a set of sentences with

Note: You will need to run your code yourself in both of these conditions with the files specified below. Include the output parse files in your submission tar file. You do *not* need to run this code as part of your condor file.

Files

Please adhere to the naming conventions.

Test, Validation, and Example Files

Submission Files

Handing in your work

All homework should be handed in using the class CollectIt.