Lab 1 (due 3/31)
NB: The lab assignments will typically include write-up
instructions at the end. Before you start, read the whole assignment
once, including the write-up instructions, so you know what to keep
track of along the way.
Getting started
- If you're not working on a Treehouse machine, install the LKB
- Run emacs
- Type M-x lkb to run the LKB
- Download the starter grammar
- Unpack the starter grammar (tar xzf start-grammar.tgz)
- Load the starter grammar in the LKB:
  - Select "Load Complete Grammar" from the LKB Top menu
  - Navigate to the file called script inside the directory starter-grammar
- Try parsing:
  - Select "Parse > Parse input..." from the LKB Top menu and parse the sentence that appears in the dialogue.
  - Examine the file lexicon.tdl in the starter grammar, and try making up sentences to parse based on the vocabulary there.
  - Select "Parse > Batch parse..." from the LKB Top menu, then navigate to the files test.items for the input and test.results for the output.
  - Examine the resulting file test.results. Every starred sentence should be followed by a 0 (no parses found) and every unstarred sentence by a 1 (one parse found).
- You might find the brief guide to TDL syntax on the FAQ helpful.
Eliminating redundancies in the lexicon
In the starter grammar, there are many constraints that are
repeated in each lexical entry. This part of the lab asks you to pull
those constraints out into types, thus ending up with lexical entries
that only need to stipulate the type and orthographic form. (NB: This
grammar doesn't have any semantics.)
You should do this even for the types that will end up instantiated
only by one lexical entry.
- For each type of word, create a subtype of the type word in the file types.tdl.
- Put the constraints from the lexical entries on that type. For example, the type for singular nouns would look like this:
sg-noun := word &
  [ HEAD noun &
         [ AGR 3sing ],
    SPR < [ HEAD det & [ AGR 3sing ] ] >,
    COMPS < > ].
And now the lexical entry for dog will look like this:
dog := sg-noun &
  [ ORTH "dog" ].
- If any of your subtypes of word have constraints in common, create a supertype to house those constraints and make the previous types inherit from the supertype.
- Try to have each constraint stated in only one place.
- Update lexicon.tdl so that each lexical entry inherits from one of your new types and does not specify any redundant constraints.
- Batch parse the file test.items to make sure you haven't lost any coverage.
- Debug as necessary.
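As a rough illustration of the supertype step above, the sketch below moves the constraints that the singular noun type would share with other noun types into a common supertype. The name noun-lex is an assumption made for illustration and is not part of the starter grammar; adjust the names and constraints to whatever your own types require.

```tdl
; Hypothetical supertype: states once what every noun type shares.
noun-lex := word &
  [ HEAD noun,
    SPR < [ HEAD det ] >,
    COMPS < > ].

; The subtype now adds only the agreement information.
sg-noun := noun-lex &
  [ HEAD [ AGR 3sing ],
    SPR < [ HEAD [ AGR 3sing ] ] > ].

; The lexical entry is unchanged: a type plus an orthographic form.
dog := sg-noun &
  [ ORTH "dog" ].
```

A plural noun type would inherit from the same supertype and supply its own AGR value, so the SPR and COMPS shape is stated in only one place.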
Phrase Structure Recursion
The LKB requires that every phrase structure rule have a fixed
number of daughters. The starter grammar handles head-complement
structures with two separate rules, one for transitives and one for ditransitives.
- Replace these two rules with one head-complement rule. This new rule will apply twice in sentences with ditransitive verbs.
- The new rule should pick up the first complement on the COMPS list and shorten the list appropriately.
- Test your changes by batch parsing test.items. Check for spurious extra parses as well as lost coverage and overgeneration.
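One possible shape for such a binary rule is sketched below. It assumes that the daughters of a rule are listed under a feature called ARGS (the LKB's usual default), that HEAD, SPR and COMPS sit at the top level of both words and phrases, and that the rule name head-complement-rule is free to use. Check each of these against the rules already present in the starter grammar before relying on the sketch.

```tdl
; Sketch of a single binary head-complement rule (names are assumptions).
; The head daughter's COMPS list is split into its first element and
; the remainder; the first element is realized as the second daughter,
; and the remainder becomes the mother's COMPS value.
head-complement-rule := phrase &
  [ HEAD #head,
    SPR #spr,
    COMPS #rest,
    ARGS < [ HEAD #head,
             SPR #spr,
             COMPS < #first . #rest > ],
           #first > ].
```

Because the head daughter's COMPS value must be a non-empty list, the rule cannot apply to a head that has already found all of its complements. With a ditransitive verb it applies once to the verb and once to the resulting phrase.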
Reentrancies
- The starter grammar declares AGR on pos, but AGR is only
relevant (in this fragment) for determiners and nouns. Modify the
type hierarchy under pos so that AGR is only appropriate for
these two. (NB: A feature can be declared on only one type.)
- The lexical entries for nouns in the starter grammar stipulate the
same AGR value redundantly for themselves and their determiners.
Modify your lexical types (if you haven't done so already) to use
reentrancy (i.e., an identity constraint) instead.
- Test your grammar with the batch test files agr.items
and test.items.
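The two steps above might look roughly like the following. The type name agr-pos and the value type agr-cat are placeholders invented for this sketch; substitute whatever type the AGR values (such as 3sing) inherit from in the starter grammar, and keep any other parts of speech inheriting directly from pos.

```tdl
; Hypothetical intermediate type: the only place AGR is declared.
agr-pos := pos &
  [ AGR agr-cat ].

; Only nouns and determiners inherit the feature.
noun := agr-pos.
det := agr-pos.

; Reentrancy: the tag #agr identifies the noun's AGR value with its
; specifier's, so the value 3sing is stated only once.
sg-noun := word &
  [ HEAD noun & [ AGR #agr & 3sing ],
    SPR < [ HEAD det & [ AGR #agr ] ] >,
    COMPS < > ].
```

Since the identity itself does not mention a particular AGR value, it is a natural candidate for a supertype shared by all noun types.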
Labels
The starter grammar does not have any node labels defined,
so the LKB is just labeling nodes with their type (word
or phrase). You can fix this by defining labels.
- First, you need to lay the groundwork. In the file globals.lsp,
change
(defparameter *simple-tree-display* t)
to
(defparameter *simple-tree-display* nil)
- Next, add the following lines to the same file:
(defparameter *label-template-type* 'label)
(defparameter *label-path* '(LABEL-NAME))
(defparameter *label-fs-path* '())
- Add the definition of the label type to types.tdl:
label := expression &
  [ LABEL-NAME *string* ].
- Tell the LKB that it will need to load a new file labels.tdl as part of loading your grammar, by adding the following line to the script file:
(read-tdl-parse-node-file-aux (lkb-pathname (this-directory) "labels.tdl"))
- Finally, create a file labels.tdl in your grammar directory,
and add the label definitions to it. For example, NP can look like this:
np-label := label &
  [ HEAD noun,
    SPR < >,
    COMPS < >,
    LABEL-NAME "NP" ].
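Other labels follow the same pattern, differing only in the constraints that pick out the category. The two definitions below are a sketch that assumes the starter grammar has a head type called verb and that a verb phrase still has one unsaturated specifier; verify both assumptions against types.tdl.

```tdl
; Sentence: verbal head with nothing left to combine with.
s-label := label &
  [ HEAD verb,
    SPR < >,
    COMPS < >,
    LABEL-NAME "S" ].

; Verb phrase: verbal head still looking for its specifier.
vp-label := label &
  [ HEAD verb,
    SPR < [ ] >,
    COMPS < >,
    LABEL-NAME "VP" ].
```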
Write-up
Please submit write-ups as plain text files. (In future labs,
that will help me run example sentences through your grammar.)
Your write-up should include:
- A one- to two-paragraph discussion of the relative merits of
binary-branching versus flat head-complement structures in this
environment.
- A description of the system of abbreviations you came up with.
- At least three questions that this lab caused you to wonder about.
(Please indicate if you've figured out the answers, or if you would still like to see them addressed.)
- If you were unable to complete any part of the assignment, a
description of the problems you encountered and what you think might
be going on. (You can earn partial credit for any part of the
assignment you couldn't get working by describing it in this section.)
Submit your assignment
ebender at u dot washington dot edu
Last modified: Wed Mar 29 15:28:07 PST 2006