Lab 4 (due 2/1)
The goal of this lab is, once again, twofold: to continue developing
your test suite and to refine your starter grammar.
This time, the focus on the test suite will be on phenomena not covered
by the customization system (but on the agenda for the rest of the
quarter). On the grammar side, the focus will be on whatever clean-up
is required to get to a good starting point.
Preliminaries
GoPost
Posting questions by Wednesday night helped some, but there
was still a definite sense of procrastination last week. So this
time you need to post at least one question by Wednesday night and
at least one by Friday noon.
Version control
If you haven't already put your work for this class under version
control, do so now. Here are the directions from Lab 3.
Test Suite
The first task is to create positive and negative example sentences
illustrating the following phenomena, to the extent that they
are relevant for your language:
Before you start, read the general instructions for test suites and
the formatting instructions.
Starter grammar
The goal for the starter grammars for this week is to get
to the best possible starting point. I will try to get feedback
to you from Lab 3 quickly, and my feedback will contain suggestions
of what to clean up. If we've been discussing any tdl-editing
for your grammar for phenomena already covered, this would be a good
time to do it.
In addition, you should do the following:
- Review your grammar's performance over the test suite (using
[incr tsdb()]) to determine whether there are any examples that you think should parse which currently don't.
- Use interactive unification to determine the source of the parse failure, and post to GoPost to get ideas about how to fix it.
- Likewise, if you are overgenerating, post to GoPost for ideas on how to fix it. In some cases, the advice might well be "we'll fix this later".
- For the examples that do parse, examine the semantic representations (Simple MRS) to check whether the dependencies look correct. That is, are the nouns showing up as the arguments of the verbs, and in the way you would expect? Do you see correct values for tense, aspect, person, number and (where relevant) gender on the event and individual variables?
- If you have any incorrect semantic representations, post to GoPost for advice on how to fix them.
Make sure you can parse individual sentences
Once you have created your starter grammar (or each time you
create one, as you should iterate through grammar creation and
testing a few times as you refine your choices), try it out on a
couple of sentences interactively to see if it works:
- Load the grammar into the LKB.
- Using the parse dialog box (or 'C-c p' in emacs to get the parse
command inserted at your prompt), enter a sentence to parse.
- Examine the results. If it does parse, check out the semantics (pop-up menu on the little trees). If it doesn't, look at the parse chart to see why not.
- Problems with lexical rules and lexical entries often become apparent here, too: If the LKB can't find an analysis for one of your words, it will say so, and (obviously) fail to parse the sentence.
Note that the questionnaire has a section for test sentences. If
you use this, then the parse dialog will be pre-filled with your test sentences.
[incr tsdb()] profile
The final step for this lab is to use the [incr tsdb()] grammar
profiling system to test the performance of your starter grammar over
your test suite, and then examine the results. (You may find in doing
so that you want to refine certain aspects of your starter grammar.
You can do this by uploading the file "choices" which comes with your
grammar into the customization system and then tweaking from there.)
We expect to see an overall drop in coverage this week (since
you'll be adding sentences that we don't expect to parse yet), but
at the same time, some improvement over the subset of your test suite
that represents the last two weeks.
Create a test suite profile
- Create a directory called tsdb inside your grammar
directory.
- Inside tsdb, create two subdirectories: home (for
test suite instances) and skeletons (for skeletons).
- Save a copy of Index.lisp in
tsdb/skeletons
- Save a copy of Relations in
tsdb/skeletons. (If your browser doesn't like files without
extensions, here's another copy of the
same file with .txt appended. You should save it as just Relations.)
- Make a subdirectory called lab2 inside
tsdb/skeletons for your test suite. (If you choose a different
name for this subdirectory, you must edit Index.lisp accordingly.)
- Download the perl script make_item.pl
and run it on your test suite:
perl make_item.pl testsuite.txt
- (If the perl script doesn't like the formatting of your test suite,
edit the test suite appropriately and/or complain about the perl
script on GoPost.)
- Copy the .item file which is output by make_item.pl
to tsdb/skeletons/lab2/item.
- Copy tsdb/skeletons/Relations to tsdb/skeletons/lab2/relations (notice the change from R to r).
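The directory and copy steps above can be sketched as a small shell function. This is only an illustration, not part of the official lab materials: the function name setup_skeleton and its argument order are made up here, and it assumes you have already downloaded Index.lisp and Relations and run make_item.pl to produce your .item file.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: setup_skeleton GRAMMAR_DIR INDEX_LISP RELATIONS ITEM_FILE
setup_skeleton() {
    grammar_dir=$1
    index_file=$2
    relations_file=$3
    item_file=$4
    skel="$grammar_dir/tsdb/skeletons"

    # tsdb/home holds test suite instances; tsdb/skeletons holds skeletons.
    # The lab2 subdirectory name must match what Index.lisp expects.
    mkdir -p "$grammar_dir/tsdb/home" "$skel/lab2" || return 1

    # Index.lisp and Relations sit at the top of the skeletons directory.
    cp "$index_file" "$skel/Index.lisp" || return 1
    cp "$relations_file" "$skel/Relations" || return 1

    # The skeleton itself needs the item file (named just "item")
    # and a copy of Relations with a lowercase r.
    cp "$item_file" "$skel/lab2/item" || return 1
    cp "$skel/Relations" "$skel/lab2/relations" || return 1
}
```

You would call it with the path to your grammar directory followed by the three files, passing whatever .item file make_item.pl actually wrote as the last argument. Doing the steps by hand works just as well; the point is only to show where each file ends up.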
Create and run an initial test suite instance
- Start the LKB.
- Load your starter grammar. (The script file is in matrix/lkb/script.)
- Start [incr tsdb()] (within emacs, that's M-x itsdb)
- In the [incr tsdb()] podium, select Options > Database Root
and input the path to tsdb/home.
- In the [incr tsdb()] podium, select Options > Skeleton Root
and input the path to tsdb/skeletons.
- Optional: For future use, you can set these variables
ahead of time in a file called .tsdbrc in your home directory.
It should contain these lines, with path names edited appropriately:
(in-package :tsdb)
(setf *tsdb-home* "path-to-tsdb/home")
(setf *tsdb-skeleton-directory* "path-to-tsdb/skeletons")
- In the [incr tsdb()] podium, select File > Create. You should
see your test suite in the menu there. Select it, and get a test suite
instance. Post to GoPost if this doesn't work.
- Make sure your grammar is loaded into the LKB.
- Once you have a test suite instance, select it (by clicking on it),
then do Process > All Items.
- Explore the results, with functions such as Browse > Results and Analyze > Competence.
- Be sure to save (i.e., not overwrite or delete) this test suite
instance, as you'll be asked to turn it in.
Write up
Your write up should include the following:
- Documentation of the new or revised choices you made in the customization
system, illustrated with examples from your test suite. (Diff your lab3 and lab4 choices files to make sure you've caught all the changes.) Here's an example of what this should look like.
- Descriptions of any properties of your language illustrated
in your test suite but not covered by your starter grammar and/or
the customization system. This will be most of the additions to your
test suite this week. Here, too, please include IGT from your testsuite,
and give explanations along the lines of the example above (though without
the information about the customization system).
- Documentation of the coverage of your grammar over the test suite.
If there are examples that are parsed incorrectly (unanalyzed
grammatical examples, analyzed ungrammatical examples, or grammatical
examples assigned surprising parses), reflect on why that might be.
- Documentation of any changes you made to your grammar to improve
its performance (coverage and accuracy). Include the examples that
motivated the change and explain what changes you made to the
choices file or tdl.
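For the diff of your lab3 and lab4 choices files mentioned in the first item, something like the following may help. It is a minimal sketch: the function name diff_choices is made up here, and where your two choices files live depends on how you have organized your own work.

```shell
#!/bin/sh
# Hypothetical helper: print a unified diff of two choices files.
# Usage: diff_choices OLD_CHOICES NEW_CHOICES
diff_choices() {
    diff -u "$1" "$2"
    status=$?
    # diff exits 0 if the files are identical, 1 if they differ,
    # and 2 or more on trouble (e.g., a missing file).
    # Only the last case counts as a failure here.
    [ "$status" -le 1 ]
}
```

Lines beginning with - appear only in the older file and lines beginning with + only in the newer one, so each such pair is a choice you should account for in your write up.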
ebender at u dot washington dot edu
Last modified: Sun Jan 25 21:12:31 PST 2009