Lab 5 (due 5/3 11:59 pm)

Overview

This will be our last lab with the customization system. The goal is once again to improve the grammar from last week by extending it for additional phenomena. There are two different ways to do this:

  1. As in the last two weeks, pick three new phenomena, document them in your testsuite, and extend your choices file to account for them. This will be 3 + 3 tasks (three testsuite, three choices).
  2. If you have phenomena documented in your testsuite not yet accounted for by your grammar, then you can work on those instead. In this case you might do less testsuite extension, but I still expect six total tasks.

Notes: If possible, I'd like every group to do clausal complements and clausal modifiers. Also, as with aspect, for example, you don't have to cover every variation on these phenomena, just some.

As before, you'll also be using [incr tsdb()] to test the resulting grammar and compare it to your starting point from last week.

New phenomena:

Previous phenomena you may not have completed:

Initial testsuite run

  1. Create and run initial testsuite instances for both the linguist-provided data and your small testsuite, using the initial grammar.

    Note: If your tsdb/ directory is inside a shared folder on VirtualBox, it will not work.

  2. For each of these, explore the results and collect the following information to provide in your write up:

Create a small testsuite for your additional phenomena

Add examples to your testsuite, according to the general instructions for testsuites and the formatting instructions, illustrating the phenomena you worked on above. The testsuite doesn't need to be exhaustive (since we're working with test corpora this year), but it should include both positive and negative examples for each of the phenomena you work on in this section. I expect these testsuites to have about 20-30 examples total by the end of this week, though you can do more if you find that useful. Each example should be simple enough that your grammar either parses it or fails to parse it only because of the one thing that's wrong with it.

Create a test suite skeleton

  1. Make a subdirectory called lab5 inside tsdb/skeletons for your test suite.
  2. Edit tsdb/skeletons/Index.lisp to include a line for this directory, e.g.:
    (
    ((:path . "matrix") (:content . "matrix: A test suite created automatically from the test sentences given in the Grammar Matrix questionnaire."))
    ((:path . "corpus") (:content . "IGT provided by the linguist"))
    ((:path . "lab5") (:content . "Test suite collected for Lab 5."))
    )
    
  3. Download the Python script make_item, make sure it is executable, and run it on your test suite:

    make_item testsuite.txt

    Notes on make_item:

  4. Copy the .item file which is output by make_item to tsdb/skeletons/lab5/item.
  5. Copy tsdb/skeletons/Relations to tsdb/skeletons/lab5/relations (notice the change from R to r).
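The file-copying steps above (1, 4 and 5) can also be scripted. The following is a minimal Python sketch, not part of the course tooling: the function name install_skeleton and the example file name testsuite.item are hypothetical, and it assumes tsdb/skeletons/Relations already exists. Step 2 (editing Index.lisp) still has to be done by hand.

```python
import shutil
from pathlib import Path


def install_skeleton(tsdb_dir, item_file, name="lab5"):
    """Create tsdb/skeletons/<name> containing `item` and `relations`.

    Hypothetical helper: `tsdb_dir` is the path to your tsdb/ directory and
    `item_file` is the .item file produced by make_item.
    """
    skeletons = Path(tsdb_dir) / "skeletons"
    target = skeletons / name
    # Step 1: make the subdirectory for this test suite.
    target.mkdir(parents=True, exist_ok=True)
    # Step 4: the make_item output becomes the file named "item".
    shutil.copyfile(item_file, target / "item")
    # Step 5: capital-R Relations is copied to lowercase-r relations.
    shutil.copyfile(skeletons / "Relations", target / "relations")
    return target
```

For example, install_skeleton("tsdb", "testsuite.item") would populate tsdb/skeletons/lab5, assuming the make_item output is named testsuite.item.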

Improve the choices file for your selected phenomena

For the phenomena you chose, refine the choices file by hand. Please be sure to post lots of questions on Canvas as you work on this!

Make sure you can parse individual sentences

Once you have created your starter grammar (or each time you create one, as you should iterate through grammar creation and testing a few times as you refine your choices), try it out on a couple of sentences interactively to see if it works:

  1. Load the grammar into the LKB.
  2. Using the parse dialog box (or 'C-c p' in emacs to get the parse command inserted at your prompt), enter a sentence to parse.
  3. Examine the results. If it does parse, check out the semantics (pop-up menu on the little trees). If it doesn't, look at the parse chart to see why not.
  4. Problems with lexical rules and lexical entries often become apparent here, too: If the LKB can't find an analysis for one of your words, it will say so, and (obviously) fail to parse the sentence.

Note that the questionnaire has a section for test sentences. If you use this, then the parse dialog will be pre-filled with your test sentences.

Run both the test corpus and the testsuite

Following the same procedure as usual, do test runs over both the testsuite and the test corpus.

Again, collect the following information to provide in your write up:

  1. How many items parsed?
  2. What is the average number of parses per parsed item?
  3. How many parses did the most ambiguous item receive?
  4. What sources of ambiguity can you identify?
  5. For 4 newly parsing or otherwise fixed items, do any of the parses look reasonable in the semantics?
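[incr tsdb()] reports the figures for questions 1-3 itself, but it helps to be clear about the arithmetic, in particular that the average is taken over parsed items only, not over all items. A minimal Python sketch, assuming you have a mapping from item id to number of parses (the function name summarize_run and this input format are hypothetical, not something [incr tsdb()] exports):

```python
def summarize_run(readings):
    """Summarize a test run; `readings` maps item id -> number of parses."""
    # An item counts as parsed if it received at least one parse.
    parsed = {item: n for item, n in readings.items() if n > 0}
    count = len(parsed)
    # Average over parsed items only; unparsed items are excluded.
    average = sum(parsed.values()) / count if count else 0.0
    # Parse count of the most ambiguous item (0 if nothing parsed).
    most = max(parsed.values(), default=0)
    return {
        "items": len(readings),
        "parsed": count,
        "average_parses": average,
        "max_parses": most,
    }
```

With the made-up counts {1: 2, 2: 0, 3: 1, 4: 3}, three of four items parse, the average is 2.0 parses per parsed item, and the most ambiguous item has 3 parses.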

Write up

NB: While the test suite and choices file creation is joint work, the write up should be done by one partner (the other will get a turn next week). The writing partner should have the non-writing partner review the write up and make suggestions.

Your write up should be a plain text file (not .doc, .rtf or .pdf) which includes the following:

  1. Your answers to the questions about the initial and final [incr tsdb()] runs, for both test corpus and test suite, repeated here:
    1. How many items parsed?
    2. What is the average number of parses per parsed item?
    3. How many parses did the most ambiguous item receive?
    4. What sources of ambiguity can you identify?
    5. For 4 items, do any of the parses look reasonable in the semantics?
  2. Documentation of the phenomena you have added to your testsuite, illustrated with examples from the testsuite.
  3. Documentation of the choices you made in the customization system, illustrated with examples from your test suite.
  4. Descriptions of any properties of your language illustrated in your test suite but not covered by your starter grammar and/or the customization system.
  5. If you have identified ways (other than those you already reported) in which the automatically created choices file is particularly off-base, please report them here. If you can include IGT from the testsuite or your descriptive materials illustrating the problem, that is even better.

Submit your assignment
