Lab 7 (Due 5/12 11:59pm)

Preliminaries

Updated: I've removed some of the "ancillary changes" instructions under wh-questions, because most of those were incorporated into the Matrix in the meantime. I've also added some information about using the variable property mapping to cut down on generator outputs.

As usual, check the write up instructions first. Especially in the test corpus section, but also in general, it will be helpful to keep notes along the way as you are doing grammar development.

Requirements for this assignment

0. Make sure you have a baseline test suite corresponding to your lab 6 grammar.
1. Check that negation is working, and fix it if necessary.
2. Find a simple sentence from your test corpus and try to get your grammar to be able to parse it.
- Post at least the sentence you are working on (as IGT) and ideally some questions about it by the end of the day on Tuesday.
3. Get wh questions working.
4. Make sure your grammar can still generate, and debug as necessary.
5. Test your grammar using [incr tsdb()]. [incr tsdb()] should be part of your test-development cycle. In addition, you'll need to run a final test suite instance for this lab to submit along with your baseline.
6. Write up the phenomena you have analyzed.

Run a baseline test suite

Before making any changes to your grammar for this lab, run a baseline test suite instance. If you decide to add items to your test suite for the material covered here, consider doing so before modifying your grammar so that your baseline can include those examples. (Alternatively, if you add examples in the course of working on your grammar and want to make the snapshot later, you can do so using the grammar you turned in for Lab 6.)

Negation

The negation library is more robust than in previous years, so we expect that in most cases the output is working or close to working.

Parse a simple negated sentence and see if it gets the right semantics. Here is a sample from English for The dog doesn't sleep.
If the sentence doesn't parse, post to Canvas with an explanation of how negation works in your language.
If the sentence parses, but the semantics is wrong, post to Canvas with the semantics you are getting and info on how negation works in your language.

One sentence from the test corpus

The goal of this section is to parse one more sentence from your test corpus than you were parsing before starting this section. In most cases, that will mean parsing one sentence total. In your write up, you should document what you had to add to get the sentence working. Note that it is possible to get full credit here even if the sentence ultimately doesn't parse, by documenting what you still have to get working.

This is a very open-ended part of the lab (even more so than usual), which means: A) you should get started early and post to Canvas so I can assist in developing analyses of whatever additional phenomena you run across and B) you'll have to restrain yourselves; the goal isn't to parse the whole test corpus this week ;-). In fact, I won't have time to support the extension of the grammars by more than one sentence each, so please *stop* after one sentence.

Create a profile from your test corpus skeleton, and run a baseline.
Use Browse | Results to see if anything is parsing.
Look for some plausible candidate sentences. These should be relatively short and ideally have minimal additional grammatical phenomena beyond what we have already covered.
Examine the lexical items required for your target sentence(s). Add any that should belong to lexical types you have already created.
Try parsing the test corpus again (or just your target sentence from it).
If your target sentence parses, check the MRS to see if it is reasonable.
- If you were able to get a sentence with a correct MRS just by adding lexical items to lexicon.tdl, please try one more sentence.
If your target sentence doesn't parse, check to see whether you still have lexical coverage errors. Fixing these may require adapting existing lexical rules, adding lexical rules, and/or adding lexical types. Post to Canvas for assistance.
If your target sentence doesn't parse but your grammar does find analyses for each lexical item, then examine the parse chart to identify the smallest expected constituent that the grammar is not finding, and debug from there. Do you have the phrase structure rule that should be creating the constituent? If so, try interactive unification to see why the expected daughters aren't forming a phrase with that rule. Do you need to add a phrase structure rule? Again, post to Canvas for assistance.
Iterate until either the sentence parses or you at least have a clear understanding of what you would need to add to get it parsing.
Run your full test suite after any changes you make to your grammar to make sure you aren't breaking previous coverage/introducing spurious ambiguity.

Wh-questions

In constructing your testsuite for this phenomenon in a previous lab, you were asked to find the following:

The simplest case, where there is a single clause and one argument is questioned: Who did the child see?/Who saw the child?
The shape of the wh words for core arguments. Do these vary with case, animacy, gender, something else?
The possible positions of wh words: Do they appear where an ordinary argument would? Move to the beginning of the clause? Are both of these possible?
What happens if the questioned argument belongs to a lower clause (e.g. Who did the observer think the child saw?)?
Are there any other differences between wh questions and declaratives (or yes-no questions)? (For example, English requires subject-auxiliary inversion in the main clause of a matrix wh-question.)
Are there any differences between wh questions concerning subject and non-subject arguments? (For example, English does not do subject-auxiliary inversion if the questioned element is the main clause subject.)
Optional: What happens with multiple wh elements in the same clause (e.g. Who saw what?)

In the following, I'll share the tdl I've developed in a small English grammar, for two possibilities:

Actual English (wh words go to the start of the clause, with their position indicating whether the matrix or subordinate clause is the question)
Pseudo-English, where wh words stay put

Your goal for this part of the lab is to use this as a jumping-off point to handle wh questions as they manifest in your language. Of course, I expect languages to differ in the details, so please start early and post to Canvas so we can work it out together.

Wh pronouns

Type and entry definitions for my wh pronouns, in tdl (used in both versions):

wh-pronoun-noun-lex := norm-hook-lex-item & basic-icons-lex-item &
  [ SYNSEM [ LOCAL [ CAT [ HEAD noun,
                           VAL [ SPR < >,
                                 SUBJ < >,
                                 COMPS < >,
                                 SPEC < > ] ],
                     CONT [ HOOK.INDEX.PNG.PER 3rd,
                            RELS <! [ LBL #larg,
                                      ARG0 #ind & ref-ind ],
                                    [ PRED "wh_q_rel",
                                      ARG0 #ind,
                                      RSTR #harg ] !>,
                            HCONS <! [ HARG #harg,
                                       LARG #larg ] !> ] ],
             NON-LOCAL.QUE <! #ind !> ] ].
				    
what := wh-pronoun-noun-lex &
  [ STEM < "what" >,
    SYNSEM.LKEYS.KEYREL.PRED "_thing_n_rel" ].

who := wh-pronoun-noun-lex &
  [ STEM < "who" >,
    SYNSEM.LKEYS.KEYREL.PRED "_person_n_rel" ].
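
If the shape of the wh words in your language varies with case, one possible way to extend the type above is with case-specific subtypes. The following is only a sketch under assumptions: it presumes your grammar already has a CASE feature on noun heads with values nom and acc (as the customization system typically sets up when you define case), and the subtype names, entry names and stems are all hypothetical placeholders to be replaced with what your language actually needs.

```tdl
; Hypothetical subtypes: CASE, nom and acc are assumed to exist in your grammar.
nom-wh-pronoun-noun-lex := wh-pronoun-noun-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE nom ].

acc-wh-pronoun-noun-lex := wh-pronoun-noun-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE acc ].

; Hypothetical lexicon.tdl entries using those subtypes.
who-nom := nom-wh-pronoun-noun-lex &
  [ STEM < "who" >,
    SYNSEM.LKEYS.KEYREL.PRED "_person_n_rel" ].

who-acc := acc-wh-pronoun-noun-lex &
  [ STEM < "whom" >,
    SYNSEM.LKEYS.KEYREL.PRED "_person_n_rel" ].
```

Both entries share the same PRED value, so the MRSs come out the same regardless of case; only the syntactic distribution differs.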

Ancillary changes (both versions)

In order to make sure the diff-list appends for the non-local features don't leak (leaving you with underspecified QUE or SLASH), there may be a few ancillary changes required. For example:

If you have any lexical types that inherit from basic-zero-arg, change this to norm-zero-arg.
If you have any rules beyond bare-np, head-opt-comp and head-opt-subj that discharge arguments without realizing them, be sure such arguments are constrained to be unexpressed by the rule.

Wh-initial phrase structure rules

basic-head-filler-phrase :+
   [ ARGS < [ SYNSEM.LOCAL.COORD - ], [ SYNSEM.LOCAL.COORD - ] > ].

wh-ques-phrase := basic-head-filler-phrase & interrogative-clause &
                  head-final &
   [ SYNSEM.LOCAL.CAT [ MC bool,
                        VAL #val,
                        HEAD verb & [ FORM finite ] ],
     HEAD-DTR.SYNSEM.LOCAL.CAT [ MC na,
                                 VAL #val & [ SUBJ < >,
                                              COMPS < > ] ],
     NON-HEAD-DTR.SYNSEM.NON-LOCAL.QUE <! ref-ind !> ].

extracted-comp-phrase := basic-extracted-comp-phrase &
  [ SYNSEM.LOCAL.CAT.HEAD verb,
    HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.SUBJ cons ].

extracted-subj-phrase := basic-extracted-subj-phrase &
  [ SYNSEM.LOCAL.CAT.HEAD verb,
    HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.COMPS < > ].

Note that the constraints on SUBJ and COMPS in the two types just above are somewhat specific to English --- depending on the word order facts of your language, you may need to constrain them differently.

Note also that all the phrase structure rules require instances in rules.tdl. For example:

wh-ques := wh-ques-phrase.

Note that you do NOT want to create an instance of basic-head-filler-phrase directly. basic-head-filler-phrase is a supertype of wh-ques-phrase.
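
The two extraction rule types defined above need instances in rules.tdl as well. A minimal sketch, in which the instance names on the left are hypothetical (any names will do, as long as they don't clash with instances you already have):

```tdl
; rules.tdl: instances for the extraction rules.
; The instance names (left of :=) are hypothetical; the types are those defined above.
extracted-comp := extracted-comp-phrase.
extracted-subj := extracted-subj-phrase.
```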

Wh in situ phrase structure rule

Note that all the phrase structure rules require instances in rules.tdl

wh-int-cl := clause & head-compositional & head-only &
  [ SYNSEM [ LOCAL.CAT [ VAL #val,
                         MC bool ],
             NON-LOCAL non-local-none ],
    C-CONT [ RELS <! !>,
             HCONS <! !>,
             HOOK.INDEX.SF ques ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ HEAD verb & [ FORM finite ],
                                  VAL #val &
                                    [ SUBJ < >,
                                      COMPS < > ] ],
                      NON-LOCAL [ SLASH <! !>,
                                  REL <! !>,
                                  QUE <! ref-ind !> ] ] ].
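
A matching instance in rules.tdl could be as simple as the following sketch. The instance name wh-int is hypothetical; only the type name comes from the definition above.

```tdl
; rules.tdl: instance of the in situ interrogative clause rule.
; "wh-int" is a hypothetical instance name.
wh-int := wh-int-cl.
```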

The general head-subj type assumes that QUE is empty, which won't fly in this case, so we need to redefine it. In the pseudo-English grammar, I did it this way:

eng-subj-head-phrase := head-valence-phrase & head-compositional & 
              basic-binary-headed-phrase &
  [ SYNSEM phr-synsem & 
           [ LOCAL.CAT [ POSTHEAD +,
                 HC-LIGHT -,
                 VAL [ SUBJ < >,
                       COMPS #comps,
                       SPR #spr ] ] ],
    C-CONT [ HOOK.INDEX.SF prop-or-ques,
         RELS <! !>,
         HCONS <! !>,
         ICONS <! !> ],
    HEAD-DTR.SYNSEM.LOCAL.CAT.VAL [ SUBJ < #synsem >,
                    COMPS #comps,    
                    SPR #spr ],
    NON-HEAD-DTR.SYNSEM #synsem & canonical-synsem &
       [ LOCAL [ CAT [ VAL [ SUBJ olist,
                 COMPS olist,
                 SPR olist ] ] ],
         NON-LOCAL [ SLASH 0-dlist & [ LIST < > ],
             REL 0-dlist ] ]].

And then used this type in place of decl-head-subj-phrase in my definition of subj-head-phrase:

subj-head-phrase := eng-subj-head-phrase & head-final &
  [ HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.COMPS < > ].

If your language has head-opt-subj, this will need to be rewritten similarly.

Sample MRSs

Below are some sample MRSs for wh questions, considering both subject and complement questions as well as matrix and embedded questions. Please use these as a point of comparison when you check your MRSs.

Who chases cars?

I ask who chases the dog.

What do you think the dog chases?

Constraining generator outputs

By this point in the quarter, it is common for the generator outputs to be frustratingly numerous. The generator allows us to see the full glory of the combinatorics of our grammars!

One place in which it can be convenient to cut that back somewhat is with underspecified variable properties. I'll illustrate here with aspect, though similar things can apply to other variable properties. If aspect is only optionally marked in your language, and you parse an item unmarked for aspect and then generate, you are probably seeing all of the aspect forms in the generator results. This can be averted by defining a type no-aspect that contrasts with your actual ASPECT values:

no-aspect := aspect.

... and then using the variable property mapping in semi.vpm to take underspecified aspect to no-aspect. For example:

E.ASPECT : ASPECT
 perfective <> perfective
 progressive <> progressive
 imperfective <> imperfective
 * >> no-aspect
 no-aspect << [e]

Aside from the last two lines (which you should add, if your grammar falls into this category), you should leave what's there as is (it's been customized to the aspect values that you entered).

(There is some documentation of the variable property mapping machinery here.)

Write up your analyses

For each of the following phenomena, please include the following in your write up:

A descriptive statement of the facts of your language.
Illustrative IGT examples from your testsuite. These should be examples that actually work in the current grammar, or would work if not for the particular problem you are talking about.
A statement of how you implemented the phenomenon (in terms of types you added/modified and particular tdl constraints).
If the analysis is not (fully) working, a description of the problems you are encountering.
A statement of whether or not you can generate from examples illustrating the phenomenon.

Phenomena:

Negation
Whatever you fixed about your grammar as you worked on the test corpus sentence. (In this section, please include the test corpus example you are targeting and a narrative of what you worked on to try to get it parsing.)
Wh-questions

In addition, your write up should include a statement of the current coverage of your grammar over your test suite (using numbers you can get from Analyze | Coverage and Analyze | Overgeneration in [incr tsdb()]) and a comparison between your baseline test suite run and your final one for this lab (see Compare | Competence).

Submit your assignment

Create a tarball of your grammar, your tsdb directory including both initial and final profiles, and your write up.
If you're using svn, export the grammar so I don't get all your .svn files:
```
svn export yourgrammar iso-lab7
```
For git, please do the equivalent.
Remove extraneous [incr tsdb()] profiles from the copied directory. (I'd like the initial baseline and final result for both test suite and test corpus; only keep intermediate versions that you specifically want to say something about.)

Create a tarball:

      tar czf iso-lab7.tgz iso-lab7

Upload the tarball to CollectIt under the name of the partner who did the write up.

The best way to create the tarball (so that it unpacks most easily when I download from CollectIt) is to cd into the directory containing your lab (e.g., cd lab7/) and do:

tar czf lab7.tgz *

Upload the tarball to Canvas

Last modified: 05/12/2017 03:39:31