Linguistics 567: Knowledge Engineering for NLP

Lab 7 Due 2/18

Background

This lab has four goals:

Handle the syntactic differences between matrix and embedded polar questions and between polar questions and declaratives.
Implement the syntactic pattern of imperatives.
Map the right clausal semantics to each syntactic pattern.
Implement semantic selection of complement clause types by verbs that take clausal complements.

Run a baseline test suite

Before making any changes to your grammar for this lab, run a baseline test suite instance. If you decide to add items to your test suite for the material covered here, consider doing so before modifying your grammar so that your baseline can include those examples. (Alternatively, if you add examples in the course of working on your grammar and want to make the snapshot later, you can do so using the grammar you turned in for Lab 6.)

Matrix Patch

I have made some updates to the lexical rule types in matrix.tdl. Before you start this week's lab:

Download the new matrix.tdl and put it in place of your existing copy of this file.
Load in the grammar and fix any errors. (Note that some of the type names changed, so you may get some errors from your existing lexical rules.)
Run your test suite to see if coverage/overgeneration were affected. If so, post to EPost and we'll work out a fix.
Parse a couple of sentences interactively and check whether you have any -M node labels corresponding to lexical rules. If so, this means that you are not inheriting all of the constraints on the lexical rules that you should, and in particular, that you are not inheriting ultimately from one of infl-ltol-rule, const-ltol-rule, infl-ltow-rule or const-ltow-rule. Try to fix this.

Semantic representations

The semantics for declarative and interrogative clauses will be the same except for the value of the feature SF (sentential force) on the event index of the main predicate.

Embedded clauses should have their local top handle related to an ARGn position in the embedding verb's relation through a qeq. EB TODO Add example here

Matrix yes-no questions

For some of you, the customization script provided the right kind of matrix yes-no questions. Examine the behavior of your grammar on relevant sentences to discover whether this is so. Note that if you have no syntactic difference between matrix yes-no questions and matrix declaratives, your grammar should be giving all such clauses the SF value 'prop-or-ques'.

If your language's strategy for matrix yes-no questions was not covered by the customization script, you'll need to add them now. Here are some descriptions of how to handle strategies that I'm aware of but haven't yet put into the customization script.

Interrogative matrix clauses: Verbal inflection

If your language marks matrix interrogatives with inflection on the verb, try to determine whether that inflection unambiguously marks interrogatives. If so, create a lexical rule (inflectional, add-only, no-ccont, ltol or ltow according to how it fits in with the rest of your rules) which adds the inflection and constrains the INDEX.SF of the verb to question.
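A minimal sketch of such a rule, under stated assumptions: the rule name ques-infl-lex-rule, the instance name ques-suffix and the suffix -ka are all invented for illustration, and the supertype add-only-no-ccont-rule and the SF value ques are guesses at how your copy of matrix.tdl spells "add-only, no-ccont" and "question". Check the patched file and substitute the actual names.

```tdl
; In your grammar's type file.
; ASSUMPTION: matrix.tdl provides infl-ltow-rule and add-only-no-ccont-rule
; under these names, and the SF value for questions is called ques.
ques-infl-lex-rule := infl-ltow-rule & add-only-no-ccont-rule &
  [ SYNSEM.LOCAL.CONT.HOOK.INDEX.SF ques,
    DTR.SYNSEM.LOCAL.CAT.HEAD verb ].

; In irules.tdl. The suffix -ka is a placeholder, not from any real language data.
ques-suffix :=
%suffix (* ka)
ques-infl-lex-rule.
```

Because the rule is add-only, it only adds the SF value to what the daughter already specifies. Use infl-ltol-rule instead if further lexical rules must apply to the output.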

If, on the other hand, that same inflectional marker is also used in other constructions, you may wish to pursue this alternative treatment:

Create a feature on the type verb, say QUES, with possible values + and - (bool).
Make verb-lex constrain the value of HEAD.QUES to -.
Write a lexical rule which adds the inflection and changes the value of HEAD.QUES to +. (Alternatively, consider leaving verb-lex unspecified, and writing a pair of rules, one which fills in + and one which fills in -). Be sure your lexical rule copies up the values of all other HEAD features. This lexical rule should also constrain the mother's MC value to na.
Make sure your new lexical rule interacts properly with existing lexical rules. (E.g., If there are lexical rules which apply after this one, make sure they are copying up the MC value.)
Create a non-branching phrase structure rule which inherits from interrogative-clause and head-only. This rule should copy up the VAL information and require that its daughter have empty SUBJ and COMPS lists. It should also require its daughter to be [HEAD verb] and [MC na]. The mother's own value for MC should be + or underspecified, depending on whether or not such clauses are restricted to matrix position.
Check that the root condition is also constraining MC appropriately.
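The steps above can be sketched roughly as follows. This sketch takes the alternative mentioned in the third step (verb-lex left unspecified for QUES, with a pair of add-only rules filling in + and -), since that avoids copying the other HEAD features by hand. The names ques-lex-rule, non-ques-lex-rule, int-cl-phrase and int-cl are hypothetical, and add-only-no-ccont-rule is an assumed spelling of the Matrix supertype.

```tdl
; Declare the new feature on verb (an addendum to the existing type).
verb :+ [ QUES bool ].

; ASSUMPTION: verb-lex says nothing about QUES or MC, so add-only rules
; can fill these values in.
ques-lex-rule := infl-ltow-rule & add-only-no-ccont-rule &
  [ SYNSEM.LOCAL.CAT [ HEAD verb & [ QUES + ],
                       MC na ] ].

non-ques-lex-rule := const-ltow-rule & add-only-no-ccont-rule &
  [ SYNSEM.LOCAL.CAT.HEAD verb & [ QUES - ] ].

; Non-branching rule that builds the interrogative clause.
; MC + restricts it to matrix position; drop that line to leave MC underspecified.
int-cl-phrase := interrogative-clause & head-only &
  [ SYNSEM.LOCAL.CAT [ VAL #val,
                       MC + ],
    HEAD-DTR.SYNSEM.LOCAL.CAT [ HEAD verb,
                                VAL #val & [ SUBJ < >,
                                             COMPS < > ],
                                MC na ] ].

; In rules.tdl:
int-cl := int-cl-phrase.
```

ques-lex-rule also needs an entry in irules.tdl that supplies the affix. If you instead keep verb-lex at [QUES -] and write a single rule that changes the value, that rule cannot be add-only and must copy up the remaining features itself.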

Interrogative matrix clauses: Markers on either end

This option is for languages that mark interrogatives with particles on either end of the clause (or alternatively, with intonation or [in signed languages] non-manual signs which extend the length of the clause and are represented in transcription with markers on either end). Rather than attach one of these markers before the other, the most straightforward thing appears to be to create a ternary rule. I've added some types supporting ternary rules to the matrix (included in the patch provided above).

We're going to take a construction-y approach to analysis, creating a phrase structure rule which calls for specific elements in two of the three daughters and does the right thing in the semantics itself. Specifically, do the following:

Create a lexical type which encodes the constraints in common to the left and right markers. If the same elements have some other function in the grammar, write the constraints accordingly. If they don't, make the lexical type a subtype of norm-zero-arg and make sure its valence lists, MOD value, RELS and HCONS lists are all empty. You should also give it a distinctive HEAD value (lest it show up as the argument of something else). For example, in the ASL case, there should probably be a new subtype of head for non-manual markers:
```
nmm := head.
```
Create subtypes of your lexical type for the left-hand and right-hand elements. These subtypes don't need additional constraints, just contrasting names.
Create lexical entries of each of the subtypes, with the appropriate STEM value. These elements don't need any KEYREL information.
Create a type yes-no-question-phrase which inherits from ternary-head-middle-phrase and interrogative-clause. These supertypes will give it question semantics while also making the HEAD, VAL and HOOK values come from the head daughter (the middle daughter). You may need to specify the value of other features inside CAT on the mother, either copying them from the head daughter or otherwise specifying them appropriately. If you do need to, you'll be able to tell because your grammar will be overgenerating.
You'll need to constrain the RELS and HCONS inside of C-CONT to be empty (i.e., <! !>).
Constrain the first and third elements of the ARGS list of yes-no-question-phrase to be the lexical types of your left and right markers, respectively.
Create an instance of yes-no-question-phrase in rules.tdl and test it!
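Put together, the steps above might look something like the sketch below. Only yes-no-question-phrase and the Matrix supertypes come from the instructions. The head type qmark, the lexical type names, the entry names and the STEM strings are placeholders, and the sketch assumes the markers have no other function in the grammar.

```tdl
; Distinctive HEAD value so the markers are not picked up as arguments elsewhere.
qmark := head.

; Constraints shared by the left and right markers.
qmark-lex-item := norm-zero-arg &
  [ SYNSEM.LOCAL [ CAT [ HEAD qmark & [ MOD < > ],
                         VAL [ SUBJ < >,
                               COMPS < >,
                               SPR < >,
                               SPEC < > ] ],
                   CONT [ RELS <! !>,
                          HCONS <! !> ] ] ].

; Subtypes differ only in name.
left-qmark-lex := qmark-lex-item.
right-qmark-lex := qmark-lex-item.

; The ternary construction: markers outside, head daughter in the middle.
yes-no-question-phrase := ternary-head-middle-phrase & interrogative-clause &
  [ C-CONT [ RELS <! !>,
             HCONS <! !> ],
    ARGS < left-qmark-lex, [ ], right-qmark-lex > ].

; In lexicon.tdl (placeholder spellings):
qleft := left-qmark-lex &
  [ STEM < "q-left" > ].
qright := right-qmark-lex &
  [ STEM < "q-right" > ].

; In rules.tdl:
yes-no-question := yes-no-question-phrase.
```

The middle daughter is left unconstrained here. If the rule overgenerates, that daughter (or the mother's other CAT features) is the place to add constraints.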

Clause embedding verbs

We will be using clausal complements as our example of embedded clauses. To do so, we need to create clause-embedding verbs. First, find examples of verbs that can embed propositions and verbs that can embed questions. If you also find verbs that are happy to embed either, we can make use of them. For inspiration, you can look here or here.

Create another subtype of verb-lex which inherits from clausal-second-arg-trans-lex-item (the type for verbs like think, ask, and know).
Constrain the CAT value of the complement appropriately (the feature specifications of S, CP or either depending on what's going on with your embedded clauses). The CONT.HOOK.INDEX.SF should also be constrained to prop-or-ques.
Create two subtypes, one which constrains the CONT.HOOK.INDEX.SF to proposition and one which constrains it to question.
Create lexical entries for verbs inheriting from one or the other subtypes (or the supertype) as appropriate.
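A hedged sketch of these four steps, using English words purely as stand-ins. The type names clausal-comp-verb-lex, prop-comp-verb-lex and ques-comp-verb-lex, the entry names and the PRED strings are invented for illustration. The sketch also assumes that the complement is a saturated verbal projection (S), that the subject is nominal, and that the SF values are spelled prop and ques in your matrix.tdl.

```tdl
; Supertype: takes a clausal complement whose SF is still compatible with either value.
clausal-comp-verb-lex := verb-lex & clausal-second-arg-trans-lex-item &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >,
                           COMPS < #comp >,
                           SPR < >,
                           SPEC < > ],
    ARG-ST < #subj & [ LOCAL.CAT.HEAD noun ],
             #comp & [ LOCAL [ CAT [ HEAD verb,
                                     VAL [ SUBJ < >,
                                           COMPS < > ] ],
                               CONT.HOOK.INDEX.SF prop-or-ques ] ] > ].

; Subtypes select for a particular clause type.
prop-comp-verb-lex := clausal-comp-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CONT.HOOK.INDEX.SF prop ] > ].

ques-comp-verb-lex := clausal-comp-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CONT.HOOK.INDEX.SF ques ] > ].

; In lexicon.tdl (illustrative entries only):
think := prop-comp-verb-lex &
  [ STEM < "think" >,
    SYNSEM.LKEYS.KEYREL.PRED "_think_v_rel" ].

ask := ques-comp-verb-lex &
  [ STEM < "ask" >,
    SYNSEM.LKEYS.KEYREL.PRED "_ask_v_rel" ].

know := clausal-comp-verb-lex &
  [ STEM < "know" >,
    SYNSEM.LKEYS.KEYREL.PRED "_know_v_rel" ].
```

If your embedded clauses are CPs, change the complement's HEAD value to comp (or to a supertype covering both verb and comp if either is allowed).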

If your matrix and embedded clauses look the same, you should be able to test this immediately. If not, you'll have to wait until you've implemented the syntax for your embedded clauses.

Complementizers

Some languages mark embedded clauses (declarative, interrogative or both) with complementizers (e.g., that and whether in English). To implement this, you'll need to do the following. (If your language also marks matrix questions with a question particle, you have some of the following in your grammar already.)

Create a type complementizer-lex-item which inherits from raise-sem-lex-item and basic-one-arg. It should identify the one thing on the ARG-ST with the sole item of its COMPS list and otherwise have empty valence features. Its HEAD value should be comp.
If you have both interrogative and declarative complementizers, make subtypes for each, constraining the CONT.HOOK.INDEX.SF on each of them to question and proposition respectively.
If you have just one kind of complementizer, you can put the SF constraint on complementizer-lex-item directly.
Create lexical entries for your complementizers. These do not need any KEYREL information.
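As an illustration, here is roughly what this could look like for a language with both kinds of complementizer, using English that and whether as stand-ins. The subtype names decl-comp-lex and ques-comp-lex are made up. Requiring the complement to be a saturated verbal projection is an added assumption, as are the SF value names prop and ques.

```tdl
complementizer-lex-item := raise-sem-lex-item & basic-one-arg &
  [ SYNSEM.LOCAL.CAT [ HEAD comp,
                       VAL [ SUBJ < >,
                             SPR < >,
                             SPEC < >,
                             COMPS < #comp > ] ],
    ; ASSUMPTION: the complement is a clause with nothing left to combine with.
    ARG-ST < #comp & [ LOCAL.CAT [ HEAD verb,
                                   VAL [ SUBJ < >,
                                         COMPS < > ] ] ] > ].

decl-comp-lex := complementizer-lex-item &
  [ SYNSEM.LOCAL.CONT.HOOK.INDEX.SF prop ].

ques-comp-lex := complementizer-lex-item &
  [ SYNSEM.LOCAL.CONT.HOOK.INDEX.SF ques ].

; In lexicon.tdl (no KEYREL needed):
that := decl-comp-lex &
  [ STEM < "that" > ].

whether := ques-comp-lex &
  [ STEM < "whether" > ].
```

Since raise-sem-lex-item shares the complementizer's HOOK with that of its complement, the SF constraint stated on the complementizer ends up on the event index of the embedded verb.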

Test your embedded clauses. Do they parse as expected? Can you still generate?

The feature MC

If your matrix and embedded clauses have different syntactic properties (e.g., presence v. absence of complementizers), you'll need to constrain things so that the embedded clause syntax only appears in embedded clauses and vice versa for matrix clause syntax. There are three resources for doing so:

The root condition
The selecting context for the embedded clauses (i.e., the embedding verb)
The feature MC (inside CAT)

If the difference is strictly S v. CP, you don't need the feature MC. Otherwise, you probably will need all three: The root condition will require [MC +], the embedding verb will require [MC -], and the constructions/lexical rules/etc which create the embedded and matrix clauses themselves should set appropriate values for MC.
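To make the three-way division of labor concrete, here is a small sketch. It assumes your root condition is a type called root in roots.tdl and that your clause-embedding verbs share a supertype, here given the hypothetical name clausal-comp-verb-lex. Adjust both to match your grammar.

```tdl
; roots.tdl: only [MC +] clauses count as complete utterances.
root := phrase &
  [ SYNSEM.LOCAL.CAT [ HEAD verb,
                       VAL [ SUBJ < >,
                             COMPS < > ],
                       MC + ] ].

; Addendum to an existing verb type (hypothetical name):
; the embedding verb insists on an [MC -] complement.
clausal-comp-verb-lex :+
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CAT.MC - ] > ].
```

The third piece, setting MC on the clauses themselves, lives in whichever phrase structure rules or lexical rules build your matrix and embedded clause syntax.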

Be sure your test suite contains negative examples illustrating matrix clause syntax in embedded clauses and vice versa.

Imperatives

One common way to create imperatives is to leave off the subject, perhaps in conjunction with a particular verb form. To handle this:

Create a lexical rule to create the appropriate form of the verb, if necessary. This lexical rule should put a distinctive value into the head feature FORM. (If you don't already have such a feature, create one). Lexical rules for contrasting verb forms should put a different value on that feature.
Create a type which inherits from imp-head-opt-subj-phrase, and constrain it to require the appropriate FORM value on its head daughter (if necessary).
This type should also constrain the HOOK.INDEX.PNG on its head daughter appropriately (to indicate 2nd person).
If you also have non-imperative sentences with dropped subjects, you'll want to change your existing head-opt-subj-rule. Create a subtype of decl-head-opt-subj-phrase, and have your instance inherit from that. The subtype should put any required constraints on the FORM value.
If the verbs in the imperative form can't otherwise appear in matrix clauses, you'll want the lexical rule to mark them [MC na] or [MC -].
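A rough sketch of the pieces above, with several assumptions flagged. The value names fin, imp and second, the rule and phrase names, and the supertype add-only-no-ccont-rule are all hypothetical or guessed. Your grammar may already define FORM and person values under other names. The 2nd person constraint is stated on the index of the unexpressed subject, which is one reading of the instruction above. Adjust the path if your grammar records this information elsewhere.

```tdl
; FORM feature and values, only if your grammar does not already have them.
verb :+ [ FORM form ].
form := *top*.
fin := form.
imp := form.

; Lexical rule for the imperative verb form.
; [MC na] keeps imperative-form verbs out of ordinary matrix clauses.
imp-form-lex-rule := infl-ltow-rule & add-only-no-ccont-rule &
  [ SYNSEM.LOCAL.CAT [ HEAD verb & [ FORM imp ],
                       MC na ] ].

; Imperative clause: subject dropped, understood as 2nd person.
imp-clause-phrase := imp-head-opt-subj-phrase &
  [ HEAD-DTR.SYNSEM.LOCAL.CAT
      [ HEAD.FORM imp,
        VAL.SUBJ < [ LOCAL.CONT.HOOK.INDEX.PNG.PER second ] > ] ].

; Declarative subject drop, restricted to finite verbs.
decl-opt-subj-phrase := decl-head-opt-subj-phrase &
  [ HEAD-DTR.SYNSEM.LOCAL.CAT.HEAD.FORM fin ].

; In rules.tdl:
imp-clause := imp-clause-phrase.
head-opt-subj := decl-opt-subj-phrase.
```

The lexical rule also needs an irules.tdl entry if the imperative form carries an affix. If it has no overt marking, use const-ltow-rule instead and put the instance in lrules.tdl.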

If your language marks imperatives with some sort of particle, see if you can treat it as a kind of a complementizer (see instructions on complementizers above).

If your language does something else, talk to me :)

Test your grammar

Use your test suite to check the syntactic coverage of your grammar.
Examine the semantic representations you assign to each of the clause types, and compare them to the examples given in the lab instructions.
Check for overgeneration (syntactic forms associated with one clause type showing up in other clause types, multiple parses for single sentences with spurious clause type assignments or lack of clausal semantics).
Make sure your grammar still generates.

Write up

Describe the syntactic properties of embedded and matrix interrogative clauses and matrix imperatives in your language. Illustrate your points with glossed examples from your test suite in the format that your grammar expects so I can try them out.
Describe the current coverage of your grammar with respect to those properties.
Describe how you handled the syntactic properties of the various clauses (or attempted to handle them) and got the right correlation between syntax and semantics (or attempted to do so).
If your grammar already had matrix yes-no questions working properly from the customization script, describe how the tdl implements these questions.

Submit via ESubmit

Be sure your matrix folder includes your write-up.
Be sure your matrix folder includes a tsdb/home directory with your initial and final test suite runs for this lab (and preferably nothing else, so I can easily find these).
Consider removing the doc/ subdirectory in order to save space on E-Submit.
Compress the folder, and upload it to ESubmit.
Submit it by midnight Sunday night (preferably by Friday evening :-).
