Linguistics 567: Knowledge Engineering for NLP
Lab 7 Due 2/18
Background
This lab has four goals:
- Handle the syntactic differences between matrix and
embedded polar questions and between polar questions
and declaratives.
- Implement the syntactic pattern of imperatives.
- Map the right clausal semantics to each syntactic pattern.
- Implement semantic selection, by verbs that take clausal complements,
for the clause type of those complements.
Run a baseline test suite
Before making any changes to your grammar for this lab,
run a baseline test suite instance. If you decide to add
items to your test suite for the material covered here, consider
doing so before modifying your grammar so that your baseline can
include those examples. (Alternatively, if you add examples
in the course of working on your grammar and want to make the
snapshot later, you can do so using the grammar you turned
in for Lab 6.)
I have made some updates to the lexical rule types in matrix.tdl.
Before you start this week's lab:
- Download the new matrix.tdl and put
it in place of your existing copy of this file.
- Load in the grammar and fix any errors. (Note that some of
the type names changed, so you may get some errors from your existing
lexical rules.)
- Run your test suite to see if coverage/overgeneration were affected.
If so, post to EPost and we'll work out a fix.
- Parse a couple of sentences interactively and check whether you
have any -M node labels corresponding to lexical rules. If so, this
means that you are not inheriting all of the constraints on the
lexical rules that you should, and in particular, that you are not
inheriting ultimately from one of infl-ltol-rule,
const-ltol-rule, infl-ltow-rule or
const-ltow-rule. Try to fix this.
Semantic representations
The semantics for declarative and interrogative clauses will
be the same except for the value of the feature SF (sentential
force) on the event index of the main predicate.
Embedded clauses should have their local top handle related
to an ARGn position in the embedding verb's relation through a qeq.
EB TODO Add example here
Matrix yes-no questions
For some of you, the customization script provided the
right kind of matrix yes-no questions. Examine the behavior of
your grammar on relevant sentences to discover whether this is so.
Note that if you have no syntactic difference between matrix
yes-no questions and matrix declaratives, your grammar should
be giving all such clauses the SF value 'prop-or-ques'.
If your language's strategy for matrix yes-no questions was
not covered by the customization script, you'll need to add it
now. Here are some descriptions of how to handle strategies that I'm
aware of but haven't yet put into the customization script.
Interrogative matrix clauses: Verbal inflection
If your language marks matrix interrogatives with inflection on the
verb, try to determine whether that inflection unambiguously
marks interrogatives. If so, create a lexical rule (inflectional,
add-only, no-ccont, ltol or ltow according to how it fits in with the
rest of your rules) which adds the inflection and constrains the
INDEX.SF of the verb to question.
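A rule of this kind might look roughly like the following. This is only a sketch under several assumptions: the names ques-infl-lex-rule and ques-suffix are made up, the supertype name is a guess at how the inflectional, add-only, no-ccont, ltol type is spelled in your copy of matrix.tdl, ques is assumed to be the type name for "question" (a subtype of prop-or-ques), and the suffix -ka is a placeholder. Check each of these against your own files.

```tdl
; In your grammar's type file.
; Supertype name is an assumption; use the inflectional, add-only,
; no-ccont type (ltol or ltow) that matrix.tdl actually provides.
ques-infl-lex-rule := infl-add-only-no-ccont-ltol-rule &
  [ SYNSEM.LOCAL.CONT.HOOK.INDEX.SF ques,   ; assumed spelling of "question"
    DTR.SYNSEM.LOCAL.CAT.HEAD verb ].

; In irules.tdl. "ka" is a placeholder for your language's affix.
ques-suffix :=
%suffix (* ka)
ques-infl-lex-rule.
```

Because the rule is add-only, the SF constraint is simply unified into the verb's existing semantics; nothing else on the verb needs to be restated.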
If, on the other hand, that same inflectional marker is also used
in other constructions, you may wish to pursue this alternative
treatment:
- Create a feature on the type verb, say QUES,
with possible values + and - (bool).
- Make verb-lex constrain the value of HEAD.QUES
to -.
- Write a lexical rule which adds the inflection and changes
the value of HEAD.QUES to +. (Alternatively,
consider leaving verb-lex unspecified, and writing a
pair of rules, one which fills in + and one which
fills in -). Be sure your lexical rule copies up
the values of all other HEAD features. This lexical
rule should also constrain the mother's MC value to na.
- Make sure your new lexical rule interacts properly with existing
lexical rules. (E.g., If there are lexical rules which apply
after this one, make sure they are copying up the MC value.)
- Create a non-branching phrase structure rule which inherits from
interrogative-clause and head-only. This rule should
copy up the VAL information and require that its daughter
have empty SUBJ and COMPS lists. It should also
require its daughter to be [HEAD verb] and [MC na].
The mother's own value for MC should be + or underspecified, depending on whether or not such clauses are restricted to matrix position.
- Check that the root condition is also constraining MC
appropriately.
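The feature declaration and the non-branching rule from the steps above could be sketched as follows. QUES is the feature name suggested in the text; int-cl is a hypothetical name, and the paths assume that MC and VAL sit inside CAT as elsewhere in the Matrix. The lexical rule itself is left out, since which HEAD features it has to copy up depends on your grammar.

```tdl
; Declare the new feature on verbs (type addendum).
verb :+ [ QUES bool ].

; In your existing definition of verb-lex, add the constraint:
;   SYNSEM.LOCAL.CAT.HEAD.QUES -

; Non-branching rule that builds the interrogative clause.
; int-cl is a hypothetical name.
int-cl := interrogative-clause & head-only &
  [ SYNSEM.LOCAL.CAT [ VAL #val,
                       MC + ],   ; omit if these clauses can also be embedded
    HEAD-DTR.SYNSEM.LOCAL.CAT [ HEAD verb,
                                VAL #val & [ SUBJ < >,
                                             COMPS < > ],
                                MC na ] ].
```

An instance of the rule then goes in rules.tdl, e.g. `int := int-cl.` (instance name again hypothetical).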
Interrogative matrix clauses: Markers on either end
This option is for languages that mark interrogatives with
particles on either end of the clause (or alternatively, with
intonation or [in signed languages] non-manual signs which extend the
length of the clause and are represented in transcription with markers
on either end). Rather than attach one of these markers before the
other, the most straightforward thing appears to be to create a
ternary rule. I've added some types supporting ternary rules to the
matrix (included in the patch provided above).
We're going to take a construction-y approach to analysis, creating
a phrase structure rule which calls for specific elements in two
of the three daughters and does the right thing in the semantics itself.
Specifically, do the following:
- Create a lexical type which encodes the constraints in common to
the left and right markers. If the same elements have some other
function in the grammar, write the constraints accordingly. If they
don't, make the lexical type a subtype of norm-zero-arg and
make sure its valence lists, MOD value, RELS and
HCONS lists are all empty. You should also give it a distinctive
HEAD value (lest it show up as the argument of something else).
For example, in the ASL case, there should probably be a new subtype
of head for non-manual markers:
nmm := head.
- Create subtypes of your lexical type for the left-hand and
right-hand elements. These subtypes don't need additional constraints,
just contrasting names.
- Create lexical entries of each of the subtypes, with the
appropriate STEM value. These elements don't need any
KEYREL information.
- Create a type yes-no-question-phrase which inherits from
ternary-head-middle-phrase and interrogative-clause. These
supertypes will give it question semantics while also making the
HEAD, VAL and HOOK values come from the
head daughter (the middle daughter). You may need to specify the
value of other features inside CAT on the mother, either
copying them from the head daughter or otherwise specifying them appropriately.
If you do need to, you'll be able to tell because your grammar will be
overgenerating.
- You'll need to constrain the RELS and HCONS inside
of C-CONT to be empty (i.e., <! !>).
- Constrain the first and third elements of the ARGS list of
yes-no-question-phrase to be the lexical types of your left and
right markers, respectively.
- Create an instance of yes-no-question-phrase in
rules.tdl and test it!
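Putting the steps above together, the types might look something like this. It is a sketch, not a tested analysis: qmarker-lex, left-qmarker-lex, right-qmarker-lex and the instance names are hypothetical, the STEM strings are placeholders, and the sketch assumes the markers have no other function in the grammar and that nmm has been defined as shown earlier.

```tdl
; Constraints shared by both markers (hypothetical type names).
qmarker-lex := norm-zero-arg &
  [ SYNSEM.LOCAL [ CAT [ HEAD nmm & [ MOD < > ],
                         VAL [ SUBJ < >,
                               COMPS < >,
                               SPR < >,
                               SPEC < > ] ],
                   CONT [ RELS <! !>,
                          HCONS <! !> ] ] ].

; Contrasting names only, no further constraints.
left-qmarker-lex := qmarker-lex.
right-qmarker-lex := qmarker-lex.

; The construction: markers outside, head daughter in the middle.
yes-no-question-phrase := ternary-head-middle-phrase & interrogative-clause &
  [ C-CONT [ RELS <! !>,
             HCONS <! !> ],
    ARGS < left-qmarker-lex, [ ], right-qmarker-lex > ].

; lexicon.tdl (STEM values are placeholders)
q-left := left-qmarker-lex & [ STEM < "q-left" > ].
q-right := right-qmarker-lex & [ STEM < "q-right" > ].

; rules.tdl
yes-no-question := yes-no-question-phrase.
```

If the grammar overgenerates with this rule, the first place to look is the CAT features on the mother that the supertypes leave unspecified.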
Clause embedding verbs
We will be using clausal complements as our example of embedded
clauses. To do so, we need to create clause-embedding verbs. First,
find examples of verbs that can embed propositions and verbs that can
embed questions. If you also find verbs that are happy to embed
either, we can make use of them. For inspiration, you can look here
or here.
- Create another subtype of verb-lex which inherits
from clausal-second-arg-trans-lex-item (the type for
verbs like think, ask, and know).
- Constrain the CAT value of the complement appropriately
(the feature specifications of S, CP or either depending on what's
going on with your embedded clauses). The CONT.HOOK.INDEX.SF
should also be constrained to prop-or-ques.
- Create two subtypes, one which constrains the CONT.HOOK.INDEX.SF
to proposition and one which constrains it to question.
- Create lexical entries for verbs inheriting from one or the
other subtypes (or the supertype) as appropriate.
If your matrix and embedded clauses look the same, you should be
able to test this immediately. If not, you'll have to wait until
you've implemented the syntax for your embedded clauses.
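As a rough illustration of the verb types described above, assuming that your verb-lex already links the first ARG-ST element to SUBJ, that embedded clauses are saturated verbal projections, and that prop and ques are the type names for "proposition" and "question" (all three are assumptions to check against your grammar). The type names clausal-verb-lex, prop-embedding-verb-lex and ques-embedding-verb-lex are hypothetical.

```tdl
; Supertype: complement may be a proposition or a question.
clausal-verb-lex := verb-lex & clausal-second-arg-trans-lex-item &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < #comps >,
    ARG-ST < [ LOCAL.CAT.HEAD noun ],
             #comps &
             [ LOCAL [ CAT [ HEAD verb,        ; or comp, or either
                             VAL [ SUBJ < >,
                                   COMPS < > ] ],
                       CONT.HOOK.INDEX.SF prop-or-ques ] ] > ].

; Verbs that embed only propositions (like "think").
prop-embedding-verb-lex := clausal-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CONT.HOOK.INDEX.SF prop ] > ].

; Verbs that embed only questions (like "ask").
ques-embedding-verb-lex := clausal-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CONT.HOOK.INDEX.SF ques ] > ].
```

Verbs that accept either clause type (like "know") can inherit directly from the supertype.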
Complementizers
Some languages mark embedded clauses (declarative, interrogative
or both) with complementizers (e.g., that and whether in
English). To implement this, you'll need to do the following. (If
your language also marks matrix questions with a question particle,
you have some of the following in your grammar already.)
- Create a type complementizer-lex-item which inherits
from raise-sem-lex-item and basic-one-arg. It should
identify the one thing on the ARG-ST with the sole item of
its COMPS list and otherwise have empty valence features.
Its HEAD value should be comp.
- If you have both interrogative and declarative complementizers,
make subtypes for each, constraining the CONT.HOOK.INDEX.SF
on each of them to question and proposition respectively.
- If you have just one kind of complementizer, you can put the SF
constraint on complementizer-lex-item directly.
- Create lexical entries for your complementizers. These do not
need any KEYREL information.
Test your embedded clauses. Do they parse as expected? Can you
still generate?
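The complementizer types described above might be sketched like this, using the English examples from the text for the lexical entries. The subtype names are hypothetical, prop and ques are assumed to be the type names for "proposition" and "question", and any constraints on the complement itself (for instance that it is a saturated verbal projection) are left for you to add.

```tdl
complementizer-lex-item := raise-sem-lex-item & basic-one-arg &
  [ SYNSEM.LOCAL.CAT [ HEAD comp,
                       VAL [ SUBJ < >,
                             SPR < >,
                             SPEC < >,
                             COMPS < #comp > ] ],
    ARG-ST < #comp > ].

; Hypothetical subtype names.
decl-comp-lex-item := complementizer-lex-item &
  [ SYNSEM.LOCAL.CONT.HOOK.INDEX.SF prop ].

int-comp-lex-item := complementizer-lex-item &
  [ SYNSEM.LOCAL.CONT.HOOK.INDEX.SF ques ].

; lexicon.tdl: no KEYREL needed.
that := decl-comp-lex-item & [ STEM < "that" > ].
whether := int-comp-lex-item & [ STEM < "whether" > ].
```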
The feature MC
If your matrix and embedded clauses have different syntactic properties
(e.g., presence vs. absence of complementizers), you'll need to constrain
things so that the embedded clause syntax only appears in embedded clauses
and vice versa for matrix clause syntax. There are three resources
for doing so:
- The root condition
- The selecting context for the embedded clauses (i.e., the embedding
verb)
- The feature MC (inside CAT)
If the difference is strictly S vs. CP, you don't need the feature
MC. Otherwise, you probably will need all three: The root
condition will require [MC +], the embedding verb will
require [MC -], and the constructions/lexical rules/etc which
create the embedded and matrix clauses themselves should set
appropriate values for MC.
Be sure your test suite contains negative examples illustrating
matrix clause syntax in embedded clauses and vice versa.
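Two of the three pieces can be sketched as below. The shape of the root condition is an assumption (keep whatever else yours already says and just add the MC constraint), and clausal-verb-lex is a hypothetical name for whatever type your clause-embedding verbs inherit from.

```tdl
; roots.tdl: only matrix clauses can be root.
root := phrase &
  [ SYNSEM.LOCAL.CAT [ HEAD verb,
                       VAL [ SUBJ < >,
                             COMPS < > ],
                       MC + ] ].

; Embedding verbs select for non-matrix clauses (type addendum on a
; hypothetical type name).
clausal-verb-lex :+
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CAT.MC - ] > ].
```

The third piece, setting MC on the clauses themselves, belongs on whichever rule or lexical type builds each clause type in your grammar.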
Imperatives
One common way to create imperatives is to leave off the subject, perhaps
in conjunction with a particular verb form. To handle this:
- Create a lexical rule to create the appropriate form of the verb,
if necessary. This lexical rule should put a distinctive value into
the head feature FORM. (If you don't already have such a feature,
create one.) Lexical rules for contrasting verb forms should put a different
value on that feature.
- Create a type which inherits from imp-head-opt-subj-phrase,
and constrain it to require the appropriate FORM value on its
head daughter (if necessary).
- This type should also constrain the HOOK.INDEX.PNG on its
head daughter appropriately (to indicate 2nd person).
- If you also have non-imperative sentences with dropped subjects, you'll
want to change your existing head-opt-subj-rule. Create a subtype of
decl-head-opt-subj-phrase, and have your instance inherit from that.
The subtype should put any required constraints on the FORM value.
- If the verbs in the imperative form can't otherwise appear in matrix
clauses, you'll want the lexical rule to mark them [MC na] or
[MC -].
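The phrase types from the steps above could be sketched as follows. Several things here are assumptions: the type names imp-phrase and decl-opt-subj-phrase, the FORM values imp and fin, and the person value 2nd are all hypothetical, and the sketch takes the relevant PNG to be that of the unexpressed subject, reached through the head daughter's SUBJ list. Adjust the path and values to match how person is encoded in your grammar.

```tdl
; Skip this addendum if verbs already carry FORM in your grammar.
verb :+ [ FORM form ].

; Imperative clause: subject dropped, verb in the imperative form,
; dropped subject understood as 2nd person.
imp-phrase := imp-head-opt-subj-phrase &
  [ HEAD-DTR.SYNSEM.LOCAL.CAT
      [ HEAD.FORM imp,
        VAL.SUBJ < [ LOCAL.CONT.HOOK.INDEX.PNG.PER 2nd ] > ] ].

; Ordinary subject drop, restricted to non-imperative verb forms.
decl-opt-subj-phrase := decl-head-opt-subj-phrase &
  [ HEAD-DTR.SYNSEM.LOCAL.CAT.HEAD.FORM fin ].
```

Both types need instances in rules.tdl, and the existing head-opt-subj instance should be switched to the declarative subtype so the two rules do not compete on imperative verb forms.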
If your language marks imperatives with some sort of particle,
see if you can treat it as a kind of complementizer (see instructions
on complementizers above).
If your language does something else, talk to me :)
Test your grammar
- Use your test suite to check the syntactic coverage of your grammar.
- Examine the semantic representations you assign to each of
the clause types, and compare them to the examples given in the
lab instructions.
- Check for overgeneration (syntactic forms associated with
one clause type showing up in other clause types, multiple parses
for single sentences with spurious clause type assignments or
lack of clausal semantics).
- Make sure your grammar still generates.
Write up
- Describe the syntactic properties of embedded and matrix
interrogative clauses and matrix imperatives in your language.
Illustrate your points with glossed examples from your test suite
in the format that your grammar expects so I can try them out.
- Describe the current coverage of your grammar with
respect to those properties.
- Describe how you handled the syntactic properties
of the various clauses (or attempted to handle them) and
got the right correlation between syntax and semantics (or
attempted to do so).
- If your grammar already had matrix yes-no questions working
properly from the customization script, describe how the tdl implements
these questions.
Submit via ESubmit
- Be sure your matrix folder includes your write-up.
- Be sure your matrix folder includes a tsdb/home directory with
your initial and final test suite runs for this lab (and preferably
nothing else, so I can easily find these).
- Consider removing the doc/ subdirectory in order to save
space on ESubmit.
- Compress the folder, and upload it to ESubmit.
- Submit it by midnight Sunday night (preferably by Friday evening :-).