Linguistics 471: Grammar Engineering
Lab 2 Due 4/18
Read all the way through the assignment once before
starting it. Also note that you'll be asked to turn in three
small write-ups this time. They're mentioned in the assignment, of
course, and I remind you again in the instructions under "Submit
via ESubmit".
Background
- Semantic information is handled inside the feature SYNSEM.LOCAL.CONT.
The value of CONT is a feature structure of type mrs, defined
as follows:
mrs := mrs-min &
  [ HOOK hook,
    RELS diff-list,
    HCONS diff-list,
    MSG basic_message ].
- The value of RELS is a list of elementary predications.
- The value of HOOK is a feature structure of type hook encoding the information that is available for further
semantic composition.
- The value of HCONS is a list of handle constraints (to represent
scope -- don't worry about the details for now!)
- The value of MSG is a representation of illocutionary force,
but we won't be addressing that in this lab.
- In addition, the feature SYNSEM.LKEYS.KEYREL (on words only, not phrases)
provides a pointer to the main relation contributed by the word.
This feature serves as a shortcut for defining lexical entries. The
type norm-hook-lex-item (defined in matrix.tdl) provides
the link between LKEYS.KEYREL and SYNSEM.LOCAL.CONT.RELS.
- Relations (the things on the RELS list) come in different
types, all defined in matrix.tdl.
- Two example relations (NB: the following are not type definitions):
[ PRED '_cat_n_rel
  ARG0 x
  LBL h1 ]
[ PRED '_chase_v_rel
  ARG0 e
  ARG1 x
  ARG2 y
  LBL h2 ]
- The value of PRED is a unique predicate name. It can be a string
as in these examples (indicated by the ') or a subtype of
predsort (which will be useful for underspecification in some
cases). The value of LBL is a handle (for handle constraints, i.e.,
scope -- again, don't worry about this for now). The values of ARG0
through ARGn (in practice, up to ARG4) are the arguments to a
relation. For nouns, ARG0 is the index of the thing the noun denotes
(here, the cat). For verbs, ARG0 is an index for the event (here, the
chasing), and ARG1 and others are the participants in that event.
- The types basic-verb-lex et al. constrain the KEYREL to be a
relation of the right type, and furthermore relate the HOOK values of
things on the ARG-S list to the ARGn roles in the relation. All you
need to provide is the PRED value.
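As a rough illustration of the KEYREL shortcut described above, the structure below sketches what the link amounts to for a noun like the cat example: the relation reached through LKEYS.KEYREL is the same object as the one on the RELS list. This is a simplified picture, not the actual text of norm-hook-lex-item in matrix.tdl, and (like the example relations above) it is not a type definition.

```tdl
; Sketch only: the tag #keyrel marks one relation reached by two paths.
[ SYNSEM [ LKEYS.KEYREL #keyrel & [ PRED '_cat_n_rel ],
           LOCAL.CONT.RELS <! #keyrel !> ] ]
```

Under this picture, a lexical entry that sets KEYREL.PRED is also setting the PRED of its relation on RELS.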
Tasks
Get an updated copy of matrix.tdl
Emily has fixed another bug in the Matrix. Download
a fresh copy here: matrix.tdl.
(Without this bug fix, the covert-det rule in this assignment
won't work properly.)
Take a look at the current state of affairs
- Start the LKB and load your grammar.
- Parse a sentence.
- Click on the small tree that shows up, and select "MRS".
- Note the lack of relation names in this structure.
- Click on the small tree again, and explore the options
under MRS. (They may not all work.)
Add relation names to your noun and verb lexical entries
- In order to make the machine translation exercise work
at the end of the quarter, we want to use identical PRED values
across all the different grammars. So, we will adopt the following
convention:
- Your lexical entries presently look something like this:
gato := noun-lex &
  [ STEM < "gato" > ].
- For each one, you can add a relation name as follows:
gato := noun-lex &
  [ STEM < "gato" >,
    SYNSEM.LKEYS.KEYREL.PRED '_cat_n_rel ].
- NB: Don't forget the ' at the beginning of the pred name.
That tells the LKB it's a string and not a type.
- Now parse a sentence again and look at its MRS and admire
the relation names that have appeared there.
- The various views on the MRS available through the menu
on the small tree are generated by the LKB on the basis of the feature
structure. To see the information in the feature structure itself:
- Click on the small tree, and select "Show enlarged tree".
- Click on the S at the top of the tree, and select "Feature structure".
- Scroll the window until you can see the value of the path
SYNSEM.LOCAL.CONT.
- Appreciate the easier-to-read views provided by the LKB!
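The same pattern should carry over to verb entries. The entry below is a minimal sketch: the identifier chase, the type name verb-lex, and the stem are assumptions standing in for whatever your grammar actually uses, and the PRED value is the chase relation from the Background section.

```tdl
; Hypothetical verb entry: replace the identifier, type, and stem with your own.
chase := verb-lex &
  [ STEM < "chase" >,
    SYNSEM.LKEYS.KEYREL.PRED '_chase_v_rel ].
```

As with nouns, the leading ' marks the PRED value as a string rather than a type.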
Add types for relation names for determiners
In order to support our goal of machine translation, we need to
allow for underspecified determiners in some languages. That is, for
a language that doesn't necessarily mark definite vs. indefinite
determiners, we'd like the quantifier relation to be compatible with
either definite or indefinite. To achieve this, we will write a small
subhierarchy under predsort. Likewise, some languages have a
three-way distinction among demonstratives, while others have only a
two-way distinction.
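To make the idea of a subhierarchy under predsort concrete, here is an illustrative sketch of how such types might be arranged. It is not the hierarchy used in the course: only quantifier_rel, def_q_rel, and distal+dem_q_rel are names that appear in this lab, and every other type name, as well as the choice to place demonstratives under the definite type, is an assumption made for the example.

```tdl
; Illustrative sketch only, not the course's type definitions.
quantifier_rel := predsort.               ; compatible with every type below
def_q_rel := quantifier_rel.
indef_q_rel := quantifier_rel.            ; hypothetical name
dem_q_rel := def_q_rel.                   ; hypothetical: any demonstrative
proximal+dem_q_rel := dem_q_rel.          ; hypothetical name
distal+dem_q_rel := dem_q_rel.
```

In an arrangement like this, a determiner that does not mark definiteness could take the supertype as its PRED value, since a supertype unifies with any of its subtypes.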
Add PRED values to your determiner lexical entries
- If you don't already have determiners:
- If you already have some determiners:
- Add entries for any new determiners you found
in the prep phase of this lab.
- Add PRED values to your determiner lexical entries, choosing from
the types defined above. Note that you may choose a maximally
specific type from the hierarchy, or an underspecified type, as
appropriate. Furthermore, within one language, multiple determiners
which mean the same thing but are distinguished in terms of agreement
will have the same PRED value. For example, three of the entries
needed for English would look like this:
the := determiner-lex &
  [ STEM < "the" >,
    SYNSEM.LKEYS.KEYREL.PRED def_q_rel ].
those := determiner-lex &
  [ STEM < "those" >,
    SYNSEM.LKEYS.KEYREL.PRED distal+dem_q_rel ].
that := determiner-lex &
  [ STEM < "that" >,
    SYNSEM.LKEYS.KEYREL.PRED distal+dem_q_rel ].
Note that in this case the PRED values don't start with ', since
we've defined types.
- Parse a sentence again, and admire the MRS.
- Write-up #1: Write up
(in a couple of paragraphs) what you've done for this part, i.e.,
which determiners your grammar has, which PRED values you gave them,
and why.
Add a rule for determinerless NPs
One of the requirements on well-formed MRSs is that
each ARG0 of a noun-relation be bound by a quantifier
(i.e., also be the ARG0 of a quant-relation). If your noun
phrases contain overt determiners (and you're using the
basic-determiner-lex type provided by the Matrix), this
is already the case. For noun phrases that don't contain
overt determiners, we'll need to add a non-branching rule
which fills in the appropriate semantics.
- Add the following type definition to esperanto.tdl:
(Hint: try copy & paste, either from your internet browser, or by
downloading the following file and copying and pasting from it: lab2.txt.)
covert-det-phrase := head-only &
  [ SYNSEM.LOCAL.CAT.VAL [ SPR < >,
                           SUBJ < >,
                           COMPS < >,
                           SPEC < > ],
    HEAD-DTR.SYNSEM.LOCAL [ CAT.VAL [ SPR < [ LOCAL.CAT.HEAD det ] >,
                                      SUBJ < >,
                                      COMPS < > ],
                            CONT.HOOK [ INDEX #index,
                                        LTOP #larg ] ],
    C-CONT [ RELS <! quant-relation &
                     [ LBL #ltop,
                       ARG0 #index,
                       RSTR #harg ] !>,
             HCONS <! qeq &
                      [ HARG #harg,
                        LARG #larg ] !>,
             HOOK [ INDEX #index,
                    LTOP #ltop ] ] ].
Notes: C-CONT is the semantic information contributed by the
phrase structure rule itself, here a quantifier relation and
the associated handle constraint. The head (and only) daughter
is something looking for a determiner specifier. The mother
has no valence requirements left.
- Choose an appropriate type from the quantifier_rel
hierarchy, and make it the value of PRED inside the quant-relation
inside C-CONT in covert-det-phrase.
- Define an instance of covert-det-phrase in rules.tdl.
- Test your new rule by parsing a sentence without a determiner.
Check the MRS to make sure the PRED value showed up where you
expected it.
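The last two steps before testing might look roughly like the sketch below. The instance name covert-det is an assumption (any name not already used in rules.tdl should do), and def_q_rel appears only as a placeholder for whichever type you actually choose.

```tdl
; In the covert-det-phrase type definition, the PRED value goes on the
; quantifier relation inside C-CONT. def_q_rel is a placeholder choice:
;   RELS <! quant-relation & [ PRED def_q_rel, LBL #ltop, ARG0 #index, RSTR #harg ] !>

; In rules.tdl, an instance of the rule (hypothetical instance name):
covert-det := covert-det-phrase.
```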
In many languages, determiners are optional only with some kinds of
nouns, and non-optional with others. If determiners are always
optional in your language, that is, if any given noun can appear
without a determiner, then you don't need to do this part. Read it
anyway though :-). The general strategy is going to be to define
two types of nouns, one with optional determiners and one with
obligatory determiners. We will indicate optionality with a
feature OPT (appropriate for objects of type synsem and therefore
found at the path SYNSEM.OPT). Nouns which require determiners will
say that the element of their SPR list is [OPT -]. Nouns which
can optionally appear without determiners won't say anything about
OPT. The covert-det rule will say that the SPR requirement of its
head daughter is [OPT +]. This will be incompatible with those nouns
that say [OPT -] but compatible (of course) with those that don't
mention OPT at all.
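The strategy just described could be sketched as follows. The type name obl-det-noun-lex is hypothetical, and noun-lex is assumed to be your grammar's existing noun type. Nouns with optional determiners would simply keep using a type that says nothing about OPT.

```tdl
; Hypothetical subtype for nouns that require an overt determiner.
obl-det-noun-lex := noun-lex &
  [ SYNSEM.LOCAL.CAT.VAL.SPR < [ OPT - ] > ].

; In covert-det-phrase, the head daughter's specifier would then read:
;   SPR < [ LOCAL.CAT.HEAD det, OPT + ] >
```

Because [OPT -] and [OPT +] cannot unify, nouns of the first type should fail to undergo the covert-det rule while still combining with overt determiners.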
(Nouns like proper names and pronouns which generally can't
take determiners, i.e., must undergo the covert-det rule, need to
have a SPR requirement which is incompatible with any overt determiner,
but still compatible with the covert-det rule, or perhaps a special
covert-det rule just for proper names and pronouns. If your only
case of determinerless NPs is this one, talk to me.)
Try generating from the semantic representation
Submit via ESubmit
- Be sure your matrix folder includes three write-ups (1 2 3).
- Compress the folder, and upload it to ESubmit.
- Submit it by midnight Sunday night (preferably by Friday evening :-).