The goal of this lab is to add noun phrase, verb phrase, adjective phrase, and sentence coordination to your grammar, and to be able to parse and generate sentences containing coordinated structures. You will be provided with a set of general coordination rules from which you will derive rules specific to your language. (The coordination scaffolding I'll be providing is a work in progress, so it's possible you'll find that something doesn't work right at first. If this happens, please let me know what goes wrong (and how you fixed it, if you did) so that what finally goes into the Matrix is as solid as possible.)
There are many different ways coordination can be marked in your language, including a conjunction like English "and", a suffix or prefix, or possibly no marking at all (juxtaposition). In addition, your language might mark coordination differently on different phrase types -- for example, it may use a special verb inflection to mark VP coordination, but juxtaposition for noun phrases. You'll need to collect the coordination facts about your language before you come to the lab. Note that you only need the facts about coordination strategies that mean something like "and" -- we won't be handling "or", "but", "then", etc.
Caveat: Some languages don't seem to have a coordination structure that's a single constituent, instead using an adjunct marked by an adposition or affix meaning "with". If you have such a language, for the purposes of this lab you'll be pretending it does have a balanced coordination strategy so that you have something to work on.
[NP-top [NP a fish ] [NP-mid [NP a barrel ] [NP-bottom and [NP a smoking gun ]]]]
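The bracketing above shows the shape the coordination rules build: a bottom rule attaches the coordinator to the last coordinand, a mid rule adds each medial coordinand, and a top rule adds the first. As a rough illustration only (this is not part of the grammar), here is a small Python sketch that builds the same bracketing for any number of coordinands. The function name `bracket_coordination` and its parameters are made up for this illustration, and it assumes an English-like pattern with a single coordinator before the final coordinand.

```python
def bracket_coordination(coordinands, conj="and", cat="NP"):
    """Return a right-branching bracketing for a coordination.

    Assumption (illustration only): one coordinator, placed before
    the last coordinand, as in English "A B and C".
    """
    if len(coordinands) < 2:
        raise ValueError("coordination needs at least two coordinands")

    # Bottom: the coordinator combines with the last coordinand.
    phrase = f"[{cat}-bottom {conj} [{cat} {coordinands[-1]} ]]"

    # Mid: each medial coordinand wraps what has been built so far,
    # working from right to left.
    for item in reversed(coordinands[1:-1]):
        phrase = f"[{cat}-mid [{cat} {item} ] {phrase}]"

    # Top: the first coordinand closes off the coordinated phrase.
    return f"[{cat}-top [{cat} {coordinands[0]} ] {phrase}]"


print(bracket_coordination(["a fish", "a barrel", "a smoking gun"]))
# [NP-top [NP a fish ] [NP-mid [NP a barrel ] [NP-bottom and [NP a smoking gun ]]]]
```

With only two coordinands the loop body never runs, which mirrors the fact that a two-way coordination needs no mid rule.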
The following are some sample semantic representations for each phrase type you'll be working on. They've been slightly abbreviated, and there's some wiggle room -- your semantics may have different quantifier relations, for example.
Noun phrase coordination: "dogs and cats leave"
<h1,e2, {h3:_dog_n(x4), h5:indef_q(x4,h6,h7), h8:_and_coord(x9,h11,x4,h12,x10), h13:_cat_n(x10), h14:indef_q(x10,h15,h16), h17:indef_q(x9,h8,h18), h1:_leave_v(e2,x9)}, {h6 qeq h3, h15 qeq h13}>
Adjective phrase coordination: "red and blue dogs leave"
<h1,e2, {h3:_red_adj(e4,x5), h6:_and_coord(e8,h3,e4,h9,e7), h9:_blue_adj(e7,x5), h6:_dog_n(x5), h10:indef_q(x5,h11,h12), h1:_leave_v(e2,x5)}, {h11 qeq h6}>
Verb phrase coordination: "cats eat and leave"
<h1,e2, {h3:_cat_n(x4), h5:indef_q(x4,h6,h7), h8:_eat_v(e9,x4), h1:_and_coord(e2,h8,e9,h11,e10), h11:_leave_v(e10,x4)}, {h6 qeq h3}>
The most important thing to notice about these representations is the _and_coord_rel, which is used to semantically coordinate two other relations. It has five arguments: its own INDEX (an individual or event), L-HNDL and L-INDEX for its left coordinand, and R-HNDL and R-INDEX for its right coordinand. Note that the usual argument relationships between subjects and verbs and between adjectives and their modified nouns remain. Also note that the L-HNDL and R-HNDL are identified with the appropriate handle of the coordinand for adjective and verb phrases, but not for noun phrases. In more-than-two-way coordinations, you'll likely see implicit_coord, a binary coordination relation that is inserted to hook all the coordinations together, in much the same way that the binary mid rule is used to hook n-way coordinated phrases together. Here's an example:
Noun phrase coordination: "dogs cats and fish eat"
<h1,e2, {h3:_dog_n(x4), h5:indef_q(x4,h6,h7), h8:_cat_n(x9), h10:indef_q(x9,h11,h12), h13:_and_coord(x14,h16,x9,h17,x15), h18:_fish_n(x15), h19:indef_q(x15,h20,h21), h22:indef_q(x14,h13,h23), h24:implicit_coord(x25,h26,x4,h27,x14), h28:indef_q(x25,h24,h29), h1:_eat_v(e2,x25)}, {h6 qeq h3, h11 qeq h8, h20 qeq h18}>
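If you'd like to check these properties mechanically, the following Python sketch may help. It reads the abbreviated notation used in this handout (not real LKB output), pulls out the coordination relations with their five arguments, tests whether L-HNDL and R-HNDL are identified with the label of some other relation, and flattens a nested coordination into its coordinand indices. All of the names here (`Coord`, `relations`, `handles_identified`, `coordinand_indices`) are hypothetical helpers written for this illustration.

```python
import re
from typing import NamedTuple


class Coord(NamedTuple):
    label: str
    pred: str
    index: str    # the coordination's own INDEX
    l_hndl: str
    l_index: str
    r_hndl: str
    r_index: str


# Matches "h8:_eat_v(e9,x4)" in the abbreviated notation of this handout.
REL = re.compile(r"(h\d+):(\w+)\(([^)]*)\)")


def relations(mrs):
    """Return (label, predicate, arguments) for every relation in the string."""
    return [(m.group(1), m.group(2), [a.strip() for a in m.group(3).split(",")])
            for m in REL.finditer(mrs)]


def coords(mrs):
    """Return the coordination relations (_and_coord, implicit_coord)."""
    return [Coord(label, pred, *args)
            for label, pred, args in relations(mrs) if pred.endswith("_coord")]


def handles_identified(mrs):
    """True if every L-HNDL and R-HNDL is the label of some relation."""
    labels = {label for label, _, _ in relations(mrs)}
    return all(c.l_hndl in labels and c.r_hndl in labels for c in coords(mrs))


def coordinand_indices(mrs):
    """Flatten nested coordinations into their leaf indices, left to right."""
    by_index = {c.index: c for c in coords(mrs)}
    embedded = {i for c in by_index.values() for i in (c.l_index, c.r_index)}

    def walk(index):
        c = by_index.get(index)
        return [index] if c is None else walk(c.l_index) + walk(c.r_index)

    return [leaf for top in by_index if top not in embedded for leaf in walk(top)]


# The representations from this handout, copied as strings.
VP = ("<h1,e2, {h3:_cat_n(x4), h5:indef_q(x4,h6,h7), h8:_eat_v(e9,x4), "
      "h1:_and_coord(e2,h8,e9,h11,e10), h11:_leave_v(e10,x4)}, {h6 qeq h3}>")
NP = ("<h1,e2, {h3:_dog_n(x4), h5:indef_q(x4,h6,h7), "
      "h8:_and_coord(x9,h11,x4,h12,x10), h13:_cat_n(x10), h14:indef_q(x10,h15,h16), "
      "h17:indef_q(x9,h8,h18), h1:_leave_v(e2,x9)}, {h6 qeq h3, h15 qeq h13}>")
NP3 = ("<h1,e2, {h3:_dog_n(x4), h5:indef_q(x4,h6,h7), h8:_cat_n(x9), "
       "h10:indef_q(x9,h11,h12), h13:_and_coord(x14,h16,x9,h17,x15), h18:_fish_n(x15), "
       "h19:indef_q(x15,h20,h21), h22:indef_q(x14,h13,h23), "
       "h24:implicit_coord(x25,h26,x4,h27,x14), h28:indef_q(x25,h24,h29), "
       "h1:_eat_v(e2,x25)}, {h6 qeq h3, h11 qeq h8, h20 qeq h18}>")

print(handles_identified(VP))   # True
print(handles_identified(NP))   # False
print(coordinand_indices(NP3))  # ['x4', 'x9', 'x15']
```

The three-way result shows how implicit_coord chains onto the _and_coord relation: its R-INDEX (x14) is the INDEX of the inner coordination, so walking the chain recovers the dogs, cats, and fish indices in order.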
All the rules and definitions necessary to implement coordination can be found in this file: coord.tdl. Download it and put it in your Matrix directory, or wherever you like -- you'll be copying the rules contained in it into esperanto.tdl. After downloading the file, follow these steps:
coord.tdl contains a definition of a lexical type conj-lex, which you should use for any lexical coordinators (a.k.a. conjunctions) in your language. Your lexical item should go into lexicon.tdl and should look something like this:
and := conj-lex &
  [ STEM < "and" >,
    SYNSEM [ LOCAL.CAT [ HEAD.MOD null,
                         VAL [ SPR < >,
                               SUBJ < >,
                               COMPS < > ]],
             LKEYS.KEYREL.PRED '_and_coord_rel ]].
Whether your language has different coordination strategies for different parts of speech or not, you're going to need a set of rules for each phrase type because the semantics of each will differ. For each phrase type, therefore, you'll need a top-coord rule, a bottom-coord rule, and in some cases a mid-coord rule (see the comments in coord.tdl to determine which you need). Your top and mid rules will inherit from two rules in coord.tdl: a basic phrase-type specific top or mid rule and a rule that specifies the coordination marking pattern (asyndeton, monosyndeton, polysyndeton...) for that strategy. Your bottom rule will derive from either the unary or binary bottom rule in coord.tdl.
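If the marking-pattern terms are new to you, this Python sketch shows the surface strings each of the three patterns named above would produce for the same coordinands. It is only meant to illustrate the terminology: the function `surface` is hypothetical, and it assumes a free-standing coordinator that precedes the coordinand it marks (as in English), which may not match your language.

```python
def surface(coordinands, pattern, conj="and"):
    """Return the surface string for a coordination under a marking pattern.

    Assumption (illustration only): the coordinator is a separate word
    placed before the coordinand it marks.
    """
    if len(coordinands) < 2:
        raise ValueError("coordination needs at least two coordinands")
    first, *medial, last = coordinands

    if pattern == "asyndeton":
        # No marking at all: A B C
        parts = [first, *medial, last]
    elif pattern == "monosyndeton":
        # One coordinator, on the last coordinand: A B and C
        parts = [first, *medial, f"{conj} {last}"]
    elif pattern == "polysyndeton":
        # A coordinator on every coordinand but the first: A and B and C
        parts = [first, *(f"{conj} {m}" for m in medial), f"{conj} {last}"]
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    return " ".join(parts)


for pattern in ("asyndeton", "monosyndeton", "polysyndeton"):
    print(pattern, "->", surface(["dogs", "cats", "fish"], pattern))
# asyndeton -> dogs cats fish
# monosyndeton -> dogs cats and fish
# polysyndeton -> dogs and cats and fish
```

Note that with exactly two coordinands, monosyndeton and polysyndeton give the same string, so you need three-way examples from your language to tell them apart.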
For example, here are the rules that handle NP coordination in English:
np-top-coord-rule := basic-np-top-coord-rule & monopoly-top-coord-rule.

np-mid-coord-rule := basic-np-mid-coord-rule & monopoly-mid-coord-rule.

np-bottom-coord-rule := binary-bottom-coord-rule &
  [ SYNSEM.LOCAL.CAT.HEAD noun,
    ARGS.REST.FIRST.SYNSEM.LOCAL.CAT.HEAD noun ].
(These rules will probably work for some of your languages, so if you're feeling very lazy you might be able to get away with just cutting and pasting them.)
One of the things these rules do is manipulate the COORD feature, which is a boolean flag that is only set inside of coordinated structures. coord.tdl contains some type addendum statements that prevent ordinary headed rules from interacting with COORD + phrases. You'll need to copy all of these into esperanto.tdl. You'll also need to go into roots.tdl and make sure any roots you've specified are COORD -.
After you've created your language-specific rules, add them to rules.tdl as usual, then load up your grammar and try parsing some sentences with coordination in them. Do the semantics look like the representations above? Next, try generating. Do you get any spurious sentences? You may find you need to constrain the coordination rules more (in particular, the general rules don't take most agreement phenomena into account) -- do this by modifying your language-specific rules rather than the rules in coord.tdl.