Linguistics 567: Knowledge Engineering for NLP

Lab 7 Due 5/12

Read all the way through the assignment once before starting it. Once again I'll be asking for write-ups, and basing a significant portion of the grade on the write-up. This means that even if you don't get something working, you can get a lot of partial credit for describing the problem, how you attempted to handle it, and your best guess as to why it's not working. Conversely, you could have everything working properly, but if you don't describe the phenomena (with glossed examples) and how you handle them in your write-up, you won't get full credit.

You might notice that these instructions are vaguer than in previous weeks. This is only partially because I expect you to have more of a sense of how to implement the details as the quarter goes along. It's more because I can't predict all of the details for this material. The upshot: ask questions! Ask early and ask often :-).

Background

This lab has four goals:

Semantic representations

This section gives example semantic representations (of the form produced by the "Indexed MRS" option) to compare your results to.

Matrix interrogative

The main difference between this and the matrix declarative given last week is the addition of the question_m_rel. The local top handle is now the label of the question_m_rel, which takes the proposition_m_rel's handle directly as its sole argument. The argument of the proposition_m_rel is still related via qeq to the label of the _sleep_v_rel.

Embedded interrogative (with matrix declarative)

This one is just like the embedded declarative from last week, except there is a question_m_rel in addition to the proposition_m_rel for the embedded clause. Note that _know_v_rel takes the handle of the embedded question_m_rel as its argument directly (no qeq) and the embedded question_m_rel takes the embedded proposition_m_rel as its argument directly (again, no qeq). The next link in the chain (between the argument of the embedded proposition_m_rel and the _sleep_v_rel) does have a qeq.
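The chain of links described above can be modeled in a few lines of Python, which may help when you compare it against your own Indexed MRS output. This is only a sketch: the handle names (h3 through h7), the dictionary layout, and the helper link_kind are hypothetical and chosen for illustration. They are not what the LKB prints for your grammar.

```python
# Each relation maps to (its label, its handle-valued argument or None).
# Handle names are made up; only the shape of the chain follows the text.
rels = {
    "_know_v_rel": ("h3", "h4"),
    "question_m_rel": ("h4", "h5"),      # embedded question message
    "proposition_m_rel": ("h5", "h6"),   # embedded proposition message
    "_sleep_v_rel": ("h7", None),
}

# Handle constraints: (higher handle, lower handle) pairs related by qeq.
qeqs = {("h6", "h7")}


def link_kind(parent, child):
    """Say how parent's handle argument is related to child's label."""
    arg = rels[parent][1]
    label = rels[child][0]
    if arg == label:
        return "direct"
    if (arg, label) in qeqs:
        return "qeq"
    return "unlinked"


chain = ["_know_v_rel", "question_m_rel", "proposition_m_rel", "_sleep_v_rel"]
kinds = [link_kind(a, b) for a, b in zip(chain, chain[1:])]
print(kinds)
```

Only the last link, from the embedded proposition_m_rel down to the verb, comes out as qeq; the two links above it are direct handle identities.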

Once you've got all of these, matrix interrogatives with embedded declaratives or interrogatives should follow!

Imperatives

< h1, e2:SEMSORT:TENSE:ASPECT:MOOD,
{ h1:command_m_rel(h5),
  h6:_sleep_v_rel(e2,x8:second)},
{h5 qeq h6}>

Note that the MARG of the command_m_rel is qeq the handle of the verb, and the ARG1 of the verb is constrained to be second person. (This latter may or may not be appropriate to mimic in other languages.)
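If you want to check these two properties mechanically, the MRS above can be transcribed by hand into plain Python data. This is a sketch under assumptions: the dictionary layout and the function check_imperative are hypothetical, and the transcription is manual rather than read from LKB output.

```python
# Hand transcription of the imperative MRS shown above.
top = "h1"
rels = [
    {"label": "h1", "pred": "command_m_rel", "args": ["h5"]},
    {"label": "h6", "pred": "_sleep_v_rel", "args": ["e2", "x8:second"]},
]
hcons = [("h5", "qeq", "h6")]


def check_imperative(top, rels, hcons, verb_pred):
    """Report whether the MRS has the imperative properties noted in the text."""
    by_pred = {r["pred"]: r for r in rels}
    command = by_pred["command_m_rel"]
    verb = by_pred[verb_pred]
    return {
        # The command message should carry the top handle.
        "command_is_top": command["label"] == top,
        # The MARG of the command message should be qeq the verb's label.
        "marg_qeq_verb": (command["args"][0], "qeq", verb["label"]) in hcons,
        # The verb's ARG1 should be marked second person (English-specific).
        "subject_second_person": verb["args"][1].endswith(":second"),
    }


report = check_imperative(top, rels, hcons, "_sleep_v_rel")
print(report)
```

For a language where imperatives do not constrain the subject to second person, the last check would simply be dropped.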

Run a baseline test suite

Before making any changes to your grammar for this lab, run a baseline test suite instance. If you decide to add items to your test suite for the material covered here, consider doing so before modifying your grammar so that your baseline can include those examples. (Alternatively, if you add examples in the course of working on your grammar and want to make the snapshot later, you can do so using the grammar you turned in for Lab 6.)

Syntactic differences between clauses

Your first step in this lab should be to cover the syntax of your clause types. Once that is working, worry about the semantics.

Everyone will need to implement a clause-embedding verb type (another subtype of verb-lex). This one should inherit from clausal-second-arg-trans-lex-item (defined in matrix.tdl) as well as verb-lex (defined in klingon.tdl), and constrain the CAT value of its complement appropriately. If your clause-embedding verb from last week doesn't take interrogative complements, you'll need to add one. All clause-taking verbs should place appropriate constraints on the CONT.MSG of their complements.

In addition, you may need to implement one or more of the following:

Think before you code! There's too much variety across languages in this domain for me to sketch out all the relevant possibilities in this lab, so you'll need to plan out what you're going to try. I'm happy to answer questions as you do.

To give you some guidance, I describe below what I did for English while testing out the matrix for this lab.

Semantic differences between clauses

There are two parts to the problem of getting the semantics right for clauses:

  1. Making sure each clause type gets the right message(s) inserted.
  2. Making sure that the syntax and semantics correlate as they are supposed to (e.g., if there is a word order that is particular to interrogative clauses, it shouldn't get a parse with propositional semantics).

The matrix has done most of the work for (1); it's just a matter of hooking it into your grammar in the right way. Hopefully, the English examples below will be useful in this regard.

A sketch of what you need to do

Matrix polar questions just like declaratives

If there is no syntactic difference between matrix (polar) interrogatives and matrix declaratives, you've lucked out. You just need to add a rule that builds question semantics instead of proposition semantics.

This rule will look just like your declarative-clause construction, except that you need to do a bit more work with the semantics, since the type interrogative-clause in matrix.tdl is somewhat underspecified compared to declarative-clause.

Once you have both non-branching constructions in, you should find that every sentence has twice as many parses as before. Verify that this is so.
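One way to do that verification is to compare per-item reading counts before and after adding the second rule. The sketch below assumes you have already pulled those counts out of your baseline and current test suite instances into two dictionaries; the function name undoubled_items and the example items and counts are invented for illustration.

```python
def undoubled_items(baseline, current):
    """Return items whose current reading count is not twice the baseline."""
    return sorted(
        item
        for item, count in baseline.items()
        if current.get(item, 0) != 2 * count
    )


# Made-up counts: item -> number of readings.
baseline = {
    "the dog sleeps": 1,
    "the dog chases the cat": 2,
    "dog the sleeps": 0,   # ungrammatical item, no readings
}
current = {
    "the dog sleeps": 2,
    "the dog chases the cat": 4,
    "dog the sleeps": 0,
}

problems = undoubled_items(baseline, current)
print(problems)
```

An empty list means every item doubled as expected. Any item that shows up deserves a look, since it may point to a clause type that one of the two rules fails to apply to.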

An alternative is to change your matrix declarative clause construction so that it introduces an underspecified message relation (prop-or-ques_m_rel). In this case, you'll only have one parse of matrix clauses that can be either declarative or interrogative, with only one message relation in it.

Interrogative matrix clauses: Verbal inflection

If your language marks matrix interrogatives with inflection on the verb, you'll want to do the following:

Interrogative matrix clauses: Question particle

Some languages mark (matrix) interrogative clauses with a question particle, either at the beginning or the end of a sentence. One way to handle this is like a complementizer, with one of the following two options:

  1. If the order follows what you see in your other head-complement structures, you can build the CP with the ordinary head-complement rule.
  2. If the order of the question particle is different from the usual possibilities for head-complement structures in your language, you might need to define a new head-complement rule or add a constraint to one of your existing head-complement rules. The rule that disallows the question particle should constrain the HEAD value of the head daughter to make it incompatible with the question particle. (If your question particle uses a different ordering from other complementizers, talk to me.)

You'll then need a non-branching interrogative clause construction that takes the CP as its daughter (similar to the construction described above). Since the type interrogative-clause inherits from basic-head-only, the mother will also be a CP. That means you'll need to change your root condition to allow these CPs (perhaps disallowing those that only turn up in embedded contexts).

Alternatively, you can have the question particle introduce the question message relation. The lexical type for such an element would have the following constraints:

Interrogative matrix clauses: Subject-verb inversion

Embedded interrogatives: Just like matrix interrogatives

If your embedded interrogatives look just like your matrix ones, you're in luck. You only need to make sure that those clauses can appear as the complement of appropriate verbs.

Embedded interrogatives: Marked by question particles

If your embedded interrogatives involve a question particle or complementizer, but your matrix interrogatives don't, you'll want to develop a complementizer analysis for the embedded ones. It probably makes more sense to follow the instructions for matrix interrogatives with question particles rather than the ones for embedded declaratives from last week, as the semantics is different.

If your matrix declaratives and interrogatives look the same and you went with a single clause rule for matrix clauses (which introduces an underspecified message), but your embedded clauses are distinguished either by the complementizer involved or perhaps only by the selecting verb, you'll need to have the complementizer or the selecting verb do some semantic work to get the message on an embedded clause right.

If you have interrogative complementizers: They'll look like the message-introducing question particle described above. By constraining their complement's MSG to be proposition_m_rel, they will in fact be resolving the underspecification. The constraints described above also produce a complementizer that introduces the additional question_m_rel and hooks up the handles appropriately.

If you do not have interrogative complementizers but the embedding verb constrains the message of its complement: For verbs that can take only embedded questions, you'll want to have them introduce the question_m_rel, resolve any underspecification on the complement's message, identify the MARG of the question message with the LTOP of the embedded clause, and identify the LBL of the question message with an ARGn in their own key relation. Note that the lower-level lexical types in the matrix are constrained to introduce only one relation on the RELS list. This means that if you define this type of verb, you'll need to work with higher-level supertypes in the matrix, avoiding single-rel-lex-item.

Embedded interrogatives: Other

Talk to me :-)

Imperatives

English expresses imperatives by leaving out the subject and requiring a special form of the verb. One way to handle this is to make a construction which inherits from basic-head-opt-subj-phrase, marks the clause as imperative (say with a new CAT feature), requires a particular FORM value on the verb, and constrains the index of the daughter's SUBJ to be second person. Then a second non-branching rule, inheriting from imperative-clause, would take the mother of the first and produce a constituent with clausal semantics. We would also need an appropriate lexical rule to produce verbs in the right form. Furthermore, the other clausal constructions need to be made sensitive to the 'imperative' feature so as to reject the phrases built by this head-opt-subj rule as daughters.

It might seem like a better idea to have just one rule which does the SUBJ cancellation and the clausal semantics. This is not possible with the current version of the matrix (basic-head-opt-subj-phrase inherits from a supertype which constrains it to be [MSG no-msg]). This might get revised :-).

If your language allows subject pro-drop generally, and requires it on imperatives, you could handle this by having the head-opt-subj rule record its application with a CAT feature. The imperative clause rule would be sensitive to this feature (as well as the HEAD.FORM, if appropriate), but the other clausal constructions wouldn't be.

If your language marks imperatives with some sort of sentence-final or sentence-initial particle, see the instructions above for doing the analogous thing with interrogatives.

If your language does something else, talk to me :)

Test your grammar

Write up

Submit via ESubmit

