Linguistics 567: Grammar Engineering

Lab 9 Due 3/9


The goal for this lab is to create an "accommodation" transfer grammar for your language by using it as the target language in two translation pairs, with English and Italian as the inputs. Along the way, you will be cleaning up your grammar so that it generates, and generates only as many outputs as are motivated.

As usual, I'll be asking for before and after tsdb profiles and a write-up (see directions below).

(Updated 3/3/08) Full credit for the "tasks" portion of the lab will be given if you can translate 10 of the MMT sentences into your language from both English and Italian (without excessive spurious realizations). You will receive at least 30/50 points if you can translate any MMT sentence into your language from both English and Italian (without excessive spurious realizations).

Running the translation system

The first step is to get the translation system running for English to Italian (eng2ita). Here are step-by-step instructions:

  1. Download the English and Italian grammars. (ita and eng updated 3/6/08.) Unpack each of them with tar xzf eng.tgz and tar xzf ita.tgz.
  2. Start two separate emacsen. Put one on the left of your screen (this will be the "source" emacs). Put one on the right of your screen ("target" emacs).
  3. Start the LKB in each. Make sure the "source" LKB Top menu is on the left of the screen and the "target" one is on the right.
  4. Load the English grammar into the "source" LKB.
  5. Load the Italian grammar into the "target" LKB.
  6. In the "target" LKB, select Options | Expand menu.
  7. In the "target" LKB, select Generate | Start server.
  8. In the "source" emacs/lkb parse the English sentence Dogs sleep.
  9. From the pop-up menu on the tree that comes up, select "Rephrase." You should see a transfer output window and then the Italian grammar should output "cani DROM-ONO" in a realizations window.

Attempt to translate into your language

(Updated 3/3/08, 9:25pm)

  1. Save the file semi.vpm to your grammar directory.
  2. Edit the file lkb/script to add the following line, right before the comment that starts "Next, the lexicon itself":
    (mt:read-vpm (lkb-pathname (parent-directory) "semi.vpm") :semi)
  3. Edit the file lkb/globals.lsp to add the following line, with "iso" replaced by the three-letter iso code for your language:
    (setf *translate-grid* '(:iso))
  4. Edit the file lkb/globals.lsp in the English and Italian grammars so that the line for *translate-grid* now looks like the appropriate one of the lines below (again replacing "iso" with the code for your language).
    (setf *translate-grid* '(:eng :ita :iso))
    (setf *translate-grid* '(:ita :eng :iso))
  5. Now load your grammar into the "target" lkb.
  6. Parse Dogs sleep with the English grammar in the "source" lkb and select "rephrase".
  7. Observe what happens: Do you get generation outputs? Or do you get an error in the emacs buffer in the "target" emacs?
  8. If you get an error, you'll need to compare the MRSs to see what the difference is. I expect that for Dogs sleep you won't need any transfer rules, and thus any errors should be addressed through harmonization (aka cleaning up your MRS).

Comparing MRSs

To compare the MRSs, you can look at the MRS from the English grammar directly, but this can be a bit misleading, since you really want to look at the input to the generator (i.e., the transfer output). To do this, you can select "Generate | Display Input MRS" or "Generate | Display Internal MRS" from the "target" LKB Top menu. Since we're not using the VPM machinery on the target side, these two should be the same.

  1. Generate | Display Internal MRS
  2. Parse the expected output
  3. Choose Indexed MRS from the pop-up menu

There are a number of things that could be wrong:

  1. Missing RELS or HCONS (broken diff-list append).
  2. Misspelled PRED values (look carefully at the underscores).
  3. Misspelled/differently spelled feature values (e.g. sing instead of sg).
  4. Misspelled/differently spelled feature names (e.g., PERS instead of PER).

Create a transfer grammar

Once you have Dogs sleep translating, it's time to try a broader range of the MMT sentences, with both English and Italian as input, to see what kinds of transfer rules you will need.

Note that you will be modifying the English and Italian grammars for this part of the lab. The transfer rule types are in mt-mrs.tdl, mtr.tdl and acm.tdl. Of those, acm.tdl should be the most interesting. You'll want to edit the file to create instances of the transfer rules that you need for your grammar. It will be simplest to edit this file in one grammar (say the English one) and create a symbolic link to it in the other grammar, so that you have one transfer grammar for your language.
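One way to set up the symbolic link is sketched below. The helper name link_shared_file is hypothetical, and the paths in the usage comment are placeholders, since where acm.tdl sits inside each grammar directory depends on how you unpacked them. Note that the helper deletes whatever is currently at the link location, so keep a copy of that file first if you have edited it.

```python
from pathlib import Path


def link_shared_file(master: Path, link: Path) -> None:
    """Make `link` a symbolic link pointing at `master`.

    Any existing file or link at `link` is removed first, so that both
    grammars end up reading the same transfer rule file.
    """
    if not master.is_file():
        raise FileNotFoundError(f"no such file: {master}")
    if link.is_symlink() or link.exists():
        link.unlink()
    # Use an absolute target so the link works from any working directory.
    link.symlink_to(master.resolve())


# Hypothetical usage; adjust both paths to your own directory layout:
# link_shared_file(Path("eng/acm.tdl"), Path("ita/acm.tdl"))
```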

  1. Try translating all of the MMT sentences from English to your language and Italian to your language.
  2. For each one that doesn't go through, compare the input MRS to the MRS that your grammar assigns to the expected output.
  3. Do any harmonization that is warranted.
  4. For the remaining differences, look to see if one of the existing transfer rule types in acm.tdl will do the trick. If so, create an instance of that transfer rule type, e.g.:
    pro-drop := pronoun-delete-mtr.
  5. If you need a different transfer rule, post on GoPost about what you need, and we'll work out how to formulate it.
  6. Reload the "source" grammar and try translating again.
  7. Rinse and repeat.

Write up

  1. Update 3/7/08: Include a plain text file (called iso.txt, with iso replaced by your iso language code) with your 17 MMT sentences (in your language only) in the format your grammar expects (morpheme segmented or not), one per line, with a blank line between each.
  2. Describe any clean up you did to your grammar.
  3. Describe the transfer rules you instantiated, and why.
  4. Describe any further transfer rules you needed to develop, and why.
  5. Document your current coverage on translating the MMT sentences from English and Italian into your language.
  6. If you don't have full coverage, describe why not.
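Before submitting, it may help to check the sentence file against the layout described in item 1. The sketch below assumes the layout is exactly: a sentence line, then a blank line, then the next sentence line, and so on, for 17 sentences. The function name check_sentence_file is hypothetical, and the check says nothing about whether your grammar can parse the sentences.

```python
def check_sentence_file(text, expected=17):
    """Return a list of layout problems; an empty list means none were found."""
    lines = text.rstrip("\n").split("\n")
    problems = []
    for i, line in enumerate(lines):
        # Lines alternate: sentence, blank, sentence, blank, ...
        want_blank = (i % 2 == 1)
        if want_blank and line.strip():
            problems.append(f"line {i + 1}: expected a blank line")
        if not want_blank and not line.strip():
            problems.append(f"line {i + 1}: expected a sentence")
    count = sum(1 for line in lines if line.strip())
    if count != expected:
        problems.append(f"found {count} sentences, expected {expected}")
    return problems


# Hypothetical usage, with "iso.txt" standing in for your own file name:
# with open("iso.txt", encoding="utf-8") as handle:
#     for problem in check_sentence_file(handle.read()):
#         print(problem)
```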

Submit your assignment
