Lab 9 (due 3/4 11:59 pm)

Overview

Preliminaries

The goal for this lab is to create an "accommodation" transfer grammar for your language by using it as the target language in two translation pairs, with English and Pite Saami as the inputs. Along the way, you will continue cleaning up your grammar so that it generates for as many of the MMT sentences as possible (where "as possible" means within the time constraints you have available for this class this week), and generates as few outputs as are motivated (again, within your own limits for reasonable time spent!).

As usual, I'll be asking for before and after tsdb profiles and a write up (see directions below).

Please try to get an early start so that you are working on transfer rules by Wednesday at the latest, and Thursday demo day can include some transfer rule work!

Full credit for the "tasks" portion of the lab will be given if you instantiate at least one transfer rule and can translate 10 of the MMT sentences into your language from English and 7 of the MMT sentences into your language from Pite Saami (without excessive spurious realizations). You will receive at least 30/50 points if you can translate at least two MMT sentences into your language from both English and Pite Saami (without excessive spurious realizations).

(Note: If you are not able to get translations for 10 of the MMT sentences into your language and/or don't have enough coverage of phenomena to be able to parse at least 10, please contact me early in the week and we'll adjust expectations for the lab. Likewise, if no transfer rules are necessary for any of the sentences that your grammar can cover, let me know that as early as possible too, and I'll adjust expectations.)

Refine/extend MMT sentences

You'll be asked to turn in an iso.txt file with one sentence per line, covering the 26 examples in mmt/test_sentences/eng.txt. Any sentence that you don't know how to write in your language can be replaced with SKIPPED. If there's more than one way to translate one of the sentences, just pick one. If more than one of the sentences translates to the same thing in your language, just list that translation on each applicable line.

If your iso.txt file from previous labs already conforms to these requirements, please just turn it in again as is.
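Before turning the file in, a quick sanity check of its shape can catch off-by-one line errors. The following is a minimal sketch, not part of the course tooling: the function names are hypothetical, and it only assumes what is stated above (26 items, one per line, SKIPPED for unknown sentences).

```python
from pathlib import Path

# Number of examples in mmt/test_sentences/eng.txt, as stated in this lab.
EXPECTED_ITEMS = 26


def check_translation_lines(lines, expected=EXPECTED_ITEMS):
    """Return a list of problems; an empty list means the shape looks fine."""
    problems = []
    if len(lines) != expected:
        problems.append(f"expected {expected} lines, found {len(lines)}")
    for number, line in enumerate(lines, start=1):
        # A blank line would shift every later item off its English source.
        if not line.strip():
            problems.append(f"line {number} is empty (use SKIPPED instead)")
    return problems


def check_file(path):
    """Read a translation file (hypothetical helper) and check its lines."""
    text = Path(path).read_text(encoding="utf-8")
    return check_translation_lines(text.splitlines())
```

Calling `check_file("iso.txt")` (with your own file name) returns an empty list when the file has exactly 26 non-empty lines.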

Initial testsuite run

If you add anything to your testsuite, please rerun it. Otherwise, you can turn in the final testsuite run from Lab 8 as your initial testsuite.

Initial translations

Test each item in eng.txt and sje.txt and see which ones work 'out of the box' and which (of those that you expect to be in your grammar's coverage) do not. Hopefully at least the first item works. If none do, post to Canvas with questions.

Refine/harmonize MRSs

For the items that don't work right out of the box, compare your MRS to the output of the eng and sje grammars to see if you can spot the difference. Some possible kinds of differences that you can fix by adjusting your grammar (NB: no changes should be made to the eng or sje grammars; if you see something you think needs changing there, post to Canvas):

  1. Minor differences in predicate spelling -- change the lex entry or rule that puts in the predicate so that the spelling matches
  2. Differences in the variable properties -- refine the semi.vpm file (post to Canvas for help early & often)
  3. Arguments are linked differently (e.g. ARG1 and ARG2 are swapped) -- this is probably something to address by changing your grammar.
  4. Some LBL identities are missing -- edit your grammar so that they are there
  5. Some RELS or HCONS are flat out missing -- this would be due to an underspecified append list/broken list append and is something to fix in your grammar

Fix as many of the above errors as you can by refining your own grammar.
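For the first and last kinds of difference in the list above (predicate spelling and missing RELS), it can help to compare the predicate names of the two MRSs mechanically. The sketch below is only an illustration: it assumes you have copied the predicate names out of each MRS by hand into two lists, and the function name and example predicates are made up rather than taken from any of the grammars.

```python
from collections import Counter


def diff_predicates(mine, reference):
    """Compare two lists of predicate names, respecting repeated predicates.

    Returns (only_mine, only_reference): predicates that appear more often
    in one MRS than in the other, each as a sorted list.
    """
    mine_counts = Counter(mine)
    reference_counts = Counter(reference)
    only_mine = sorted((mine_counts - reference_counts).elements())
    only_reference = sorted((reference_counts - mine_counts).elements())
    return only_mine, only_reference


# Made-up predicate names, for illustration only.
my_preds = ["_dog_n_rel", "_sleeps_v_rel", "exist_q_rel"]
source_preds = ["_dog_n_rel", "_sleep_v_rel", "exist_q_rel"]
print(diff_predicates(my_preds, source_preds))
```

A pair of near-identical names on the two sides usually points to a spelling mismatch, while a name with no counterpart suggests a missing relation. This does not check variable properties, argument linking, or LBL identities, which still need to be compared by eye.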

Instantiate (or maybe write) transfer rules

There are going to be other differences in MRSs that can't be harmonized away but in fact require the instantiation of transfer rules. There is a large collection of transfer rules in mmt/tm/acm.tdl.

  1. For each item you are trying to get working via transfer rules, first examine the types defined in that file to see if one matches your needs.
  2. If so, create an instance of it in mmt/tm/iso/acm.mtr. Look at mmt/tm/eng/acm.mtr for an example.
  3. If not, please post to Canvas for assistance in creating a new transfer rule.
  4. Recompile your transfer grammar with ace, and test to see if the new transfer rule fixed the item you had in mind.

Controlling overgeneration

For any grammar that can handle the multiply embedded clauses or the before/after examples, you might notice that you're getting weird translation outputs where the clauses seem to switch! Adding this to your my_language.tdl file will fix that:

    qeq :+
       [ HARG.INSTLOC #il,
         LARG.INSTLOC #il ].
  

Ideally, the output from your grammar should be only legitimate strings for each input. This means you should try to rule out both ungrammatical outputs as well as grammatical outputs that don't match the meaning. (The latter could come about if you have morphological rules that don't model their semantics.)

Focusing only on the MMT sentences, check the outputs you are getting. If there is more than one for a given item, identify what the sources of variation are. Legit sources include e.g. word order variation. For variation that is not legitimate, please post to Canvas for advice on how to cut it back.

As you work on controlling overgeneration, you should be running your testsuite frequently to make sure you aren't cutting out coverage by accident.

Run both the test corpus and the testsuite

Following the same procedure as usual, do test runs over both the testsuite and the test corpus.

Again, collect the following information to provide in your write up:

  1. How many items parsed?
  2. What is the average number of parses per parsed item?
  3. How many parses did the most ambiguous item receive?
  4. What sources of ambiguity can you identify?
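The first three numbers normally come straight from your tsdb profile. If you want to double-check them by hand, the arithmetic is sketched below under one assumption: that you have a list giving the number of parses for each item, with 0 for items that did not parse. The function name is hypothetical.

```python
def summarize_parse_counts(parse_counts):
    """Summarize per-item parse counts (0 means the item did not parse)."""
    parsed = [count for count in parse_counts if count > 0]
    if not parsed:
        return {"items_parsed": 0, "average_parses": 0.0, "max_parses": 0}
    return {
        "items_parsed": len(parsed),
        # The average is taken over parsed items only, not over all items.
        "average_parses": sum(parsed) / len(parsed),
        "max_parses": max(parsed),
    }


# Five items: two fail to parse, three parse with 1, 3, and 2 readings.
print(summarize_parse_counts([0, 1, 3, 2, 0]))
```

Note that unparsed items are excluded from the average, matching the wording "per parsed item" in question 2.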

Write up

Your write up should be a plain text file (not .doc, .rtf or .pdf) which includes the following:

  1. Documentation of what you changed in your grammar to do MRS refinement. Include IGT to illustrate.
  2. Documentation of what you changed in your grammar to cut back on overgeneration. Include IGT or pointers to the MMT sentences to illustrate.
  3. For any residual realization ambiguity (multiple outputs for one input), characterize the sources (e.g. free word order).
  4. Documentation of your MT coverage for both eng and sje as source languages.
  5. Documentation of your coverage over testsuite & test corpus for both the initial & final runs, including the answers to the questions given above.

Submit your assignment
