Lab 5 (due 2/9 11:59 pm)

Overview

This is our final lab with the customization system. It is also our first foray into MT. The focus will be on finishing up the choices files (though it's not expected that you will have used every part of the customization system) and on getting one sentence translating from English to your language. You will also work on collecting the MMT sentences for your language and use [incr tsdb()] to compare the initial and final state of your grammar for the week over the testsuites.

This lab entails the following general steps, which are not (fully) ordered with respect to each other.

Begin collecting the MMT sentences for your language

We will be working with the sentences in eng.txt, but it is not expected that every grammar will cover every sentence. For this week, I ask you to:

  1. Find translations (or approximations) for all of the words in the small vocabulary of those sentences.

For the write up for this portion, I expect you to tell me about the process you went through and report on item 5 above.
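To get started on the vocabulary, it may help to list the distinct word forms in eng.txt first. The following is a minimal sketch, not part of the course materials: `list_vocab` is a hypothetical helper name, and it assumes eng.txt has one plain-text English sentence per line.

```shell
# list_vocab FILE
# Print each distinct word form in FILE, lowercased, one per line.
# Hypothetical helper; assumes FILE holds plain English text.
list_vocab() {
  # 1. Turn every run of characters that are not letters or apostrophes
  #    into a single newline, so each word lands on its own line.
  # 2. Lowercase everything so "The" and "the" count as one form.
  # 3. Sort, drop duplicates, and remove any empty line.
  LC_ALL=C tr -cs "[:alpha:]'" '\n' < "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C sort -u \
    | sed '/^$/d'
}
```

For example, `list_vocab eng.txt > vocab.txt` would give you a checklist to fill in with translations. Note that this lists inflected forms separately (e.g. "dog" and "dogs"), so you will still need to group them by lexeme yourself.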

Improve the choices file for three phenomena

For the three phenomena you chose, refine the choices file by hand (through the questionnaire, via direct editing, or some combination of the two). Please be sure to post lots of questions on Canvas as you work on this! I expect the write up of this portion to include a copy-paste of the specific choices values you changed, as well as relevant IGT that I can use to test the effects.
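One way to collect the changed values for the write up is to diff the choices file you started the week with against the one you ended with. This is only a sketch: `changed_choices` is a hypothetical helper name, and it assumes you kept a copy of the starting choices file.

```shell
# changed_choices OLD NEW
# Print only the lines that differ between two choices files.
# Lines starting with "<" come from OLD, lines starting with ">" from NEW.
changed_choices() {
  # diff exits with status 1 when the files differ, and grep exits with
  # status 1 when nothing matches; neither is an error here.
  diff "$1" "$2" | grep '^[<>]' || true
}
```

A call such as `changed_choices lab4/choices lab5/choices` (both paths are placeholders) prints nothing if the two files are identical.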

Tdl edits

By now, you have collected some suggested tdl edits (from feedback on Labs 2-4 or in class). Once you are all done refining things via the customization system, patch these into your grammar.

The only tdl edits this week should be things that I have suggested as bug fixes or workarounds. You are not expected to come up with tdl edits on your own. If I haven't suggested any to you, this section is a freebie: nothing to do here!

For the write up, please include the actual tdl changes and an explanation of their purpose.

Try a first translation

Preliminaries

In later labs, we will refine the variable property mapping and create small transfer grammars for each language by using it as the target language in two translation pairs, with English and another language (probably Pite Saami) as the inputs. For now, we'll be attempting to get just one sentence through. This will be one that doesn't actually require any transfer rules.

In all of the instructions below, replace "iso" with the ISO 639-3 code for your language.

  1. Download and unpack mmt.tgz.
  2. Test eng2sje and sje2eng translation:
     cd mmt/
     ./translate-line.sh eng sje 1
    
    Note: If you aren't working on the VM, you'll need to fix the path to ace in the file translate-line.sh (and possibly install ace).
  3. Look inside translate-line.sh; try changing which line is not commented out and see what different behaviors you get.
  4. Make a symlink to your grammar at mmt/grammars/iso (run this from the directory that contains mmt/):
        ln -s /path/to/your/grammar mmt/grammars/iso
      
  5. Compile your grammar afresh with ace:
      cd mmt/grammars/iso
      ace -G iso.dat -g ace/config.tdl
    
  6. Move the generic transfer grammar mmt/tm/gen to mmt/tm/iso
      cd mmt/tm
      mv gen iso
    
  7. Compile that generic transfer grammar:
      ace -G iso.dat -g ace/config.tdl
    
  8. Copy your MMT sentences to test_sentences/iso.txt.
  9. Try translating the first sentence from eng to your language:
     ./translate-line.sh eng iso 1
    
  10. This one should not require any transfer rules. If it doesn't work, there are several possible causes:

For your write up for this part, please describe what happened when you tried the steps above. What difficulties did you encounter and how did you resolve them? What output did you get?
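If the translation fails, a quick first check is whether the files the steps above are meant to create are actually in place. The sketch below is not part of mmt.tgz: `check_mmt_setup` is a hypothetical helper name, and the paths it checks are assumptions read off the steps above (in particular, that test_sentences/ sits directly under mmt/ and that step 5 leaves iso.dat in the grammar directory).

```shell
# check_mmt_setup ISO [MMT_DIR]
# Report whether the paths produced by the setup steps exist.
# Returns 0 if all are present, 1 otherwise. MMT_DIR defaults to "mmt".
check_mmt_setup() {
  iso="$1"
  mmt="${2:-mmt}"
  status=0
  # Assumed locations, based on the numbered steps:
  #   symlinked grammar, its compiled ace image,
  #   the renamed transfer grammar, and the test sentences.
  for path in \
    "$mmt/grammars/$iso" \
    "$mmt/grammars/$iso/$iso.dat" \
    "$mmt/tm/$iso" \
    "$mmt/test_sentences/$iso.txt"
  do
    if [ -e "$path" ]; then
      echo "ok       $path"
    else
      echo "MISSING  $path"
      status=1
    fi
  done
  return "$status"
}
```

Run it from the directory that contains mmt/, for example `check_mmt_setup iso`, substituting your language code. A clean report only shows that the files exist; it says nothing about whether the grammar itself can generate the sentence.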

Run the testsuite

Following the same procedure as usual, do a test run over your testsuite.

Collect the following information to provide in your write up:

  1. How many items parsed?
  2. What is the average number of parses per parsed item?
  3. How many parses did the most ambiguous item receive?
  4. What sources of ambiguity can you identify?
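The first three numbers can be read off the [incr tsdb()] displays, but they are also easy to cross-check by hand. The sketch below assumes you have a file with one number per line giving the number of parses for each test item; `parse_stats` is a hypothetical helper name, not an [incr tsdb()] command.

```shell
# parse_stats [FILE]
# Read one parse count per line (one line per test item) from FILE or
# standard input and print coverage and ambiguity figures.
parse_stats() {
  awk '
    { total++ }                       # every line is one test item
    $1 > 0 {                          # items with at least one parse
      parsed++
      sum += $1
      if ($1 > max) max = $1
    }
    END {
      avg = (parsed > 0) ? sum / parsed : 0
      printf "items=%d parsed=%d avg=%.2f max=%d\n", total, parsed, avg, max
    }
  ' "$@"
}
```

For instance, `printf '0\n2\n1\n5\n' | parse_stats` prints `items=4 parsed=3 avg=2.67 max=5`. If your profile directory contains a `parse` file with `@`-separated fields, something like `cut -d@ -f4 parse | parse_stats` may work, but the field position is an assumption; check the profile's relations file before relying on it.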

Write up

Your write up should be a plain text file (not .doc, .rtf or .pdf) which includes the following:

  1. A description of the phenomena you improved in the choices file, including:
  2. A description of any tdl edits you made and what they are for.
  3. A description of your process for translating the MMT sentences and your documentation about which sentences may be impossible.
  4. A description of what happened when you tried the MT setup. What difficulties did you encounter and how did you resolve them? What output did you get?
  5. A description of the performance of your final grammar for this week on the test suite, as compared to your starting grammar (see details above).

Submit your assignment
