The goals for this lab are to get the grammars ready to have lots of success in MT, once we add transfer rules next week! Specifically, you will be doing the following:
For tdl editing, please practice incremental development: Test as frequently as you possibly can, both by compiling the grammar and by testing specific sentences.
From among the MT sentences, find a phenomenon that your grammar does not yet handle, or does not yet handle properly, and fix it. Post to Canvas by Tuesday what you're working on, with IGT, so I can provide guidance. Here are the sentences grouped by phenomena. The expectation is that for whichever phenomenon you pick this week, you'll have all of the associated sentences working (but this might not always be feasible; if not, please let me know over Canvas).
Dogs sleep; Dogs chase cars
I chase you
Dogs eat
The dogs don't chase cars
I think that you know that dogs chase cars; I ask whether you know that dogs chase cars
Cats and dogs chase cars; Dogs chase cars and cats chase dogs; Cats chase dogs and sleep
Do cats chase dogs
Hungry dogs eat; Dogs in the park eat; Dogs eat in the park
The dogs are hungry; The dogs are in the park; The dogs are the cats
The dog's car sleeps; My dogs sleep
Who sleeps; What do the dogs chase; What do you think the dogs chase; Who asked what the dogs chase; I asked what the dogs chased
The dog sleeps because the cat sleeps; The dog sleeps after the cat sleeps
Consult your descriptive materials for the phenomenon you chose, to understand how it is expressed in your language.
Add the relevant sentences to iso.txt and to your testsuite. The testsuite should also include relevant negative examples.
(If this was already done in previous labs for your phenomenon, that is okay.)
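As a quick sanity check on negative coverage, a minimal Python sketch along the following lines may help. It assumes you have already pulled your test items into (phenomenon, sentence, grammatical) tuples; that representation, the function name, and the English stand-in items are all hypothetical and are not the testsuite file format used in this course.

```python
from collections import defaultdict


def phenomena_missing_negatives(items):
    """Return the phenomena that have no ungrammatical (negative) item.

    `items` is an iterable of (phenomenon, sentence, grammatical) tuples.
    This tuple layout is an assumption made for illustration only.
    """
    judgments = defaultdict(set)
    for phenomenon, _sentence, grammatical in items:
        judgments[phenomenon].add(grammatical)
    # A phenomenon is flagged when no item marked grammatical=False was seen.
    return sorted(p for p, seen in judgments.items() if False not in seen)


# Hypothetical items; English stands in for your language here.
example_items = [
    ("negation", "The dogs don't chase cars", True),
    ("negation", "The dogs chase don't cars", False),
    ("coordination", "Cats and dogs chase cars", True),
]

print(phenomena_missing_negatives(example_items))
```

Any phenomenon the function reports is one where the testsuite cannot yet detect overgeneration, so it is a good place to add ungrammatical examples.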
I expect this to be done in collaboration with me, which is why I'm asking you to post by Tuesday indicating which phenomenon you are working on and how it is expressed in your language.
In some cases, I'll suggest that you get an analysis from the customization system and integrate it into your current grammar (or at least use it as a starting point). In others, I might directly suggest some tdl or a general approach.
You are welcome to start with a suggestion of how you'd like to handle it, but please don't dive into extensive implementation without discussing the approach with me.
As usual, please practice incremental development and test frequently (by compiling the grammar and testing individual sentences, as well as by running your full testsuite).
For each MMT sentence that goes through, look at the range of generator outputs. How much realized-string ambiguity are you getting? What are the sources of ambiguity?
For MMT sentences that don't go through for lack of transfer rules, try using the MMT system with your language as both input and output. How much realized-string ambiguity are you getting? What are the sources of ambiguity?
If the ambiguity relates to overgeneration (i.e. your grammar generating ungrammatical strings), we'll want to work on adding appropriate constraints. Please post to Canvas for assistance.
Likewise, if you have any ambiguity that relates to semantically empty things (affixes, words) that you didn't previously clean up, work on that some this week too. Please post to Canvas for assistance.
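One way to quantify realized-string ambiguity is to count, for each input, how many realizations the generator returned and how many of them are distinct strings. The sketch below assumes you have collected the generator outputs by hand into a dictionary keyed by input sentence; the data layout, the function name, and the example outputs are hypothetical and are not produced by the MMT system in this form.

```python
from collections import Counter


def summarize_ambiguity(outputs):
    """Map each input sentence to (total realizations, distinct strings).

    `outputs` maps an input sentence to the list of strings the generator
    returned for it. This layout is an assumption made for illustration.
    """
    summary = {}
    for source, realizations in outputs.items():
        # Collapse runs of whitespace so that spacing differences alone
        # do not count as separate realizations.
        normalized = [" ".join(r.split()) for r in realizations]
        counts = Counter(normalized)
        summary[source] = (len(normalized), len(counts))
    return summary


# Hypothetical generator outputs, keyed by input sentence.
example_outputs = {
    "Dogs sleep": ["dogs sleep", "dogs sleep"],
    "Dogs chase cars": ["dogs chase cars", "cars dogs chase", "dogs cars chase"],
}

for source, (total, distinct) in summarize_ambiguity(example_outputs).items():
    print(f"{source}: {total} realizations, {distinct} distinct strings")
```

A total that exceeds the distinct count means several derivations yield the same string, which often points to semantically empty material. A high distinct count points to word-order variation or overgeneration and is worth inspecting string by string.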
In this part of the lab, your task is to try each of the items in the MMT sentences with both sje and eng as source languages and your language as the target language. (Note that sje doesn't have all of the items.)
Following the same procedure as usual, do test runs over both the testsuite and the test corpus.
Collect the following information to provide in your write up:
Your write up should be a plain text file (not .doc, .rtf or .pdf) which includes the following:
svn export yourgrammar iso-lab8
(For git, please do the equivalent.)
tar czf iso-lab8.tgz iso-lab8