Ling 472/CSE 472: Introduction to Computational Linguistics
Spring 09
Final project information
Project specifications
- All projects must be evaluated in terms of precision and recall.
- The project must include an implementation of a baseline system.
Compare the results of your (better) system with the baseline.
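As an illustration of the comparison required above, here is a minimal sketch of computing precision and recall for a baseline and a system against the same gold standard. The function name `precision_recall` and the toy data are hypothetical; a real project would substitute its own output format and gold annotations.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for predicted items against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: each item is a (token_index, label) pair.
gold = {(0, "N"), (1, "V"), (2, "N"), (3, "ADJ")}
baseline_output = {(0, "N"), (1, "N"), (2, "N")}
system_output = {(0, "N"), (1, "V"), (2, "N"), (3, "N")}

for name, output in [("baseline", baseline_output), ("system", system_output)]:
    p, r = precision_recall(output, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Treating each output as a set of items makes the same function usable for both the baseline and the improved system, so the two sets of scores are directly comparable.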
Sample project ideas
- Create a morphological analyzer for a morphologically complex
language. Evaluate on short running text (held out for evaluation purposes).
Measure precision, recall, and ambiguity.
- Write a program that uses character n-grams to classify words according
to which language they are drawn from. For instance, the website
Ardalambion contains wordlists from
several artificial languages constructed by J. R. R. Tolkien, which you can use
as training and test data. Note: no student from Ling 471, Winter 2009 may
do this project.
- Write a program that classifies email texts according to whether
they indicate that a file should be attached.
- Write a program to transliterate some non-ASCII text to an
ASCII-based writing system. Stage one uses only completely regular
rules. Stage two allows for exceptions. Evaluate precision and recall
in both directions.
- Write a program that translates English (or any other natural language)
into SQL for a restricted domain. For instance, from a parse of the sentence
"what cities are located in Greece", generate
SELECT City FROM city_table WHERE Country = "Greece".
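To make the character n-gram idea from the list above more concrete, here is a minimal sketch of one possible approach: one bigram count model per language with add-one smoothing, choosing the language under which the word is most probable. This is only one of many reasonable designs. The function names, the boundary marker `#`, and the toy wordlists are all hypothetical and are not taken from any real wordlist.

```python
import math
from collections import Counter


def char_ngrams(word, n=2):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = "#" * (n - 1) + word.lower() + "#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(wordlists, n=2):
    """Count n-grams per language; wordlists maps language name -> list of words."""
    return {lang: Counter(g for w in words for g in char_ngrams(w, n))
            for lang, words in wordlists.items()}


def classify(word, models, n=2):
    """Return the language whose add-one-smoothed n-gram model scores the word highest."""
    vocab = set().union(*models.values())

    def log_prob(counts):
        # The extra +1 reserves probability mass for n-grams unseen in training.
        total = sum(counts.values()) + len(vocab) + 1
        return sum(math.log((counts[g] + 1) / total) for g in char_ngrams(word, n))

    return max(models, key=lambda lang: log_prob(models[lang]))


# Hypothetical toy wordlists (invented strings, not real data).
models = train({
    "lang_a": ["lana", "mala", "nalo", "alan"],
    "lang_b": ["krut", "turk", "gruk", "kurt"],
})
print(classify("lama", models))
print(classify("truk", models))
```

A natural baseline for this task would be to always guess the language with the most training words; the n-gram system can then be scored against it on held-out words.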
Project presentation and write-up
- Initial project plan due 5/8, specifying the task to be attempted,
the data to be used, the baseline, and the means of measuring precision, recall,
and any additional metrics.
- An outline of the project write-up due 6/1.
- Presentation/demonstration of your working system in class 6/3 or
6/5.
- Final project in executable state + write-up due 2:30pm on 6/8.
- Write-up requirements (8 pages, double-spaced):
- Background: What is the problem, and how are you approaching it?
- Data: What data are you using, where did you get it, what is
your gold standard, how is your data divided between training and test?
- Methodology: How did your system work?
- Results: What is your baseline? Precision and recall for baseline
and system, comparison to baseline, and F-measure or
an additional measure if applicable.
- Discussion: What are the implications of this project for
broader inquiry in computational linguistics?
Group work
- You are encouraged to work in pairs.
- For partner projects:
- The project plan must include a description
of how the work will be allocated.
- The project write-up must include a clear description of who did what.