Natural language processing (NLP) enables computers to make use of data represented in human language (including the vast quantities of data available on the web) and enables people to interact with computers on human terms. Applications from machine translation to speech recognition and web-based information retrieval demand both precision and robustness from NLP technology. Meeting these demands will require better hand-built grammars of human languages combined with sophisticated statistical processing methods. This class focuses on the implementation of linguistic grammars, drawing on a combination of sound grammatical theory and engineering skills.
Class meetings will alternate between lectures and discussion sessions. We will cover the implementation of constraints in morphology, syntax and semantics within a unification-based lexicalist framework of grammar. Weekly exercises will focus on building up an implemented grammar for a language (everyone must work on a different language, so be prepared to work with a language you don't know well!), based on the LinGO Grammar Matrix. At the end of the quarter, we will use the various grammars in a machine translation task. Since 2019 we have been testing out the software of the AGGREGATION project and starting from automatically constructed grammar specifications.
Prerequisites: Linguistics 566 or equivalent. No programming experience is required.
If you have already established accommodations with Disability Resources for Students (DRS), please communicate your approved accommodations to me at your earliest convenience so we can discuss your needs in this course.
If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include but are not limited to: mental health, attention-related, learning, vision, hearing, physical or health impacts), you are welcome to contact DRS at 206-543-8924 or uwdrs@uw.edu or depts.washington.edu/uwdrs/. DRS offers resources and coordinates reasonable accommodations for students with disabilities and/or temporary health conditions. Reasonable accommodations are established through an interactive process between you, your instructor(s) and DRS. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law.
Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW's policy, including more information about how to request an accommodation, is available at Faculty Syllabus Guidelines and Resources. Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form available at https://registrar.washington.edu/students/religious-accommodations-request/.
[Note from Emily: The above language is all language suggested by UW and in the immediately preceding paragraph in fact required by UW. I absolutely support the content of both and am struggling with how to contextualize them so they sound less cold. My goal is for this class to be accessible. I'm glad the university has policies that help facilitate that. If there is something you need that doesn't fall under these policies, I hope you will feel comfortable bringing that up with me as well.]
Weekly lab exercises, typically assigned on Fridays and due by the following Friday night. Course time on Wednesdays will be used for discussion of the exercises, so please work on them ahead of time and bring questions. Lab exercises will require write-ups to explain the phenomena as manifested in your language and how you implemented your analysis. Active class participation will be viewed favorably when it comes to grading.
Everyone will complete Lab 1 individually, but students are expected to work in pairs starting with Lab 2. Partners will alternate doing the write-up portion of the lab, and the labs for which a student did the write-up will be weighted more heavily in that student's final course grade.
Lab exercises are to be turned in via Canvas.
Under construction---will be updated.
All course recordings will be posted on our Canvas page. If I'm slow to make them available there, please ping me over the Canvas discussions.
Dates | Lecture | Lab | Due date | Reading |
---|---|---|---|---|
1/6, 1/8 | Overview, Introduction LKB Formalism | Lab 1 | 1/10 | Opt: Copestake Ch 1-3 |
1/13, 1/15 | Testsuites, [incr tsdb()] Grammar Matrix | Lab 2: Testsuites/customization I | 1/17 | Opt: Copestake Ch 4-5, Req: Bender et al 2010 (in 'Files' Canvas) |
1/22 | Morphotactics in the Matrix, Lab 3 phenomena | Lab 3: Testsuites/customization II | 1/24 | Goodman 2013 |
1/27, 1/29 | Lab 4 phenomena | Lab 4: Testsuites/customization III: Three more phenomena TBD | 1/31 | |
2/3, 2/5 | MRS | Lab 5 | 2/7 | Copestake et al 2005, especially Sec 3 |
2/10, 2/12 | The LOGON MT architecture, VPM | Lab 6: VPM work, MMT-driven tdl editing, one more phenomenon | 2/14 | |
2/19 | tdl style, non-verbal predicates | Lab 7: Non-verbal predicates | 2/21 | |
2/24, 2/26 | Reflections, [incr tsdb()] demo, etc | Lab 8: MMT-driven tdl editing | 2/28 | |
3/3, 3/5 | Transfer rules | Lab 9: MT | 3/7 | |
3/10, 3/12 | The Grammar Matrix: Future directions | Machine Translation Extravaganza Course evals | Nope | |