Linguistics 567: Knowledge Engineering for NLP

An elective course in UW's Professional Master's in Computational Linguistics

Winter 2007

(Course websites from previous years: 2006 2005)

Course Info

  • Required Text:
  • Software

    This software is available in the Treehouse. You can also install it on your home machines. For some of the functionality (in particular the test suite management software [incr tsdb()] aka "the fine system"), you will need to be running Linux. If you have a machine with an Intel chip (i.e., a Windows machine or a recent Mac), you can try using Knoppix. Alternatively, on a recent Mac, you can run a Linux virtual machine with Parallels.

    Instructor Info

    Links

    Syllabus

    Description

    Natural language processing (NLP) enables computers to make use of data represented in human language (including the vast quantities of data available on the web) and enables people to interact with computers on human terms. Applications from machine translation to speech recognition and web-based information retrieval demand both precision and robustness from NLP technology. Meeting these demands will require better hand-built grammars of human languages combined with sophisticated statistical processing methods. This class focuses on the implementation of linguistic grammars, drawing on a combination of sound grammatical theory and engineering skills.

    Class meetings will alternate between lectures and discussion sessions. We will cover the implementation of constraints in morphology, syntax and semantics within a unification-based lexicalist framework of grammar. Weekly exercises will focus on building up an implemented grammar for a language of your choice (everyone must work on a different language, so be prepared to work with a language you don't know well!), based on the LinGO Grammar Matrix. At the end of the quarter, we will use the various grammars in a machine translation task.
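
    For readers new to the term, "unification-based" can be illustrated with a toy sketch. The Python below is not how the LKB or the Grammar Matrix works internally: it assumes feature structures are plain nested dicts with atomic string values, and it leaves out types, type hierarchies, and reentrancy. The function name `unify` and the feature names (HEAD, AGR, PER, NUM) are hypothetical, chosen only for illustration.

```python
def unify(a, b):
    """Unify two toy feature structures.

    Feature structures are nested dicts; atomic values are strings.
    Returns the combined structure, or None if the two conflict.
    (Simplification: no types, no reentrancy.)
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so the inputs are not modified
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value  # information present only in b
        return result
    # Atomic values (or a dict against an atom) unify only if they are equal.
    return a if a == b else None


verb = {"HEAD": "verb", "AGR": {"PER": "3rd"}}
subject = {"AGR": {"PER": "3rd", "NUM": "sg"}}
print(unify(verb, subject))
print(unify({"AGR": {"NUM": "sg"}}, {"AGR": {"NUM": "pl"}}))
```

    Compatible information from both structures is merged, and a clash (here, singular versus plural) makes unification fail. This is the basic mechanism by which constraints such as agreement are enforced in this family of frameworks.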

    Prerequisites: Linguistics 566 or equivalent. No programming experience is required.

    Note: To request academic accommodations due to a disability, please contact Disabled Student Services, 448 Schmitz, 206-543-8924 (V/TTY). If you have a letter from Disabled Student Services indicating that you have a disability which requires academic accommodations, please present the letter to the instructor so we can discuss the accommodations you might need in this class.

    Requirements

    Weekly lab exercises, typically assigned on Mondays and due by Sunday night. Course time on Mondays will be used for discussion of the exercises, so please work on them ahead of time and bring questions. (I'd make the deadline Friday, but would rather not spend the weekend grading things...) Most lab exercises will require write-ups to explain the phenomena as manifested in your language and how you implemented your analysis. Active class participation will be viewed favorably when it comes to grading.

    Lab exercises are to be turned in via Catalyst E-Submit:

    Schedule of Topics and Assignments (tentative)

    | Dates | Lecture | Lab | Due date | Reading |
    |---|---|---|---|---|
    | 1/3 | Overview, Introduction; LKB Formalism | Lab 1: Getting to know the LKB; Choose language | 1/9 (Tue, not Sun) | Ch 1-3 |
    | 1/8, 1/10 | Testsuites, [incr tsdb()] | Lab 2: Constructing a testsuite | 1/14 | Oepen & Flickinger 1998 |
    | 1/17 (No class 1/15) | The Grammar Matrix: Motivations, technical details, ODIN | Lab 3: Finishing the testsuites | 1/21 | Bender, Flickinger & Oepen 2002; Bender & Flickinger 2005 |
    | 1/22, 1/24 | Minimal Recursion Semantics | Lab 4: Starter grammar, vocab, initial test suite run | 1/28 | Ch 4, 5; Copestake, Flickinger, Pollard, and Sag, 2005 |
    | 1/29, 1/31 | Matrix tour, cont.; Case, Agreement | Lab 5: Case, Agreement | 2/4 | Flickinger & Bender 2003 |
    | 2/5, 2/7 | Argument optionality, modification, discourse status | Lab 6: Argument optionality, modification, discourse status | 2/11 | Borthen and Haugereid 2005, available through UW Library's Electronic Journals (Research on Language and Computation 3(2):221-246) |
    | 2/12, 2/14 | Clause types and illocutionary force, Precision grammars and corpus data | Lab 7: Polar questions, embedded clauses | 2/18 | Baldwin et al 2005 |
    | 2/21 (No class on 2/19) | Raising, control, argument composition, sentential negation | Lab 8: I can eat glass. It doesn't hurt me. | 2/25 | |
    | 2/26, 2/28 | The LOGON MT architecture | Lab 9: Grammar clean up; Transfer rules | 3/4 | Oepen et al 2004 |
    | 3/5, 3/7 | The Grammar Matrix: Future directions | Machine Translation Extravaganza; Course evals | Nope | |

    ebender at u dot washington dot edu
    Last modified: Dec 29 2006