Ling 472/CSE 472: Introduction to Computational Linguistics
Spring 2019

Course Info

Instructor Info

  Olga Zamaraeva Sara Ng
Office Hours: T 12:30-1:20
Th 12:30-1:20
W 2:30-3:20
F 12:30-1:20
Office Location: Guggenheim 407 Guggenheim 407
Email: olzama at uw sbng at uw

Syllabus

Description

Goals: By the end of this course, you will:

Computational linguistics is a broad field incorporating research and techniques for processing language with computers at all levels of linguistic structure. In this class, we will survey various topics and tasks in computational linguistics focusing on linguistic structure. While we will cover some of the basics of Natural Language Processing (which we will consider a separate subfield), this class will not focus on one specific approach (such as deep learning). Students in this class are expected to have a background in either computer science or linguistics, but not necessarily both. Expect this class to be difficult at times and easy at others. We hope to offer something new and interesting for everyone.

Note: To request academic accommodations due to a disability, please contact Disabled Student Services, 448 Schmitz, 206-543-8924 (V/TTY). If you have a letter from Disabled Student Services indicating that you have a disability which requires academic accommodations, please present the letter to the instructor so we can discuss the accommodations you might need in this class.

Requirements

Students are expected to complete the assigned readings before each lecture. Lecture and Lab/Section will connect with the readings, but not everything in the readings will be covered in lecture. Homework assignments and exams may nonetheless cover material in the readings not gone over in class.

All homework assignments and the final project will include a significant writing component, weighted at or near 1/2 of the assignment grade. Be sure to save time to do a careful job on your write-up.

We expect all write-ups to be turned in as PDF files, even if they started as plain text files that we gave you.

Collaboration policy: Students are encouraged to work with each other on the homework, both in small groups and by posting and answering questions on Canvas. However, each student must turn in their own answers (both code and write-up). No copying or sharing of code or prose is allowed. Also, students who have collaborated must acknowledge the collaboration in their write-ups (e.g. "I discussed this problem with Kim Smith/with classmates on Canvas as we were working on it.").

Plagiarism policy: Plagiarism is strictly forbidden. The offender will get 0 points for the plagiarized assignment and will be reported to the University. NB: It is very easy to detect not only plagiarized text but also a (piece of a) program, or even a mathematical solution that was adapted from something posted on the internet. Just don't. Submit your own solution, and rest assured, it will be unique!

Late homework policy: Unless prior arrangements are made, homework turned in late but within 24 hours of the deadline will be graded at 80% credit, homework turned in between 24 and 48 hours after the deadline will be graded at 70% credit, and homework turned in later than that will not be graded. No late final projects will be accepted.
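As an illustration only, the late homework policy can be sketched as a small Python function. The names `late_credit` and `adjusted_score` are hypothetical, and the handling of the exact 24-hour and 48-hour boundaries is an assumption (the policy text does not say which tier they fall in). This is not an official grading script, and it does not apply to final projects, which are not accepted late.

```python
def late_credit(hours_late: float) -> float:
    """Fraction of credit for homework turned in `hours_late` hours after the deadline.

    Assumption: exactly 24 hours late counts as "within 24 hours", and
    exactly 48 hours late counts as "between 24 and 48 hours".
    """
    if hours_late <= 0:
        return 1.0   # on time: full credit
    if hours_late <= 24:
        return 0.8   # late, but within 24 hours of the deadline
    if hours_late <= 48:
        return 0.7   # between 24 and 48 hours late
    return 0.0       # later than that: not graded


def adjusted_score(raw_score: float, hours_late: float) -> float:
    """Scale a raw homework score by the late-credit fraction."""
    return raw_score * late_credit(hours_late)


if __name__ == "__main__":
    # Show the credit tier for a few sample submission times (in hours late).
    for hours in (0, 5, 30, 50):
        print(hours, late_credit(hours))
```

For example, under these assumptions a submission that arrives 30 hours after the deadline falls in the 70% tier, and one that arrives 50 hours after the deadline receives no credit.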

Grades will be based on:

Schedule of Topics and Assignments (tentative)

Date Topic Reading Due
4/2 Introduction & logistics J&M Ch 1; (optional: L&C Ch 1)
4/4 Topics overview
Regular expressions, Formal languages, FSA
J&M Ch 2 (through 2.2.4); 16.1; 12.2; (optional: L&C 4.4)  
4/5 Lab/Section Slides
Cheat Sheet
Unix Tutorial
Assignment 0
4/9 FSA; FST for morphology and phonology For hw2: foma tutorial
J&M Ch 2.3-2.4; Ch 3.0-3.1.2; 3.2-3.4.0; 3.5; 11.1
 
4/11 Intonation (phonetics; guest lecture)
FST contd (regular lecture)
Optional: J&M Ch 7, 8, 9 (skim or read any parts you find approachable); L&C Ch 1.4
4/12 Lab/Section Slides Assignment 1
4/16 Machine Learning. Bird's-eye view. T. Mitchell. Key ideas in ML. (2017)
4/18 Evaluation and Error analysis Resnik & Lin, 2010
Kummerfeld et al. (2012)
 
4/19 Lab/Section Tutorial Form
Assignment 2
4/23 Language models, N-grams J&M Ch 4 (through 4.4); (optional: L&C pp 26-28)  
4/25 N-gram models, smoothing and discounting J&M Ch 4.5-4.9, 4.12  
4/26 Lab/Section Slides Project Milestone 1
4/30 Guest lecture: NLP for cancer research  
5/2 Midterm    
5/3 Lab/Section  
5/7 Grammars, Parsing, and CFG - overview J&M Ch 12 (optional: L&C pp 50-58)  
5/9 CFG parsing algorithms J&M 13  
5/10 Lab/Section   Assignment 3
5/14 Probabilistic CFG parsing J&M Ch 14 through 14.5; 14.10  
5/16 Syntactic Theory, Unification J&M Ch 15 through 15.3; 15.6
5/17 Lab/Section
Classroom moved to ART 317
shell.py Project Milestone 2
5/21 Unification (contd.); Grammar Engineering either: J&M Ch 15.4-15.5
or: Bender (2008) (through at least 2.3)
 
5/23 Computational semantics J&M Ch. 17-18  
5/24 Lab/Section   Assignment 4
5/28 Word vectors J&M ch 6 new ed.; J&M ch 7 new ed. (through 7.2)
 
5/30 Deep Learning. Guest lecture. TBA  
5/31 Lab/Section   Assignment 5, Part 1
Project Milestone 3 (revisions to Milestone 2)
6/4 Ethics, design and NLP Hovy & Spruit (2016)
Nathan et al 2007
Bolukbasi et al 2016
 
6/6 Presentations  
6/7 Lab/Section: More presentations   Assignment 5, Parts 2 & 3
6/13 (Thursday)     Final projects due 11:59 pm

