Syllabus for Linguistics 570

Shallow Processing Techniques for Natural Language Processing

Autumn 2008 

 

 

Professor:

William Lewis

Time & Location:

MW 4:30-5:50, LOW 202

 

 

Office:

LOW 202 (for now)

Hours:

M 6-7

 

 

e-mail:

wlewis2 at u

Please include "570" in the subject line.

 

TA:

Bill McNeill

e-mail:

billmcn at u

Hours:

F 2-3, Art 337

 

 

Course Description:

 

Techniques and algorithms for associating relatively surface-level structures and information with natural language corpora, including: POS tagging, morphological analysis, preprocessing/segmentation, named entity recognition, chunk parsing, and word-sense disambiguation.  Linguistic resources that can be leveraged for these tasks (e.g., WordNet).

 

Course Texts:

 

Manning & Schütze (1999). Foundations of Statistical Natural Language Processing.  Cambridge, Mass:  MIT Press.

Jurafsky & Martin (2006). Speech and Language Processing, 2nd Edition.  Prentice-Hall.

 

Other Materials:

 

Miscellaneous readings as required.

 

Prerequisites:

Ling 200 or equivalent introductory linguistics course

Ling 473 (Basics for Computational Linguistics) or placement exam

CS 326 (Data Structures) or equivalent

Stat 391 (Prob. and Stats for CS) or equivalent

Programming in Perl, C, C++, Java, or Python

 

Grading:

Homework assignments: 50%

Projects:  40%

Class participation: 10%


Tentative Course Schedule:

Day

Date

Topic

Reading/Homework Assignments

1

Sep 24

(Slides)

Introduction

Overview:

- Shallow Approaches to Natural Language Processing

- Basics

Corpora:

- Utility in Lang. Processing

- Types of Corpora

Brief overview of FSA/FSTs

Review:  Read M&S Ch. 3, J&M Ch. 2

Overview:  Read J&M Ch. 1

HW #1

2

Sep 29

(Slides)

Review of HW#1

Overview, Corpora, Evaluation (cont’d)

- Methods for evaluation

Morphological Processing

- Tokenization

- Stemming

- Evaluating Stemmers

Tools for Stemming

M&S Ch. 1,4

M&S Ch. 8: § 8.1

J&M Ch. 3: § 3.1, § 3.2, § 3.4, § 3.5

Abney 1996

HW #2

3

Oct 1

(Slides)

Markov Models:

- Adding weights and probabilities

- Morphological processing

- POS Tagging

Sang 1998, Ch. 2, § 1.1

Marcus et al 1993, § 1 & 2

Charniak 97 Ch 3: § 3.2, 3.3

M&S Ch. 9: § 9.2-9.3

4

Oct 6

(Slides)

 

HMMs:

- POS Tagging

M&S Ch. 9: § 9.4

M&S Ch. 10: § 10.1-10.3

 

5

Oct 8

(Slides)

 

HMMs:

- POS Tagging

- DP algorithms (review)

- Viterbi (review and application)

Project #1 (HTML|PDF)

J&M Ch. 5: § 5.5

J&M Ch. 6: § 6.4

Review:  M&S Ch. 10: § 10.2.2

6

Oct 13

(Slides)

 

Finish review of Project 1, Viterbi

POS Tagging

- Other methods for POS Tagging

- Evaluating Taggers

Remainder of M&S Ch. 10

Roche & Schabes 1995, through section 7 (inclusive)

7

Oct 15

(Slides)

Smoothing

 

J&M Ch 6: § 6.5

M&S Ch 6: § 6.2.5

8

Oct 20

(Slides)

Class cancelled due to illness

M&S § 1.4, 2.2

J&M Ch. 4: § 4.1 & 4.2

 

9

Oct 22

(Slides)

N-gram models

- N-grams and HMMs

- N-gram models of language

Language Identification

- N-gram models for Language ID

- Hybrid models for Language ID

N-gram models

- Estimators

- Entropy/perplexity (intro)

- Evaluation of language models

Cavnar and Trenkle 1994

J&M Ch. 4: § 4.10

HW#3

10

Oct 27

(Slides)

Project 1 & HW#3 review

Continuation on entropy:

- Cross-entropy

- Perplexity

 

M&S Ch 2: § 2.2.5 - 2.2.8

11

Oct 29

(Slides)

Shallow Parsing

- Text Chunking

- Phrasal Identification

 

 

Bird & Loper 2005 (see dropbox on patas)

- Concentrate on general chunking issues (ignore NLTK specifics)

Optional:  Molina & Pla 2002

HW #4

12

Nov 3

(Slides)

 

 

Word Sense Disambiguation

Information Retrieval

 

M&S Ch. 7: § -7.2.1, 7.3.2, 7.5

Ide and Veronis 1998

 

13

Nov 5

(Slides)

 

WSD (cont’d)

IR (cont’d)

VS Model discussion, clustering

J&M Ch 23: § 23.1

Project #2

 

14

Nov 10

(Slides)

 

The bigger context:

- Question Answering

- TREC Competition

Tools for IR

- Lucene

Jinguji et al 2006

Dang et al 2006

Optional:  Wang et al 2008

15

Nov 12

(Slides)

Clustering discussion

IR issues, weighting

M&S Ch 14: through 14.2.1 (inclusive)

 

16

Nov 17

(Slides)

Classification discussion

Basic Machine Learning issues

M&S: Ch 16

 

17

Nov 19

Slides: Day 16 cont’d

Classification discussion (cont’d)

Xia & Lewis 2008

18

Nov 24

(Slides)

WSD (cont’d)

- WordNet and Disambiguation

- Semantic Distance and Disambiguation

J&M Ch 20: through § 20.2

Miller 1995

Mihalcea 2002

McCarthy et al 2004

19

Nov 26

(Slides)

Word Sense Disambiguation

- Naïve Bayes Classifiers

M&S Ch. 2: § 2.2-2.3, 7.2.2-7.3

Chen et al 1998

HW #5

20

Dec 1

(Slides)

Named Entity Recognition

- Named entity tagging

- Evaluating and comparing contexts

 

Named Entity Recognition

- Clustering of NE pairs

- Evaluation of NE Systems

Tools for NER

- LingPipe

Hasegawa et al 2004

Minkov et al 2005

 

21

Dec 3

(Slides)

The Big Challenge

HW #6 (PDF|Word)

 

 


Bibliography (for those documents that can’t be found online):

 

Charniak, E. (1997).  Statistical Language Learning.  Cambridge, Mass:  MIT Press.