Ling 573 - Natural Language Processing Systems and Applications
Spring 2014
Deliverable #4: Final Question-Answering System
Code, outputs, and scores: due May 30, 2014, 23:59
Final reports: due June 10, 2014, 23:59
Goals
In this deliverable, you will complete development of your question-answering
system. You will
- Refine and finalize your end-to-end question answering system.
- Improve extraction of answer strings.
- Exploit information from any source, including answer type, web snippet redundancy, and other answer-extraction techniques, to improve your results (see the sketch after this list).
- Perform final evaluation on a held-out test set. NOTE: The held-out test set includes BOTH new questions AND new documents. Also, the document DTD has changed, moving the location of the document identifier!
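As one illustration of the redundancy idea, candidate answers that recur across many retrieved web snippets can be favored (simple redundancy voting, in the spirit of AskMSR). A minimal sketch, where the extractor candidates_in is a hypothetical function mapping one snippet to candidate answer strings:

    from collections import Counter

    def redundancy_vote(snippets, candidates_in):
        """Rank candidate answers by how often they recur across the
        retrieved web snippets (simple redundancy voting)."""
        votes = Counter()
        for snippet in snippets:
            votes.update(candidates_in(snippet))   # hypothetical extractor
        return votes.most_common()                 # [(candidate, count), ...]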
System Enhancement
This final deliverable must include substantive enhancements beyond your
baseline system and further extensions over your D3 system.
Answer Extraction
For this deliverable, you will continue to refine your previous
approach to achieve better answer extraction. Given the limited
time in the course, we will not require Jeopardy!-style
answers, but will continue to allow pattern-matching responses, possibly with some neighboring text.
Specifically, for each question, you should produce the 20 best answer snippets, each no longer than 250 characters (see the sketch below).
You may build on techniques presented in class,
described in the reading list, and proposed in other research articles.
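For instance, enforcing the rank and length limits at output time could look like the following minimal sketch; candidate scoring is assumed to happen elsewhere in your pipeline:

    def top_snippets(scored_candidates, n=20, max_len=250):
        """scored_candidates: (score, snippet) pairs in any order.
        Returns the n best snippets, each truncated to max_len characters."""
        ranked = sorted(scored_candidates, key=lambda pair: pair[0], reverse=True)
        return [snippet[:max_len] for _, snippet in ranked[:n]]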
Data
Document Collections
Training and Devtest Corpus
The Aquaint Corpus was employed as the document
collection for the question-answering task for a number of years,
and will form the basis of retrieval for part of this deliverable.
The collection can be found on patas in /corpora/LDC/LDC02T31/.
Evaltest Corpus
The Aquaint-2 Corpus was employed as the document
collection for the question-answering task for the most recent years
and will form the basis of retrieval for the final evaluation of this deliverable.
The collection can be found on patas in /corpora/LDC/LDC08T25/.
Note: The DTD differs somewhat from the original Aquaint corpus
structure, most importantly in that the document identifier appears as an
attribute of the DOC element rather than in a separate DOCNO element.
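One lightweight way to extract document identifiers from either corpus is a regular-expression pass. The sketch below assumes the two layouts shown in its comments, so verify them against the actual DTDs:

    import re

    # Original Aquaint:  <DOC> <DOCNO> APW19980601.0001 </DOCNO> ...
    # Aquaint-2:         <DOC id="APW_ENG_20041101.0001" type="story"> ...
    DOCNO_RE    = re.compile(r'<DOCNO>\s*(\S+)\s*</DOCNO>')
    DOC_ATTR_RE = re.compile(r'<DOC\s+id="([^"]+)"')

    def doc_ids(sgml_text):
        """Return the document identifiers found in one corpus file,
        whichever of the two layouts it uses."""
        return DOC_ATTR_RE.findall(sgml_text) or DOCNO_RE.findall(sgml_text)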
Training Data and Development Test Data
Training Data
You may use any of the TREC question collections through 2005
for training your system. For 2003, 2004, and 2005 there are prepared gold standard
documents and answer patterns to allow you to train and tune your
Q/A system.
All pattern files appear in /dropbox/13-14/573/Data/patterns.
All question files appear in /dropbox/13-14/573/Data/Questions.
Training data appear in the training subdirectories.
Development Test Data
You should perform your development test evaluation on the TREC-2006 questions and their corresponding documents and answer string patterns. You are only required to test on the factoid questions.
Evaluation Test Data
- Question set: available as of 5/20
- Answer patterns: to be posted 5/27
You should perform your final evaluation on the TREC-2007 questions and their corresponding documents and answer string patterns. You are only required to test on the factoid questions.
NOTE: Please do NOT tune on these questions.
This data is/will be in the corresponding 'evaltest' directories.
Outputs
Create two (2) output files in the outputs directory, based on running your question-answering system on
the 2006 devtest data file and the 2007 evaluation test data file.
They should be named QA.outputs_year,
- where year is 2006 or 2007.
Evaluation
You will compute
MRR (Mean Reciprocal Rank), strict and lenient, of your devtest and evaltest
runs.
These scores should be placed in the results directory, in files called QA.results_year_type,
- where year is 2006 or 2007, and
- type is 'strict' or 'lenient'.
A simple script for calculating MRR based on the Litkowski pattern files
and your outputs is provided in /dropbox/13-14/573/code/compute_mrr.py.
It should be called as follows:
python2.6 compute_mrr.py pattern_file QA.outputs {type}
where
- pattern_file is the factoid Litkowski pattern file,
- QA.outputs is your question-answering output file, and
- type is 'strict' or 'lenient'. If you omit the type, it defaults to 'strict'.
Completing the project report
This final version should include all required sections, as well as
a complete system architecture description and a proper bibliography
that includes all and only the papers you have actually referenced.
See this document for full details.
The project report must also include a substantive error analysis.
Please name your report D4.pdf.
Presentation
Your presentation may be prepared in any computer-projectable format,
including HTML, PDF, PPT, and Word. Your presentation should take
about 10-15 minutes to cover your main content, including:
- Your overall system, emphasizing refinements to answer extraction.
- Discussion of your error analysis.
- Issues and successes.
- Related reading that influenced your approach.
Your presentation should be deposited in your doc directory,
but it is not due until the actual presentation time. You may continue
working on it after the main deliverable is due, and create a new tag (e.g. 4.1) for the corresponding release.
Summary
- Finish coding and document all code.
- Verify that all code runs effectively on patas using Condor (a sample submit description appears after this list).
- Add any specific execution or other notes to a README.
- Create your D4.pdf and add it to the doc directory.
- Verify that all components have been added and any changes checked in.
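As a starting point, a minimal Condor submit description might look like the sketch below; the driver script run_qa.sh and the file names are hypothetical, so substitute your own:

    # D4.cmd -- minimal Condor submit description (illustrative names)
    universe   = vanilla
    executable = run_qa.sh
    getenv     = true
    output     = run_qa.out
    error      = run_qa.err
    log        = run_qa.log
    queue

Submit it with condor_submit D4.cmd and check progress with condor_q.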