Ling 573 - Natural Language Processing Systems and Applications
Spring 2014
Deliverable #3: System Improvement: Question Processing:
Code and Results: Due May 16, 2014: 23:59
Updated Project Report: Due May 20, 2014: 09:00
Goals
In this deliverable, you will continue development and improvement of your question-answering
system. You will
- Enhance your question-answering system by improving question processing.
- Perform additional question processing to aid passage retrieval and answer extraction.
- Explore QA taxonomies, such as those developed by ISI and UIUC.
- Continue improving your baseline QA system.
- Identify the resources - software and corpus - that can support this task.
Question processing
For this deliverable, one focus will be on improving your baseline question
answering through enhanced question processing. Question processing
techniques include:
- Question classification
- Question reformulation
- Query expansion
- Question series handling
You may build on techniques presented in class,
described in the reading list, and proposed in other research articles.
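As one illustration of question classification, the sketch below shows a minimal keyword-based classifier over the six Li and Roth coarse classes (ABBR, DESC, ENTY, HUM, LOC, NUM). The rules here are illustrative only; a classifier trained on the UIUC data will generally perform better.

# A minimal sketch of rule-based coarse question classification using the
# six Li & Roth coarse classes. The keyword rules are illustrative, not a
# complete classifier.
import re

COARSE_RULES = [
    (r'^(who|whose|whom)\b', 'HUM'),
    (r'^where\b', 'LOC'),
    (r'^(when|how (many|much|long|far|old))\b', 'NUM'),
    (r'^(what|which) (city|country|state|place)\b', 'LOC'),
    (r'^(why|what is|what are)\b', 'DESC'),
    (r'stand for\b', 'ABBR'),
]

def classify_question(question):
    """Return a coarse question class, defaulting to ENTY."""
    q = question.lower().strip()
    for pattern, label in COARSE_RULES:
        if re.search(pattern, q):
            return label
    return 'ENTY'

if __name__ == '__main__':
    print(classify_question("Who wrote Hamlet?"))            # HUM
    print(classify_question("How many moons does Mars have?"))  # NUM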
Additional resources
Some additional resources are available for question classification. You
may build on the Question Classification Taxonomy developed by Li and Roth at UIUC. Their site includes
training data, as well as the taxonomy itself.
Additional annotated question data is also available on patas.
The data can be found in the /dropbox/13-14/573/Data/Question_classification/training/ directory on patas. The directory contains files with offset question annotation; each *-tagged.txt file is associated with
the .xml file containing the original TREC questions.
The *-tagged.txt files contain coarse-grained tags in the Li and Roth style. The lines are of the form:
Question_ID\tQuestion_type.
For 'OTHER' type questions, the Question_type field is intentionally left blank.
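A minimal sketch for reading these annotations, assuming only the tab-separated format described above (the filename in the usage comment is illustrative):

# Load the *-tagged.txt question-type annotations into a dictionary.
def load_question_types(path):
    """Return a dict mapping question IDs to coarse question types."""
    types = {}
    with open(path) as f:
        for line in f:
            line = line.rstrip('\n')
            if not line:
                continue
            fields = line.split('\t')
            qid = fields[0]
            # A blank (or missing) type field marks an 'OTHER' question.
            qtype = fields[1] if len(fields) > 1 and fields[1] else 'OTHER'
            types[qid] = qtype
    return types

# Example (illustrative filename):
# qtypes = load_question_types('TREC-2004-tagged.txt')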
Data
Document Collection
The AQUAINT Corpus was employed as the document
collection for the question-answering task for a number of years,
and will form the basis of retrieval for this deliverable.
The collection can be found on patas in /corpora/LDC/LDC02T31/.
Training Data
You may use any of the TREC question collections through 2005
for training your system. For 2003, 2004, and 2005 there are prepared gold standard
documents and answer patterns to allow you to train and tune your
Q/A system.
All pattern files appear in /dropbox/13-14/573/Data/patterns.
All question files appear in /dropbox/13-14/573/Data/Questions.
Training data appear in the training subdirectories.
Development Test Data
You should evaluate on the TREC-2006 questions and their corresponding documents and answer string patterns. You are only required to test on the factoid questions. Development test data appears in the devtest subdirectories.
Evaluation
You will employ the standard mean reciprocal rank (MRR) measure to evaluate
the results from your baseline end-to-end question-answering system.
These scores should be
placed in files called D3.results_strict and D3.results_lenient in the results directory. A simple script for calculating MRR based on the Litkowski pattern files
and your outputs is provided in /dropbox/13-14/573/code/compute_mrr.py.
It should be called as follows:
python2.6 compute_mrr.py pattern_file D3.outputs {type} where
- pattern_file is the factoid Litkowski pattern file,
- D3.outputs is your passage retrieval output file, and
- type is "strict" or "lenient". If you omit the type, it defaults to "strict".
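The provided compute_mrr.py is the authoritative scorer; the sketch below only illustrates how MRR is computed from ranked answer candidates and gold answer regex patterns.

# MRR sketch: for each question, take the reciprocal of the rank of the first
# candidate matching a gold answer pattern, then average over all questions.
import re

def mrr(answers_by_qid, patterns_by_qid):
    """answers_by_qid: qid -> ranked list of answer strings.
    patterns_by_qid: qid -> list of regex patterns for correct answers."""
    total = 0.0
    for qid, patterns in patterns_by_qid.items():
        for rank, answer in enumerate(answers_by_qid.get(qid, []), start=1):
            if any(re.search(p, answer) for p in patterns):
                total += 1.0 / rank
                break
    return total / len(patterns_by_qid) if patterns_by_qid else 0.0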
Outputs
Create one output file in the outputs directory, based on running your baseline question-answering system on
the test data file.
You should do this as follows:
- Answer Extraction and Ranking
- You should return the top 20 answer candidates, where a candidate is
no longer than 250 characters in length. The required format for
the answer extraction phase appears here.
The file should be named D3.outputs and should appear in the outputs directory.
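The sketch below illustrates one way to enforce the candidate-count and length limits above when writing D3.outputs. The exact line format must follow the answer extraction format referenced above; the tab-separated formatting used here is only a placeholder.

# Enforce at most 20 candidates per question, each at most 250 characters.
MAX_CANDIDATES = 20
MAX_LENGTH = 250

def write_outputs(ranked_candidates_by_qid, path='outputs/D3.outputs'):
    """ranked_candidates_by_qid: qid -> ranked list of candidate answer strings."""
    with open(path, 'w') as out:
        for qid, candidates in sorted(ranked_candidates_by_qid.items()):
            for candidate in candidates[:MAX_CANDIDATES]:
                out.write('%s\t%s\n' % (qid, candidate[:MAX_LENGTH]))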
Extending the project report
This extended version should include all the sections from your previous
report revised to reflect any improvements (but with some still as stubs) and additionally
include the following new material:
- Approach
- Query Processing: this subsection describes your improved query processing component.
- Evaluation
- Component Evaluation: this subsection will describe a focused evaluation of your query processing component, for example, a comparison with your baseline system
or a specific evaluation of question classification, as appropriate.
Please name your report D3.pdf.
Presentation
Your presentation may be prepared in any computer-projectable format,
including HTML, PDF, PPT, and Word. Your presentation should take
about 10-15 minutes to cover your main content, including:
- Query processing
- Other improvements to your baseline system
- Issues and successes
- Related reading which influenced your approach
Your presentation should be deposited in your doc directory,
but it is not due until the actual presentation time. You may continue
working on it after the main deliverable is due.
Summary
- Finish coding and document all code.
- Verify that all code runs effectively on patas using Condor.
- Add any specific execution or other notes to a README.
- Create your D3.pdf and add it to the doc directory.
- Verify that all components have been added and any changes checked in.