Ling 573 - Natural Language Processing Systems and Applications
Spring 2015
Deliverable #3: Summarization Improvement; Information Ordering
Code and Results due: May 16, 2015: 23:59
Updated Project Report due: May 19, 2015: 09:00
Goals
In this deliverable, you will continue development and improvement of your summarization
system. You will
- Enhance your summarization system by improving information ordering, based on techniques discussed in class and in the readings to organize relevant content.
- Refine your summarization system by extending your initial content
selection components.
- Consider the impact of mechanisms addressing topic orientation for
guided summarization.
- Identify the resources (software and corpora) that can support these tasks.
Information Ordering
For this deliverable, one focus will be on improving your baseline
summarization system through enhanced information ordering.
Information ordering can address
- temporal organization,
- discourse cohesion through entity mentions,
- discourse coherence through relations between text spans.
You may build on techniques presented in class, described in the reading
list, and proposed in other research articles.
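For concreteness, here is a minimal sketch of chronological ordering, one simple temporal strategy: sentences are sorted by the date of their source document, with ties broken by position within that document. The data layout and field names below are illustrative assumptions, not part of any provided code.

from datetime import date

def order_chronologically(sentences):
    # Sort by source-document date, breaking ties by position within the document.
    return sorted(sentences, key=lambda s: (s["doc_date"], s["position"]))

selected = [
    {"text": "A second storm hit on Friday.", "doc_date": date(2004, 11, 5), "position": 0},
    {"text": "The first storm struck Monday.", "doc_date": date(2004, 11, 1), "position": 2},
]
for s in order_chronologically(selected):
    print(s["text"])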
Content Selection Improvement
You should continue to revise and improve your content selection approach
to enhance your summarization system. One strategy to do so in the context
of TAC is through topic-focused summarization, as discussed below.
Topic-focused summarization
The TAC summarization task is a topic-focused, or "guided", summarization
task. Summaries are expected to focus on the topic, specified by the
title element given in the test topics XML file, and address the relevant aspects
for the corresponding category. Most approaches augment existing
content selection strategies to further focus on the desired topics.
You may build on approaches presented in lecture or readings.
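As one illustrative sketch (not a prescribed method), the function below biases selection toward the topic by interpolating a sentence's base salience score with its normalized term overlap against the topic title; the interpolation weight lam and all inputs are assumptions for the example.

def topic_adjusted_score(sentence_tokens, base_score, title_tokens, lam=0.7):
    # Interpolate base salience with normalized overlap of topic-title terms.
    title = {t.lower() for t in title_tokens}
    overlap = sum(1 for t in sentence_tokens if t.lower() in title)
    topic_score = overlap / max(len(sentence_tokens), 1)
    return lam * base_score + (1 - lam) * topic_score

score = topic_adjusted_score(
    "Rescue crews searched the flooded city".split(),
    base_score=0.42,
    title_tokens="City Flooding".split())
print(round(score, 3))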
Data
We will be focusing on the TAC summarization shared task. We will use
one year's data as devtest for most of the term, and then use a new unseen
year's data as final evaltest in the last deliverable.
Document Collection
The AQUAINT and AQUAINT-2 Corpora have been employed as the document
collections for the summarization task for a number of years,
and will form the basis of summarization for this deliverable.
The collections can be found on patas in /corpora/LDC/LDC02T31/ (AQUAINT, 1996-2000) and /corpora/LDC/LDC08T25/ (AQUAINT-2, 2004-2006).
Training Data
You may use any of the DUC or TAC summarization data through 2009
for training and developing your system. For previous years, there are prepared document sets and model summaries
to allow you to train and tune your
summarization system.
All model files appear in /dropbox/14-15/573/Data/models.
All document specification files appear in /dropbox/14-15/573/Data/Documents.
Training data appear in the training subdirectories and devtest data in the devtest directory.
Development Test Data
You should evaluate on the TAC-2010 topic-oriented document sets and their corresponding model summaries. You should evaluate your system only on the 'A' sets. Development test data appears in the devtest subdirectories.
Evaluation
You will employ the standard automatic ROUGE method to evaluate
the results from your summarization system.
- Evaluation results should be stored in your results directory.
- This results file should be named D3.results.
- You should provide results for ROUGE-1, ROUGE-2, ROUGE-3, and ROUGE-4, which have, in aggregate, been shown to correlate well with human assessments of responsiveness. This can be done with the "-n 4" switch in ROUGE.
Code implementing the ROUGE metric
is provided in /dropbox/14-15/573/code/ROUGE/ROUGE-1.5.5.pl. Example
configuration files are given.
- rouge_run_ex.xml gives a sample configuration file covering the 2010 data.
- You will need to change the "PEER-ROOT" to point to your own outputs.
- You will also need to adjust the "PEERS" filenames to handle differences in file naming.
- The directories and filenames for the model summaries are set correctly for the
TAC 2010 evaluation. You would need to point them to alternative files/directories
if you wish to use other data, such as the 2009 data.
- To be safe, please call with perl5.10.0.
- You should use the following flag settings for your official evaluation runs (a fully assembled example invocation appears after this list):
-e ROUGE_DATA_DIR -a -n 4 -x -m -c 95 -r 1000 -f A -p 0.5 -t 0 -l 100 -s -d CONFIG_FILE_WITH_PATH
- where, ROUGE_DATA_DIR is /dropbox/14-15/573/code/ROUGE/data
- CONFIG_FILE_WITH_PATH is the location of your revised configuration file
- Output is written to standard output by default.
- Further usage information can be found using the -H flag or invoking ROUGE with
no parameters.
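Putting these pieces together, a full invocation would look like the following; the configuration-file path is a placeholder for your own revised file, and the redirection stores the output as your D3.results file:

perl5.10.0 /dropbox/14-15/573/code/ROUGE/ROUGE-1.5.5.pl \
  -e /dropbox/14-15/573/code/ROUGE/data \
  -a -n 4 -x -m -c 95 -r 1000 -f A -p 0.5 -t 0 -l 100 -s \
  -d /path/to/your/rouge_run.xml > results/D3.results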
Outputs
Create a directory D3 under the outputs directory containing the summaries produced by running your updated summarization system on the test data.
You should do this as follows:
- Summary output
- Each summary should be well-organized, in English, using complete sentences. It should have one sentence per line. (Other formats can be used, but require modifications to the scoring configuration.) A blank line may be used to separate paragraphs, but no other formatting is allowed (such as bulleted points, tables, bold-face type, etc.). Each summary can be no longer than 100 words (whitespace-delimited tokens). Summaries over the size limit will be truncated.
- Summaries should be based only on the 'A' group of documents for each
of the topics in the specification file. All processing of documents and generation of summaries must be automatic.
- Submission format: A run will comprise exactly one file per topic summary, where the name of each summary file is the ID of its document set. Please include a file for each summary, even if the file is empty. Each file will be read and assessed as a plain text file, so no special characters or markups are allowed.
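For illustration, here is a minimal sketch of enforcing the 100-word cap and the one-file-per-topic naming scheme; the docset ID and output location are assumed examples:

import os

def write_summary(outputs_dir, docset_id, sentences, cap=100):
    # Keep whole sentences (one per line) up to the cap; truncate the last one at the limit.
    kept, used = [], 0
    for sent in sentences:
        tokens = sent.split()
        if used + len(tokens) > cap:
            tokens = tokens[:cap - used]
        kept.append(" ".join(tokens))
        used += len(tokens)
        if used >= cap:
            break
    path = os.path.join(outputs_dir, "D3", docset_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write("\n".join(kept) + "\n")

write_summary("outputs", "D1001-A",
              ["The first storm struck Monday.", "A second storm hit on Friday."])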
Extending the project report
This extended version should include all the sections from the
original report, with content in every section. You should focus especially on
the following new material:
- Approach
- Content Selection: This subsection should present your improved
content selection method.
- Information Ordering: This subsection describes your information
ordering approach.
- Evaluation
- Current results: This subsection should describe the results of your updated system.
- Observations about improvements to your system.
- You should include a table presenting the ROUGE-1, ROUGE-2, ROUGE-3, and
ROUGE-4 scores of your system.
- You should also present some error analysis to help motivate future improvements.
Please name your report D3.pdf.
Presentation
Your presentation may be prepared in any computer-projectable format,
including HTML, PDF, PPT, and Word. Your presentation should take
about 10 minutes to cover your main content, including:
- Information Ordering
- Other improvements to your baseline system
- Issues and successes
- Related reading which influenced your approach
Your presentation should be deposited in your doc directory,
but it is not due until the actual presentation time. You may continue
working on it after the main deliverable is due.
Summary
- Finish coding and document all code.
- Verify that all code runs effectively on patas using Condor.
- Add any specific execution or other notes to a README.
- Create your D3.pdf and add it to the doc directory.
- Verify that all components have been added and any changes checked in.