ABSTRACTIVE TEXT SUMMARIZATION BY HIGHLIGHTING

Open Access
- Author:
- Ambati, Rajeev Bhatt
- Graduate Program:
- Electrical Engineering
- Degree:
- Master of Science
- Document Type:
- Master's Thesis
- Date of Defense:
- July 11, 2019
- Committee Members:
- David Jonathan Miller, Thesis Advisor/Co-Advisor
Prasenjit Mitra, Committee Member
Victor P Pasko, Committee Member
- Keywords:
- text summarization
summarization
natural language processing
deep learning
memory augmented neural network
abstractive summarization
extractive summarization
sequence modelling
seq2seq
- Abstract:
- Traditional sequence-to-sequence (seq2seq) models and variations of the attention mechanism such as hierarchical attention have been applied to the text summarization problem. Although there is a hierarchy in the way humans use language, forming sentences from words and paragraphs from sentences, hierarchical models have failed to outperform their traditional seq2seq counterparts. This is mainly because hierarchical attention mechanisms are either too sparse when using hard attention (selecting only the maximally aligned sentence) or too noisy when using soft attention (assigning an alignment score to each sentence/word and taking a weighted sum). Long Short-Term Memory (LSTM) networks are the backbone of such seq2seq models, and the architectures are primarily inspired by the machine translation literature [1] [2] [3]. In a typical text summarization dataset [4], which consists of documents roughly 800 tokens in length, capturing long-term dependencies is very important, for example, when the last sentence of a document must be grouped with the first. LSTMs often fail to capture such long-term dependencies when modeling long sequences. To address these issues, we make the following contributions in this work:
• Adapted Neural Semantic Encoders (NSE) to text summarization by making the following changes:
o Improved the attention mechanism by using a feed-forward neural network in place of simple dot-product attention.
o Improved the compose function using an LSTM so that it can remember what was previously composed and maintain novelty.
o Added a copy mechanism to facilitate the use of document words in the summary. This helps the model maintain factual consistency with the document and also use words that are not present in the limited-size vocabulary.
• Proposed a novel hierarchical NSE that performs better than previous abstractive summarization models of the same complexity and is also faster to train:
o Separate memories are used for each sentence to enrich the word representations.
o A shared document memory is used to enrich the sentence representations and to retrieve the highlights of the document.
• Improved the best supervised abstractive summarizer by 1 ROUGE point.
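To make the first NSE change concrete, the following is a minimal PyTorch sketch contrasting simple dot-product attention over a memory with a feed-forward (additive) scoring network of the kind the abstract describes. All module and tensor names, and the hidden size, are illustrative assumptions rather than the thesis implementation.

```python
# Illustrative sketch: dot-product attention vs. feed-forward (additive) attention
# over a memory of slots. Names and dimensions are assumptions, not the thesis code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def dot_product_attention(query, memory):
    """query: (batch, d), memory: (batch, n_slots, d) -> weights (batch, n_slots)."""
    scores = torch.bmm(memory, query.unsqueeze(2)).squeeze(2)  # (batch, n_slots)
    return F.softmax(scores, dim=-1)


class FeedForwardAttention(nn.Module):
    """Score each memory slot with a small feed-forward network instead of a dot product."""

    def __init__(self, d_model, d_hidden=256):
        super().__init__()
        self.proj_query = nn.Linear(d_model, d_hidden, bias=False)
        self.proj_memory = nn.Linear(d_model, d_hidden, bias=False)
        self.score = nn.Linear(d_hidden, 1, bias=False)

    def forward(self, query, memory):
        # Broadcast the projected query across memory slots, combine, and score.
        combined = torch.tanh(self.proj_query(query).unsqueeze(1) + self.proj_memory(memory))
        scores = self.score(combined).squeeze(2)   # (batch, n_slots)
        return F.softmax(scores, dim=-1)           # attention weights over memory slots


if __name__ == "__main__":
    batch, n_slots, d = 4, 10, 128
    q = torch.randn(batch, d)
    mem = torch.randn(batch, n_slots, d)
    print(dot_product_attention(q, mem).shape)    # torch.Size([4, 10])
    print(FeedForwardAttention(d)(q, mem).shape)  # torch.Size([4, 10])
```

The design difference is that the dot product fixes the scoring function to a similarity in the shared embedding space, whereas the learned feed-forward scorer can adapt how a query and a memory slot are compared, at the cost of extra parameters.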