Automatic Keyphrase Extraction from Scholarly Documents
Open Access
Author:
Zafeiroudi, Kyriaki
Graduate Program:
Computer Science and Engineering
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
November 18, 2019
Committee Members:
Clyde Lee Giles, Thesis Advisor/Co-Advisor
Daniel Kifer, Committee Member
Chitaranjan Das, Program Head/Chair
Keywords:
keyphrase extraction, PageRank, word embeddings
Abstract:
Keyphrase extraction is a major task in natural language processing and information retrieval that, although important for advancing many core tasks in both fields, still lacks a definitive solution. Both supervised and unsupervised approaches have been studied. We propose a new automatic, unsupervised approach to keyphrase extraction from scholarly documents that leverages a graph-based ranking algorithm (PageRank) together with recent progress in representing words in a high-dimensional space as word embeddings. Our method is novel while building on the foundation of previous successful methods, and it aims to achieve better performance by combining techniques used in other natural language processing tasks.
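
The abstract does not spell out how PageRank and word embeddings are combined, so the following is only a minimal sketch of one plausible combination, not the method of the thesis. It assumes candidate words become graph nodes, words co-occurring within a small window are linked, and each edge is weighted by the cosine similarity of the two word vectors. The function names (`cosine`, `build_graph`, `pagerank`), the window size, the damping factor, and the tiny two-dimensional embeddings are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(tokens, embeddings, window=2):
    """Undirected word graph: link words at most `window` positions apart.

    Edge weight is the cosine similarity of the two embeddings; pairs with
    non-positive similarity get no edge. This weighting is an assumption
    made for this sketch.
    """
    graph = {t: {} for t in tokens if t in embeddings}
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if w == v or w not in embeddings or v not in embeddings:
                continue
            weight = cosine(embeddings[w], embeddings[v])
            if weight > 0:
                graph[w][v] = weight
                graph[v][w] = weight
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over an undirected graph.

    Nodes without edges only receive the teleport share in this sketch.
    """
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    scores = {node: 1.0 / n for node in nodes}
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            incoming = sum(
                scores[nb] * w / out_weight[nb] for nb, w in graph[node].items()
            )
            updated[node] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy, hand-made embeddings (hypothetical values, not trained vectors).
    toy_embeddings = {
        "keyphrase": [1.0, 0.2],
        "extraction": [0.9, 0.3],
        "graph": [0.2, 1.0],
        "ranking": [0.3, 0.9],
    }
    toy_tokens = ["keyphrase", "extraction", "graph", "ranking",
                  "keyphrase", "extraction"]
    ranks = pagerank(build_graph(toy_tokens, toy_embeddings))
    for word, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
        print(f"{word}\t{score:.4f}")
```

In practice the toy vectors would be replaced by pretrained embeddings, and the top-ranked words would be merged into multi-word candidates before scoring against gold keyphrases; both steps are outside this sketch.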