Acknowledgments in Scientific Documents: Extraction, Storage, Search, and Social Network

Open Access
- Author:
- Khabsa, Madian
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- April 16, 2012
- Committee Members:
- C Lee Giles, Thesis Advisor/Co-Advisor
- Wang Chien Lee, Thesis Advisor/Co-Advisor
- Raj Acharya, Thesis Advisor/Co-Advisor
- Keywords:
- Acknowledgments
- Information Extraction
- Entity Resolution
- Search Engines
- Digital Libraries
- Abstract:
- Acknowledgments are widely used in scientific articles to express gratitude and credit collaborators. Despite suggestions that indexing acknowledgments would yield interesting insights, there is currently, to the best of our knowledge, no system that tracks and indexes them. In this thesis we introduce AckSeer, a search engine and repository that automatically extracts and stores acknowledgments from digital libraries. AckSeer is a fully automated system that scans items in digital libraries, including conference papers, journals, and books, extracts their acknowledgment sections, and identifies the entities within. We describe the architecture of AckSeer and discuss the extraction algorithms, which achieve an F1 measure above 83%. We use multiple Named Entity Recognition (NER) tools and propose a method for merging the output of the different recognizers. The resulting entities are stored in a database and then added to the AckSeer index along with the metadata of the containing paper or book, which makes the entities searchable. We build AckSeer on top of the documents in the CiteSeerX digital library, yielding more than 500,000 acknowledgments and more than 4 million mentioned entities. After building the acknowledgment repository, we construct an acknowledgment graph and study the relationships between the entities therein. The social networks of authors and publications have been well studied in the literature, with nearly all of their network properties examined. However, to the best of our knowledge, the social graph of acknowledgments has never been investigated.
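
For readers unfamiliar with the F1 measure quoted in the abstract, the following is a minimal sketch of how it is conventionally computed for entity extraction: the harmonic mean of precision and recall over extracted versus hand-labeled entities. The function name `f1_measure`, the exact-string matching of entities, and the sample names are assumptions made for illustration only; the record does not describe the thesis's actual evaluation procedure.

```python
def f1_measure(predicted, gold):
    """Return (precision, recall, F1) for two collections of entity strings.

    Hypothetical helper: entities are compared by exact string match,
    which is an assumption and may differ from the thesis's evaluation.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Precision: fraction of extracted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of labeled entities that were extracted.
    recall = true_positives / len(gold) if gold else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative (made-up) entities from a single acknowledgment section.
extracted = ["National Science Foundation", "J. Smith", "Acme Labs"]
labeled = ["National Science Foundation", "J. Smith", "A. Jones", "DARPA"]
precision, recall, f1 = f1_measure(extracted, labeled)
```

Because F1 is a harmonic mean, a high score requires both precision and recall to be reasonably high; neither can compensate fully for the other.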