Context Modeling for Semantic Text Matching and Scene Text Detection

Open Access
- Author:
- Huang, Wenyi
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 08, 2016
- Committee Members:
- C Lee Giles, Dissertation Advisor/Co-Advisor
C Lee Giles, Committee Chair/Co-Chair
Zhenhui Li, Committee Member
Anna Cinzia Squicciarini, Committee Member
Daniel Kifer, Outside Member
- Keywords:
- Citation Recommendation
Scene Text Detection
Deep Learning
Neural Networks
Context Modeling
Text in the Wild
- Abstract:
- Context is the information that surrounds and defines the target information it encapsulates. Without context, even the most closely related target information can be misinterpreted. Most existing models utilize context by encoding it as a set of human-crafted heuristic features for machine learning, which may not fully capture many connections between the context and the target. We contend that, in the setting of big data, context information should be modeled in a more principled way that is tightly coupled with learning algorithms. We present several machine learning models that learn the relations between context and related target information for two fundamental tasks in natural language processing and computer vision: semantic text matching and scene text detection. In particular, this dissertation addresses two different applications with context modeling: citation recommendation for scientific papers and localizing text in the wild.

Citations are crucial in academic attribution. A good citation recommendation engine can help both researchers and reviewers check the completeness of citations. Existing models for citation recommendation were mostly built on general recommendation models. Such methods usually project context into high-dimensional feature vectors without directly modeling the relation between the citation context and the citation. Here, we propose two context-based models that learn the semantic relations between the citation contexts and the cited documents. Both models achieve state-of-the-art recommendation results on the CiteSeerX dataset.

Detecting text in an unconstrained natural scene environment is a challenging task because of the many fonts, sizes, backgrounds, and alignments of the characters. Most existing models for scene text detection focus on small image patches of character areas. However, text in natural scenes is surrounded by informational context that can help locate the desired text. We present a novel context-based attention model for detecting arbitrarily oriented and curved scene text. Combined with an off-the-shelf text region proposal method, Extremal Regions, the text detection pipeline achieves state-of-the-art performance on the ICDAR 2013 dataset and the MSRA Text Detection 500 dataset.