Keywords: text extraction, localization, classification, text features
Abstract:
This thesis explores the potential of harnessing text extraction within a constrained environment to guide and augment a visually impaired end-user’s shopping experience. Most current visual assistance pipelines attempt to track user location for indoor navigation; some attempt to classify products of interest to the user; and others provide relevant contextual information by first extracting text from the local environment. However, no prior art explores how text might be used as an input to other parts of the visual assistance pipeline. For instance, can text extraction improve localization and product classification in addition to providing relevant contextual information? This thesis therefore explores the viability of introducing text into these seemingly disjoint problem spaces and ultimately concludes that environmental text extraction can enhance both indoor localization and last-mile product classification. Additionally, this thesis addresses the general question of where in the pipeline multidimensional inputs should be combined.