Mining User-Generated Geo-Social Data for Search and Recommendation

Open Access
Author:
Ye, Mao
Graduate Program:
Computer Science and Engineering
Degree:
Doctor of Philosophy
Document Type:
Date of Defense:
Committee Members:
  • Wang Chien Lee, Committee Chair
  • Tom La Porta, Committee Member
  • Piotr Berman, Committee Member
  • Xiaolong Zhang, Committee Member
Keywords:
  • Social Network
  • Data Mining
  • Location-based Services
  • Recommender Systems
  • Information Retrieval
Abstract:
With the increasing availability of GPS-enabled smartphones, the rapid development of location-based services, and growing interest in online social networking, a number of location-based social networking (LBSN) services such as Facebook Places, Foursquare, Whrrl, Yelp, EveryTrail and TripAdvisor have emerged. These services allow users to explore places, write reviews and blogs, and share their locations and experiences with others. In this thesis, we propose to mine this user-generated geo-social data (e.g., places, reviews, blogs and social links) to enable better search and recommendation services through several innovative techniques.
First, we develop a travelogue service that discovers and conveys to users various travelogue (or trip blog) digests, in the form of theme locations, geographical scope, traveling trajectory and location snippets. In this service, the theme locations of a travelogue are the core information to discover, so we address the problem of theme location discovery to enable the travelogue service. Because location relevance is inherently ambiguous, we perform location relevance mining (LRM) from two complementary angles, relevance classification and relevance ranking, to provide a comprehensive understanding of locations. Furthermore, we explore the textual features (e.g., surrounding words) and geographical features (e.g., geographical relationships among locations) of locations to develop a co-training model that enhances classification performance. Building on the mining results of LRM, we develop a series of techniques to provide the aforementioned travelogue digests in our travelogue system.
Second, we develop a semantic annotation technique for location-based social networks (e.g., Foursquare and Whrrl) that automatically annotates all places with category tags, which are a crucial prerequisite for location search. Our annotation algorithm learns a binary support vector machine (SVM) classifier for each tag in the tag space to support multi-label classification. Based on the check-in behavior of users, we extract features of places from i) explicit patterns (EP) of individual places and ii) implicit relatedness (IR) among similar places. The EP features are summarized from all check-ins at a specific place. The IR features are derived by building a novel network of related places (NRP) in which similar places are linked by virtual edges. Using the NRP, we determine the probability of a category tag for each place by exploring the relatedness of places. The EP and IR features complement each other, and both benefit the proposed classification task.
Third, we provide a point-of-interest (POI) recommendation service for rapidly growing LBSNs such as Foursquare and Whrrl. The idea is to exploit user preference, social influence and geographical influence for POI recommendation. In addition to deriving user preference through user-based collaborative filtering and exploring social influence from friends, we place special emphasis on geographical influence because of the spatial clustering phenomenon exhibited in the check-in activities of LBSN users. We argue that the geographical influence among POIs plays an important role in user check-in behavior and model it with a power-law distribution. Accordingly, we develop a collaborative recommendation algorithm based on the naive Bayesian method that incorporates geographical influence. Furthermore, we propose a unified POI recommendation framework that fuses a user's preference for a POI with social influence and geographical influence.
Finally, social friendship has long been shown to benefit item recommendation. However, existing approaches mostly incorporate social friendship into recommender systems through heuristics. Here, we argue that social influence between friends can be captured quantitatively and propose a probabilistic generative model, called social influenced selection (SIS), to model the decision making behind item selection (e.g., what book to buy or where to dine). Based on SIS, we mine the social influence between linked friends and the personal preferences of users through statistical inference. To address the challenges arising from multiple layers of hidden factors in SIS, we develop a new parameter learning algorithm based on expectation maximization (EM). Moreover, we show that the mined social influence and user preferences are valuable for group recommendation and viral marketing.
We conduct extensive empirical studies to evaluate the performance of our proposed approaches. The experimental results demonstrate the effectiveness of our approaches and their superiority over state-of-the-art approaches in the corresponding domains.
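As an illustration of the geographical-influence idea described in the abstract, the following is a minimal sketch and not the thesis implementation. It assumes that the probability of a user checking in at two places a distance d apart follows a power law of the form a * d^b, that the two parameters are fitted by least squares in log-log space, and that a candidate POI is scored by combining its distances to every visited place under a naive Bayes independence assumption. The function names, the haversine distance, the 0.1 km distance floor and all numeric values are hypothetical choices made for this example.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def fit_power_law(distances, probabilities):
    """Fit prob = a * dist**b by least squares on log(prob) = log(a) + b*log(dist)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - b * mean_x)
    return a, b


def geo_log_score(candidate, visited, a, b, min_km=0.1):
    """Log of the product of power-law terms over all visited places.

    Treating each visited place as independent evidence (the naive Bayes
    assumption) turns the product into a sum of logs. Distances are floored
    at min_km (an assumption) so that log(0) never occurs.
    """
    total = 0.0
    for place in visited:
        d = max(haversine_km(candidate, place), min_km)
        total += math.log(a) + b * math.log(d)
    return total


def rank_candidates(candidates, visited, a, b):
    """Return candidate POIs sorted from most to least geographically likely."""
    return sorted(candidates,
                  key=lambda c: geo_log_score(c, visited, a, b),
                  reverse=True)
```

With a negative exponent b, candidates close to the places a user has already visited receive higher scores, which mirrors the spatial clustering of check-ins that the abstract describes. How such a score would be fused with user preference and social influence is not specified in the abstract and is left out of this sketch.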