Characterization, Understanding, and Prediction of Temporal Dynamics of Web Contents
Open Access
- Author:
- Liao, Yiming
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- April 13, 2020
- Committee Members:
- Dongwon Lee, Dissertation Advisor/Co-Advisor
Dongwon Lee, Committee Chair/Co-Chair
Xiang Zhang, Committee Member
Suhang Wang, Committee Member
Lin Lin, Outside Member
Mary Beth Rosson, Program Head/Chair
- Keywords:
- time series data
temporal data
temporal pattern
social media
user behavior
news popularity
graph embedding
dynamic graph embedding
timeline summarization
news timeline
- Abstract:
- Time series data is ubiquitous across domains, from changes in users' behavior over time on e-commerce platforms to the evolution of content popularity on social media. Analyzing sequential data and characterizing its temporal dynamics makes it possible to uncover the causal factors that shape a series' future values. Moreover, the ability to detect temporal patterns helps in predicting future changes and developing timely strategies, which supports a wide range of applications (e.g., improving user retention and content recommendation). In this dissertation, I present our work on the characterization, understanding, and prediction of the temporal dynamics of web content.
First, we start with two case studies that characterize temporal patterns in two different domains. In the first study, we investigate users' funding behavior patterns on crowdfunding platforms (e.g., Indiegogo and Kickstarter). Employing time series clustering methods, we discover four distinct temporal funding patterns on both platforms. In the second study, we examine the long-term popularity patterns of news articles. Although most news articles are popular for only a very short time, a small fraction remain popular for much longer; we call these evergreen news articles. Motivated by the fact that evergreen news articles maintain a timeless quality and are of consistent interest to the public, we analyze them and shed light on their long-term popularity.
Second, to help readers better understand how events evolve, journalists usually summarize a compact but complete storyline for each event from thousands of news articles, which is both time-consuming and labor-intensive. To deliver more accurate news timelines to the audience, we develop a fast and effective news timeline summarization algorithm that achieves state-of-the-art performance in both quality and speed.
Third, we go beyond characterizing and understanding the temporal evolution of web content to predicting it. More specifically, we introduce a temporal-translation-based framework to learn dynamic graph embeddings. Our framework allows us to train on all graph snapshots simultaneously while still preserving the temporal constraint in learning, making it scalable to industrial-level graphs. Finally, we close this dissertation by discussing the limitations of our work and outlining potential future directions.