Video Hashing: Algorithm Design and Performance Analysis

Open Access
Author:
Li, Mu
Graduate Program:
Electrical Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
June 12, 2014
Committee Members:
  • Vishal Monga, Dissertation Advisor
  • William Evan Higgins, Committee Member
  • David Miller, Committee Member
  • Yanxi Liu, Committee Member
Keywords:
  • video hashing
  • video fingerprinting
  • near-duplicate detection
  • anti-piracy search
Abstract:
The rapid growth of video data on the Internet poses a major challenge for existing near-duplicate detection (NDD) methods. Furthermore, the emergence of websites such as YouTube and Dailymotion poses an anti-piracy video search problem: given an original video (provided to YouTube by the content owner), we must find pirated copies uploaded to YouTube by users in order to identify illegal or undesirable uploads. Moreover, recent progress in mobile device technology has stimulated many new applications, such as augmented reality, in which the mobile device's limits on speed, power consumption, memory, processing time, and communication bandwidth place new demands on existing NDD methods. In this thesis, we address these challenges using video hashing. By definition, video hashing is a randomized video dimensionality reduction technique that maps a video to a short digest, called its hash vector, from which the visual similarity between two videos can be measured. To apply video hashing to these challenges, several open problems must be addressed, chiefly: Which hashing methods are truly robust to severe content-preserving distortions? How can different types of hashes be fused efficiently to further improve detection performance? Once detection performance is acceptable, how can the hashes be made as compact as possible to enable scalable applications? How can the positions of missing or inserted frames be detected efficiently and accurately, so that temporal synchronization and hash extraction form a single automatic process? This thesis takes steps toward solving these open problems.
Specifically, the contributions of this thesis fall into three areas: 1) Robustness: We apply tensor factorization to extract video hashes that are robust to most distortions owing to a spatio-temporal separation property: when the attack is spatial, the temporal components of the hash vectors stay approximately invariant; likewise, the spatial components stay unperturbed if the attack is purely temporal. 2) Compactness: We model the input video as a structural graphical model and partition the graph into several subgraphs. The number of subgraphs provides an explicitly controllable tradeoff between detection performance and hash length. 3) Automatic synchronization and hash fusion: We incorporate automatic temporal synchronization, based on dynamic time warping, into the video hashing system. We also propose a hash fusion method called distance boosting, which can fuse different types of hashes in a future-proof manner to further improve detection performance.
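
To illustrate the dynamic time warping step mentioned in the abstract, the following minimal sketch aligns two sequences of per-frame features. It is textbook dynamic time warping, not the thesis's synchronization method: the function name dtw_align, the scalar per-frame features, and the absolute-difference cost are all assumptions made for this example.

```python
from math import inf


def dtw_align(a, b):
    """Align two non-empty sequences of scalar per-frame features.

    Hypothetical helper for illustration only. Returns (total_cost, path),
    where path is a list of (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local frame distance
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # both sequences advance
                cost[i - 1][j],      # only a advances
                cost[i][j - 1],      # only b advances
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Example: the query is the reference with one frame missing.
reference = [1, 2, 3, 4, 5]
query = [1, 2, 4, 5]
total_cost, path = dtw_align(reference, query)
```

In such a path, a query index paired with more than one reference index suggests that frames are missing near that position, and the reverse pattern suggests inserted frames. A practical system would compare hash vectors or other multidimensional frame features rather than scalars.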