Self-Supervised Recurring Pattern Discovery From A Single View

Open Access
- Author:
- Asthana, Yashasvi
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- June 21, 2021
- Committee Members:
- Chitaranjan Das, Program Head/Chair
- Yanxi Liu, Thesis Advisor/Co-Advisor
- Robert Collins, Committee Member
- David Jonathan Miller, Committee Member
- Keywords:
- recurring pattern detection
- semi-supervised learning on an image
- GRASP recurring pattern discovery
- unsupervised foreground segmentation
- Abstract:
- Recurring patterns occur in abundance in nature and are readily recognized by humans from a single view. However, this perspective is one we rarely share with machines. Training convolutional neural networks and other statistical algorithms has proven helpful in understanding patterns; yet without prior training or knowledge about the recurring patterns, capturing similarity in an image becomes algorithmically intractable. Identifying similar patterns in an image requires forming distinct visual cues, or visual words, and grouping them into prominent recurring patterns. Humans excel at this task and require no supervision to understand which parts of an image share some form of semantic similarity. For the first time, unsupervised methods such as “GRASP Recurring Patterns from a Single View” (GRASP RP) demonstrated the ability to discover unstructured recurring patterns. However, because they depend on SIFT features and on hand-tuned hyper-parameters that cannot be adjusted on the fly, they tend to miss many instances of the detected recurring patterns: results on some images may be excellent, while results on other images are inadequate under the same hyper-parameters. This thesis improves GRASP RP’s performance by going beyond a purely unsupervised method to employ a semi-supervised learning method, in addition to enhancing each step of the unsupervised recurring pattern discovery process. We test all of our approaches on a newly labeled unstructured Recurring Pattern (RP) dataset of 1K images (78% frontal and 22% projective view). We demonstrate that the proposed changes improve the original GRASP RP’s efficiency by up to 10 times and improve its correspondence with human vision, detecting 9% more recurring patterns consistent with the ground truth. We also demonstrate that the newly proposed self-supervised system statistically significantly outperforms the baseline method in RP-level average precision, RP-instance-level mean-average recall, and RP-instance-level mean-average precision by 17%, 9%, and 5%, respectively. It also statistically significantly outperforms the best unsupervised method in RP-level average precision, RP-instance-level mean-average recall, and RP-instance-level mean-average precision by 8%, 6%, and 2%, respectively, on the 1K human-labeled RP image dataset.
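
As a rough illustration of the kind of unsupervised starting point the abstract describes, the sketch below extracts SIFT keypoints from a single image and clusters their descriptors into a small vocabulary of visual words; keypoints that share a word are candidate instances of the same repeating visual cue. This is only a minimal sketch, not the thesis's pipeline: the image path, the vocabulary size `n_words`, and the use of OpenCV and scikit-learn are assumptions made for the example.

```python
# Minimal sketch: form "visual words" by clustering SIFT descriptors from one image.
# NOT the GRASP RP implementation; file name and cluster count are assumed values.
import cv2
from sklearn.cluster import KMeans

image = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input image
if image is None:
    raise SystemExit("could not read example.jpg")

# Detect SIFT keypoints and compute their 128-D descriptors.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)

n_words = 20  # assumed vocabulary size; in practice a tunable hyper-parameter
if descriptors is None or len(keypoints) < n_words:
    raise SystemExit("not enough keypoints to build a vocabulary")

# Cluster descriptors into visual words; each keypoint gets a word label.
kmeans = KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(descriptors)

# Keypoints assigned the same word are candidate repetitions of one visual cue.
for word_id in range(n_words):
    members = [kp.pt for kp, lab in zip(keypoints, kmeans.labels_) if lab == word_id]
    print(f"visual word {word_id}: {len(members)} keypoint(s)")
```

In the setting the abstract describes, such visual words would then be grouped jointly into prominent recurring patterns, and the thesis further augments that unsupervised process with a semi-supervised learning stage.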