Comprehensive Photographic Composition Assistance through Meaningful Exemplars
Open Access
- Author:
- Farhat, Farshid
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- August 15, 2018
- Committee Members:
- James Wang, Committee Member & Major Field Representative
Robert Collins, Committee Member & Major Field Representative
Wang-Chien Lee, Member Added & Major Field Representative
Jia Li, Committee Member & Related Areas Representative
Jesse Barlow, Chair & Major Field Representative
Zihan Zhou, Member Added & Related Areas Representative
Chitaranjan Das, Program Head/Chair
- Keywords:
- image aesthetics
deep learning
image retrieval
recommender system
- Abstract:
- Many people are interested in taking good photos and sharing them with others. The size and value of visual content on the web are growing as people share their memories on social media and trade images as non-fungible tokens for digital currency. Emerging hardware and software also make digital photography and marketing more ubiquitous and more capable. These trends open many challenging areas in visual content analysis, such as computational image aesthetics, composition-aware image retrieval, and meaningful feedback in photographic systems. Because people want to take better photos of themselves with an app on their handheld device, there is vast market demand for photographic composition assistance that evaluates the essential aspects affecting the beauty of a photo and conveys meaningful feedback to users. This dissertation investigates new scientific and applied computational photography methods for helping people who want to take astonishing photos. Because composition matters in photography, researchers have leveraged standard composition techniques, such as the rule of thirds and perspective-aware methods, to provide photo-taking assistance. Researchers have also assessed the aesthetic quality of photos computationally and attempted to manipulate images to improve it. However, composition techniques developed by professionals are far more diverse than well-documented methods can cover. There is also no holistic framework that captures the important aspects of a given scene and gives individuals constructive clues for taking a better shot. We leverage one aspect of image aesthetics in landscape photography: linear perspective, i.e., the depiction of 3D depth in a 2D image. To analyze the linear perspective of a 2D image, we use a contour detector to recognize the vanishing lines, and then we cluster them to accurately find potential vanishing points (VPs).
Then, our proposed strength measure chooses the dominant VP among the potential VPs. We use this approach to provide on-site feedback to users via an image retrieval system based on linear perspective. We also leverage the triangle technique, which is widely used in photography. We curate a large portrait dataset for this study and retrieve triangle-shaped human poses from it to help amateur photographers. Finally, we leverage underexplored photography ideas, which are virtually unlimited, diverse, and correlated. We propose a comprehensive fork-join framework, named CAPTAIN (Composition Assistance for Photo Taking), to guide a photographer with a variety of photography ideas. The framework consists of several components: integrated object detection, photo genre classification, artistic pose clustering, personalized aesthetics-aware image retrieval, and style set matching. CAPTAIN is backed by a large curated dataset crawled from a website that hosts ideas from photography enthusiasts and professionals. The work proposes steps to decompose a given amateurish shot into composition ingredients and recompose them to bring the photographer a list of related and valuable ideas that researchers have not explored in the past. The work addresses personal preferences for composition by presenting a user-specified preference list of photography ideas. The framework extracts the ingredients of a given scene as a set of composition-related features, ranging from low-level features such as color, pattern, and texture to high-level features such as pose, category, rating, gender, and object. Our composition model, indexed offline, provides visual ideas for the given scene and serves as a novel model for an aesthetics-related recommender system. The matching algorithm recognizes the best shot among a sequence of photos with respect to the user's preferred style set. We have conducted many experiments on the proposed components and report the findings.
A comprehensive user study also supports this work, demonstrating that it is helpful to people taking photos.