Visual Characteristics for Computational Prediction of Aesthetics and Evoked Emotions

Open Access
Author:
Lu, Xin
Graduate Program:
Information Sciences and Technology
Degree:
Doctor of Philosophy
Document Type:
Date of Defense:
August 05, 2015
Committee Members:
  • James Z Wang, Dissertation Advisor
  • James Z Wang, Committee Chair
  • Mary Beth Rosson, Committee Member
  • Reginald Adams Jr., Committee Member
  • Michelle Gayle Newman, Committee Member
  • Jia Li, Committee Member
Keywords:
  • Image Aesthetics Assessment
  • Emotion Prediction
  • Deep Learning
Abstract:
Human emotions and aesthetic feelings aroused by natural photographs have been actively studied over the past decades because of their potential applications in the development of intelligent computer systems and in broad areas of science and technology related to human emotion and aesthetics. This dissertation presents investigations of the visual characteristics that evoke human emotion and aesthetic feelings.

First, shape features in natural images were studied in terms of how they influence the emotions aroused in human beings. Shapes and their characteristics, such as roundness, angularity, simplicity, and complexity, have been found to evoke emotions in human perceivers, with evidence from psychological studies of facial expressions, dancing poses, and even simple synthetic visual patterns. Capturing these characteristics algorithmically for use in computational studies, however, has proven difficult. Moreover, little prior research has modeled the dimensionality of emotions aroused by roundness and angularity. In this study, a collection of shape features was developed that encodes the visual characteristics of roundness, angularity, and complexity using edge, corner, and contour distributions. These features were evaluated on the International Affective Picture System (IAPS) dataset, where evidence was provided for the significance of roundness-angularity and simplicity-complexity in predicting the emotional content of images.

Second, three visual characteristics of complex scenes that evoke human emotion (roundness, angularity, and simplicity) were investigated. Building upon the high-dimensional shape features, novel computational methods were developed to map visual content to scales of roundness, angularity, and simplicity as three new constructs. The scope of the previous psychological hypothesis was thereby expanded by examining these three visual characteristics in computer analysis of complex scenes. The results produced by the three new constructs were compared with the hundreds of visual qualities examined in previous studies. The three constructs are completely interpretable and can be used in other applications involving the roundness, angularity, and simplicity of visual scenes. Meanwhile, a large collection of ecologically valid stimuli (i.e., photographs humans regularly encounter on the Web), containing more than 40K images crawled from web albums, was generated using crowdsourcing and rated for emotion by human subjects. Critically, the three new visual constructs achieved classification accuracy comparable to that of the hundreds of shape, texture, composition, and facial feature characteristics previously examined, reducing the number of features required for classification by about two orders of magnitude. In addition, the experimental results showed that the three constructs had a consistent capacity for classifying both dimensions of emotion.

Finally, a novel deep learning algorithm was developed to automatically learn effective visual characteristics for image aesthetics assessment. The proposed RAPID (RAting PIctorial aesthetics using Deep learning) system incorporates heterogeneous inputs generated from the image, which include a global view and a local view, and unifies feature learning and classifier training using a double-column deep convolutional neural network. The experimental results showed that the RAPID system significantly outperformed the state of the art on the AVA dataset.

The results of the three studies demonstrate (1) the capability of roundness, angularity, and complexity of complex scenes to evoke human emotions, and (2) the capability of global view and fine-grained details of complex scenes to evoke aesthetic feelings.
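The idea of feeding a global view and a local view of the same image into two network columns can be illustrated with a minimal sketch. This is not the RAPID implementation. It assumes the global view is a nearest-neighbour downsample of the whole image and the local view is a random fixed-size crop at the original resolution. The function names (`global_view`, `local_view`, `two_column_inputs`) and the list-of-rows image representation are hypothetical, chosen only for illustration.

```python
import random


def global_view(image, size):
    """Assumed global view: nearest-neighbour downsample of the whole image."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def local_view(image, size, rng=random):
    """Assumed local view: random size x size crop at original resolution."""
    h, w = len(image), len(image[0])
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def two_column_inputs(image, size, rng=random):
    """Return the (global, local) pair, one input per network column."""
    return global_view(image, size), local_view(image, size, rng)
```

Under these assumptions, the global view keeps the overall layout at the cost of detail, while the local view keeps fine-grained detail from one region. This mirrors the abstract's distinction between the global view and fine-grained details.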