Image Analysis and Retrieval by Spatial Models and Information Integration

Open Access
Joshi, Dhiraj
Graduate Program:
Computer Science and Engineering
Doctor of Philosophy
Document Type:
Date of Defense:
July 18, 2007
Committee Members:
  • James Z Wang, Committee Chair/Co-Chair
  • Jia Li, Committee Chair/Co-Chair
  • Robert Collins, Committee Member
  • Wang Chien Lee, Committee Member
  • David Russell Hunter, Committee Member
Keywords:
  • story picturing
  • image retrieval
  • statistical modeling
Abstract:
The last decade has witnessed an explosion in the production of digital data from all over the world. In comparison, the rate of meaningful information extraction has been rather slow. One of the key reasons is that digital data usually comes in multiple modalities or media forms (e.g., text, images, and video), which makes it difficult to apply generic information extraction methods across media. Moreover, extraction of precise semantics from multimedia largely remains an open problem. A large amount of multi-dimensional and multi-spectral data is being acquired and used in specialized domains such as medical science, surveillance, and astronomy. The presence of multiple dimensions and multi-spectral information poses yet another challenge for certain kinds of image analysis applications. Despite these challenges, there is a growing demand for computer intervention in all of these areas. In this thesis, I present my research in multi-dimensional image modeling and multimedia information integration. I begin by characterizing image retrieval in the real world from both a user and a system perspective. Next, I present a conceptually clean framework, based on hidden Markov models, for modeling multi-dimensional image data. Extending the Markovian property to a third dimension dramatically increases the computational cost of estimating model parameters. I have developed a locally optimal but scalable parameter estimation algorithm for two- and three-dimensional hidden Markov models (2-D and 3-D HMMs). I then introduce the problem of story picturing and present an unsupervised approach for illustrating stories using an image database. In a special scenario, the process of image ranking bears an interesting analogy to obtaining the stationary distribution of a random walk performed in the image similarity space.
This is followed by a discussion of a generic framework for a next-generation multi-modal image management engine, together with a prototype. I conclude by describing the architecture and operation of the story picturing engine within the presented framework.
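The random-walk analogy mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes a hypothetical, made-up pairwise similarity matrix, row-normalizes it into transition probabilities, and uses plain power iteration to approximate the stationary distribution, which then serves as a ranking score. The function name `stationary_distribution` and all numeric values are illustrative assumptions.

```python
def stationary_distribution(similarity, tol=1e-10, max_iter=1000):
    """Approximate the stationary distribution of a random walk whose
    transition probabilities are row-normalized similarities.

    Assumption: all similarities are positive, so the walk is irreducible
    and aperiodic and power iteration converges.
    """
    n = len(similarity)
    # Row-normalize the similarity matrix into a transition matrix.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([s / total for s in row])

    # Start from the uniform distribution and iterate rank <- rank * P.
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [
            sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical similarities between three images (illustrative values only).
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]

scores = stationary_distribution(similarity)
# Images sorted from highest to lowest stationary probability.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

In this toy setting, an image that is strongly similar to many others is visited more often by the walk and therefore ranks higher. With a symmetric similarity matrix, the stationary probability of each image is proportional to the sum of its similarities.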