Towards Trusted Computational Services: Result Verification Schemes for MapReduce

Open Access
Author:
Huang, Chu
Graduate Program:
Information Sciences and Technology
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
May 12, 2011
Committee Members:
  • Sencun Zhu, Thesis Advisor
  • Dinghao Wu, Thesis Advisor
Keywords:
  • watermark
  • verification
  • MapReduce
Abstract:
Recent developments in Internet-scale data applications and services, combined with the proliferation of cloud computing, have created a new model for data-intensive computing, best characterized by the MapReduce paradigm. MapReduce, pioneered by Google for its Internet search application, is an architectural and programming model for efficiently processing massive amounts of raw, unstructured data. With the availability of the open-source Hadoop tools, the number of applications built on the MapReduce model is growing rapidly. This thesis focuses on a security concern specific to the MapReduce architecture that arises from its loosely coupled computational resources. We study the potential security risks posed by lazy or malicious servers involved in a MapReduce task. We introduce new mechanisms for detecting cheating services in the MapReduce environment, based on watermark injection and random sampling. These detection schemes are expected to significantly reduce verification overhead. We believe that the research reported here is an important step towards trusted computational services and will help direct future research. In practical applications, we hope these results will promote the adoption of trusted MapReduce services and thereby help MapReduce service providers attract more clients and increase their profits.