SplitServe: Efficiently splitting complex workloads over IaaS and FaaS

Jain, Aman

Open Access

Author:: Jain, Aman
Graduate Program:: Computer Science and Engineering
Degree:: Master of Science
Document Type:: Master Thesis
Date of Defense:: May 24, 2019
Committee Members:: Bhuvan Urgaonkar, Thesis Advisor/Co-Advisor; George Kesidis, Committee Member
Keywords:: Public Cloud Computing; Distributed Computing
Abstract:: Serverless computing products such as AWS Lambda and other "Cloud Functions" (CFs) can offer much lower startup latencies than Virtual Machine (VM) instances, with a lower minimum cost. This has made them an attractive candidate for autoscaling to handle unpredictable spikes in simple, mostly stateless workloads, from both a performance and a cost point of view. For complex (stateful, I/O-intensive) and latency-critical workloads, however, the efficacy of using CFs in combination with VMs has not been fully explored. In this paper, we motivate a "split-service" application framework that can, for a given job (workload), simultaneously exploit both infrastructure-as-a-service (VM) and function-as-a-service (CF) products. Specifically, we design and implement SplitServe-Spark, an embodiment of our proposal, by modifying Apache Spark to use both Amazon VMs and Lambdas. Rather than letting performance degrade following the arrival of such jobs, we show that SplitServe-Spark can effectively use CFs to start servicing them immediately while new VMs are being launched, thus reducing the overall execution time. Further, when the new VMs become available, SplitServe-Spark can move ongoing work from Lambdas to the new VMs if that is deemed desirable from a cost or performance perspective. Our experimental evaluation of SplitServe-Spark using four different workloads (K-means clustering, PageRank, TPC-DS, and Pi) shows that, compared to VM-only autoscaling, SplitServe-Spark improves performance by up to 55% for workloads with small to modest amounts of shuffling and by up to 31% for workloads with large amounts of shuffling. Furthermore, SplitServe-Spark, combined with novel segueing techniques, can save up to 21% in cost while still delivering an almost 40% improvement in execution time.
