Pushing the power and performance envelope of Next-generation Handheld platforms

Open Access
Author:
Chidambaram Nachiappan, Nachiappan
Graduate Program:
Computer Science and Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
September 23, 2015
Committee Members:
  • Chitaranjan Das, Committee Chair
  • Dr Mahmut Kandemir, Dissertation Advisor
  • Anand Sivasubramaniam, Committee Member
  • Kenneth Jenkins, Committee Member
Keywords:
  • SoC platforms
  • mobile
  • handheld
  • performance
  • energy
  • power
  • architecture
  • system design
  • virtualization
  • IP chaining
Abstract:
Today’s handhelds have grown sophisticated enough to run demanding applications. As the number of wearables and IoT devices expands, tablets and smartphones are being proposed as their compute hubs. This places high compute demand and significant energy drain on these handhelds. In such a landscape, consumers’ expectations are antithetical: they want the highest performance delivered all the time along with a long battery runtime, all from a modest on-device Li-ion battery. To provide higher performance and lower energy consumption, vendors have resorted to hardware accelerators. In current-generation handhelds, many applications (especially multimedia apps) rely heavily on multiple accelerators. In spite of these efforts from vendors, full-fledged support for multiple concurrent applications has remained out of reach, as current platforms are unable to meet consumers’ power-performance expectations. The main motivation of this dissertation is to propose techniques that push the performance boundaries by alleviating bottlenecks and that make efficient use of the power envelope when multiple accelerators are involved. It consists of four main components. The first part of the dissertation presents GemDroid, a comprehensive simulation infrastructure for studying SoC architectures. Currently, this is one of the first publicly available tools for conducting a holistic evaluation of mobile platforms consisting of cores, IPs and system software. In the second part, the dissertation analyzes a spectrum of applications with GemDroid and observes that the memory subsystem is a vital cog in the mobile platform because it needs to handle both core and IP traffic, which have very different characteristics.
Consequently, a heterogeneous memory controller (HMC) design is presented in which the memory is physically divided into two address regions: the first region, with one memory controller (MC), handles core-specific application data, and the second region, with another MC, handles all IP-related data. In the third part, the dissertation focuses on improving system throughput by short-circuiting the memory traffic and on enabling multiple applications to run concurrently by virtualizing the data paths. Through measurements on a current-generation tablet, it shows that the frequent invocation of the CPU for processing application frames and the involvement of main memory as a data flow conduit are serious limitations. Instead, the dissertation proposes a novel IP virtualization framework (VIP), involving three key ideas that allow several IPs to be chained together and made to appear to the software as a single device. First, chaining of IPs avoids data transfer through the memory system, enhancing the throughput of flows through the IPs. Second, by using a burst mode, the CPU can initiate the processing of several frames through the virtual IP chain without being involved in or interrupted for each frame, thereby allowing better energy saving and utilization opportunities. Third, the dissertation makes a case for supporting multiple applications through the creation of several virtual paths, one for each flow, with hardware scheduling used to enforce QoS guarantees despite any contention for resources along the way. In the final part, the dissertation addresses a critical and daunting task: efficient energy management in handheld platforms. With the growing number of accelerators, memory demands are increasing and high computing capacities are required to support applications with stringent QoS needs.
Current DVFS techniques that modulate the power states of a single hardware component, and even recent proposals that manage multiple components, can miss opportunities for the high energy efficiency that is attainable by leveraging application domain knowledge. Thus, this dissertation proposes a coordinated multi-component energy optimization mechanism for handheld devices, in which the energy profiles of different components such as the CPU, memory, GPU and IP cores are considered in unison to trigger the appropriate DVFS state by exploiting application domain knowledge. Specifically, it shows that for the important class of frame-based applications, domain knowledge (frame processing rates, component utilization and available slack) can be used to decide effective DVFS states for each component from among the numerous choices.
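
As an illustration only, the idea of using frame rate and slack to pick per-component DVFS states can be sketched as a small search. This is not the dissertation's mechanism: the DVFS tables, the per-frame cycle counts, the function name `pick_dvfs_states`, and the assumptions that components process a frame one after another and that idle power is negligible are all hypothetical choices made for this example.

```python
from itertools import product

# Hypothetical DVFS tables: (frequency in MHz, active power in mW) per component.
DVFS_STATES = {
    "cpu": [(600, 300), (1200, 800), (1800, 1600)],
    "gpu": [(200, 250), (400, 600)],
    "ip": [(100, 50), (200, 120)],
}

# Hypothetical work per frame for each component, in millions of cycles.
MCYCLES_PER_FRAME = {"cpu": 6.0, "gpu": 2.0, "ip": 1.0}


def pick_dvfs_states(fps, states=DVFS_STATES, work=MCYCLES_PER_FRAME):
    """Return {component: MHz} with the lowest energy per frame that still
    meets the frame deadline, or None if no combination meets it.

    Assumes the components handle a frame serially and ignores idle power.
    """
    deadline_ms = 1000.0 / fps
    names = sorted(states)
    best_energy, best_choice = None, None
    # Exhaustively try every combination of per-component states.
    for combo in product(*(states[name] for name in names)):
        total_ms, energy_uj = 0.0, 0.0
        for name, (mhz, mw) in zip(names, combo):
            busy_ms = work[name] * 1000.0 / mhz  # Mcycles / MHz = seconds
            total_ms += busy_ms
            energy_uj += mw * busy_ms  # mW * ms = microjoules
        if total_ms > deadline_ms:
            continue  # negative slack: this combination misses the deadline
        if best_energy is None or energy_uj < best_energy:
            best_energy = energy_uj
            best_choice = {n: mhz for n, (mhz, _) in zip(names, combo)}
    return best_choice


if __name__ == "__main__":
    for fps in (30, 60, 120):
        print(fps, pick_dvfs_states(fps))
```

With these made-up numbers, a 30 fps target leaves enough slack for every component to run at its lowest frequency, while a 60 fps target forces higher states, and a 120 fps target cannot be met at all. An exhaustive search is used here only because the tables are tiny; it says nothing about how the dissertation navigates the larger state space.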