Addressing Power, Performance and End-to-End QoS in Emerging Multicores through System-Wide Resource Management

Open Access
- Author:
- Sharifi, Akbar
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- May 21, 2013
- Committee Members:
- Mahmut Taylan Kandemir, Dissertation Advisor/Co-Advisor
Mahmut Taylan Kandemir, Committee Chair/Co-Chair
Chitaranjan Das, Committee Member
Padma Raghavan, Committee Member
Qian Wang, Committee Member
- Keywords:
- Multicore Systems
Resource Management
Power
Performance
End-to-End QoS
- Abstract:
- Multicores are now ubiquitous, owing to the benefits they bring over single-core architectures, including improved performance, lower power consumption, and reduced design complexity. Several resources, ranging from the cores themselves to multiple levels of on-chip caches and off-chip memory bandwidth, are typically shared in a multicore processor. Prudent management of these shared resources for achieving predictable performance and optimizing energy efficiency is critical and has thus received considerable attention in recent times. In my research, I have focused on proposing novel schemes to dynamically manage the various shared resources in emerging multicores, targeting three main goals: (1) maximizing overall system performance, (2) meeting end-to-end QoS targets defined by the system administrator, and (3) optimizing power and energy consumption. We consider a wide range of resources, including cores, shared caches, off-chip memory bandwidth, on-chip communication resources, and power budgets. Further, towards achieving these goals, we employ formal control theory as a powerful tool to meet high-level performance targets by dynamically managing and partitioning the shared resources. Providing end-to-end QoS in future multicores is essential for supporting widespread adoption of multicore architectures in virtualized servers and cloud computing systems. An initial step towards such end-to-end QoS support in multicores is to ensure that at least the major on-chip computational and memory resources are managed efficiently in a coordinated fashion. In this dissertation, we propose a platform for end-to-end on-chip resource management in multicore processors. Assuming that each application specifies a performance target (SLA), the main objective is to dynamically provision sufficient on-chip resources to each application to achieve its specified target.
We employ a feedback-based system, designed as a Single-Input, Multiple-Output (SIMO) controller with an Auto-Regressive-Moving-Average (ARMA) model, to capture the behaviors of different applications. Dynamic management of shared resources with the goal of maximizing overall system performance is another main part of this dissertation. As we move towards many-core systems, interference in the shared cache continues to increase, making shared cache management an important lever for improving overall system performance. We propose a dynamic cache management scheme for multiprogrammed, multithreaded applications, with the objective of obtaining maximum performance for both individual applications and the multithreaded workload mix. On the architectural side, in parallel with increasing core counts, the network-on-chip (NoC) is becoming one of the critical shared components that determine the overall performance, energy consumption, and reliability of emerging multicore systems. Targeting NoC-based multicores, we propose two network prioritization schemes that cooperatively improve performance by reducing end-to-end memory access latencies. In another work on NoCs, focusing on a heterogeneous NoC in which each router has a (potentially) different processing delay, we propose a process variation-aware source routing scheme to enhance communication performance. Our scheme assigns a route to each communication of an incoming application, considering both the router processing latencies resulting from process variation and the communications already present in the network, so as to reduce traffic congestion. Power and energy consumption in multicores is another important area that my research targets.
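The SLA-driven feedback loop described above can be sketched in a few lines. Everything in this sketch (the model order, coefficients, controller gain, and the use of cache ways as the resource knob) is an illustrative assumption, not the controller actually designed in the dissertation:

```python
# Illustrative sketch of an ARMA-model feedback controller that adjusts
# a resource allocation toward a per-application performance target.
# Coefficients, gain, and resource bounds are hypothetical.

class ArmaPerfModel:
    """Predicts next-interval performance from recent performance
    samples and recent resource allocations (ARMA form)."""

    def __init__(self, a, b):
        self.a = a  # weights on past performance samples
        self.b = b  # weights on past resource allocations
        self.perf_hist = [0.0] * len(a)
        self.alloc_hist = [0.0] * len(b)

    def predict(self):
        # Weighted sum of performance and allocation histories.
        return (sum(ai * p for ai, p in zip(self.a, self.perf_hist)) +
                sum(bi * u for bi, u in zip(self.b, self.alloc_hist)))

    def observe(self, perf, alloc):
        # Shift the newest sample into each history window.
        self.perf_hist = [perf] + self.perf_hist[:-1]
        self.alloc_hist = [alloc] + self.alloc_hist[:-1]

def control_step(model, target, alloc, gain=0.5, lo=1, hi=16):
    """One feedback iteration: scale the allocation (e.g. cache ways)
    in proportion to the predicted shortfall against the target."""
    error = target - model.predict()
    new_alloc = alloc + gain * error
    return max(lo, min(hi, new_alloc))  # clamp to the feasible range
```

In a SIMO arrangement, one performance target per application drives several such knobs (cache ways, bandwidth share, core allocation) at once; the single-knob loop above is the simplest instance of the idea.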
In one part of this dissertation, targeting NoC-based multicores, we propose a two-level power budget distribution mechanism, called PEPON, in which the first level distributes the overall power budget of the multicore system among the various types of on-chip resources (cores, caches, and the NoC), and the second level determines the allocation of power to individual instances of each resource type. Both distributions aim to maximize workload performance without exceeding the specified power budget. As the memory system is a large contributor to the energy consumption of a server, there have been prior efforts to reduce its power and energy consumption. DVFS schemes have been used to reduce memory power, but they come with a performance penalty. We propose HiPEMM, a high-performance DVFS mechanism that intelligently reduces memory power by dynamically scaling individual memory channel frequencies. Our strategy also clusters the running applications based on their sensitivity to memory latency and assigns memory channels to the application clusters.
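The channel-level DVFS idea can be illustrated with a small sketch: cluster applications by memory-latency sensitivity, then run the channels serving the insensitive cluster at a lower frequency. The sensitivity threshold, frequency steps, and two-channel mapping below are assumptions for illustration, not HiPEMM's actual policy:

```python
# Hypothetical sketch of sensitivity-based clustering and per-channel
# frequency assignment. Thresholds and frequency steps are illustrative.

FREQ_STEPS_MHZ = [800, 1066, 1333, 1600]  # assumed channel frequency steps

def cluster_by_sensitivity(apps, threshold=0.2):
    """Split apps into latency-sensitive vs. insensitive clusters based
    on a per-app sensitivity score (e.g. measured slowdown per unit of
    added memory latency)."""
    sensitive = {a for a, s in apps.items() if s >= threshold}
    insensitive = set(apps) - sensitive
    return sensitive, insensitive

def assign_channel_freqs(sensitive, insensitive):
    """Map each cluster to its own memory channels: full frequency for
    the sensitive cluster, the lowest step for the insensitive one."""
    return {
        "channel0": (sensitive, FREQ_STEPS_MHZ[-1]),   # keep fast
        "channel1": (insensitive, FREQ_STEPS_MHZ[0]),  # scale down
    }
```

The design intuition is that lowering a channel's frequency only hurts the applications mapped to it, so steering latency-insensitive applications onto the slow channels recovers most of the power savings of memory DVFS while limiting the performance penalty.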