Temperature-Aware Computing
Open Access
- Author:
- Link, Greg
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- May 22, 2006
- Committee Members:
- Vijaykrishnan Narayanan, Committee Chair/Co-Chair
Mary Jane Irwin, Committee Member
Chitaranjan Das, Committee Member
Kenneth Jenkins, Committee Member
- Keywords:
- design automation
architecture
thermal
Temperature
hotspot
hot spot
- Abstract:
- In the future, the peak temperature of a chip will be a primary design constraint. Higher temperatures can accelerate various chip failure mechanisms, reducing the lifetime of the system. These high temperatures also place additional burden on cooling systems, which must prevent thermal runaways due to increased standby power consumption. Consequently, temperature must be considered in the earliest phases of the design process. Many existing thermal management techniques focus on reducing the overall power consumption of the chip by throttling performance, eventually resulting in an overall reduction in chip temperature. These techniques, while effective, often do not address location-specific temperature problems referred to as hotspots. Recent research into hotspots has shown that different functional units in general-purpose processors can have significantly different temperature profiles, and that moving workloads between units can reduce the creation of hotspots on the die. Using a newly developed thermal analysis tool, HS3d, this work explores the thermal profile of modern processor architectures, and discusses the types and characteristics of hotspots in future technologies, as process variation, multi-core design, and multi-wafer stacking techniques become prevalent. Means of mitigating these hotspots are presented, including workload migration for homogeneous architectures, and means of reducing hotspots near the integer ALU. One proposed method, integer offloading to floating-point, redirects integer operations to the floating-point hardware, slightly increasing latency and power consumption, but distributing heat more evenly across the die. Finally, a model of the impact of temperature on circuit timing is presented, and the impact of temperature gradients on multi-core processors is explored, showing that by the 45 nm technology node, thermally induced timing variations of 5% per 10 degrees C are possible.
Traditional worst-case design techniques, which assume a single high temperature for the entire device, therefore cannot take full advantage of the much more common typical-case conditions. This thesis discusses how thermal-aware design can be incorporated into the automated design flow, allowing variable-frequency systems to achieve maximal performance across a wide operating range.