Power Efficiency and Scaling of the Cell Broadband Engine

Open Access
Author:
Johnson, Jacob
Graduate Program:
Computer Science and Engineering
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
April 13, 2009
Committee Members:
  • Padma Raghavan, Thesis Advisor
  • Sanjukta Bhowmick, Thesis Advisor
Keywords:
  • memory bottleneck
  • power efficiency
  • Little's Law
  • Cell processor
  • chip multiprocessor
Abstract:
<p>Since the 1980s, frequency scaling, brought on by Moore's Law, has increased uniprocessor performance at a steady rate and at no cost to application programmers. Recently, however, the limitations of interconnect delay and thermal capacity in the hardware have ended this trend, and frequency scaling has become less viable. Computer architects have turned to the chip multiprocessor (CMP) design paradigm to continue the performance trend, but designing and programming these systems has proven difficult.</p> <p>Although general-purpose CMPs have been commercially available for several years, most have had homogeneous designs and have been used to run multiple independent threads concurrently. The Cell/B.E., on the other hand, is a heterogeneous shared-memory CMP available today, most notably in Sony's PlayStation 3 game console. This thesis evaluates the Cell/B.E. with respect to the design goals of CMPs.</p> <p>In addition to increased performance, another motivating factor in CMP development is power efficiency. We will see that, for a variety of applications, porting code to the Cell processor can yield energy savings of more than 60% over the same applications on a traditional uniprocessor. We will also see the limitations of the Cell processor: the difficulty of programming it and the constraints of its hardware.</p> <p>In CMPs, memory bus width can become as serious a problem as memory latency. With multiple processing cores issuing memory requests simultaneously, bus contention can lead to slowdown even when memory latency would otherwise be completely masked. This thesis therefore also addresses the limitations of scientific computing on the Cell/B.E. with respect to memory bandwidth, and shows how limited bandwidth can diminish or even eliminate the beneficial effects of increased concurrency.</p>
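The bandwidth argument in the final paragraph can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the simple saturation model, and all numeric values are assumptions chosen for illustration. The first function caps parallel speedup once the combined demand of the cores exceeds what a shared bus can deliver; the second applies Little's Law (concurrency = throughput × latency) to estimate how many requests must be in flight to keep a bus busy.

```python
def effective_speedup(cores, per_core_demand_gbs, bus_bandwidth_gbs):
    """Speedup over a single unconstrained core when all cores share one bus.

    Assumed model (hypothetical, not from the thesis): every core wants
    per_core_demand_gbs of memory traffic, and the bus delivers at most
    bus_bandwidth_gbs in total, shared evenly under contention.
    """
    total_demand = cores * per_core_demand_gbs
    if total_demand <= bus_bandwidth_gbs:
        # The bus keeps up, so speedup scales with the number of cores.
        return float(cores)
    # The bus is saturated; adding cores no longer adds throughput.
    return bus_bandwidth_gbs / per_core_demand_gbs


def requests_in_flight(bandwidth_bytes_per_s, latency_s, request_bytes):
    """Little's Law: outstanding requests = throughput * latency."""
    return bandwidth_bytes_per_s * latency_s / request_bytes


if __name__ == "__main__":
    # Illustrative placeholder values, not measurements.
    for n in (1, 2, 4, 8):
        print(n, effective_speedup(n, per_core_demand_gbs=5.0,
                                   bus_bandwidth_gbs=20.0))
    print(requests_in_flight(25.6e9, 100e-9, 128))
```

Under these assumed numbers, speedup grows with core count only until the shared bus saturates and then stays flat, which is the effect the abstract describes as bandwidth diminishing the benefit of increased concurrency.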