COMPILER DIRECTED MEMORY HIERARCHY DESIGN AND MANAGEMENT IN CHIP MULTIPROCESSORS

Open Access
Author:
Ozturk, Ozcan
Graduate Program:
Computer Science and Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
February 07, 2007
Committee Members:
  • Mahmut Taylan Kandemir, Committee Chair
  • Mary Jane Irwin, Committee Member
  • Yuan Xie, Committee Member
  • Aylin Yener, Committee Member
Keywords:
  • memory
  • compiler
  • management
  • chip
  • multiprocessor
  • computer
  • hierarchy
  • design
Abstract:
Two trends, the increasing importance of memory subsystems and the increasing use of chip multiprocessing, motivate research on memory hierarchy optimization for chip multiprocessors. One of the critical research topics along this direction is to design an application-specific, customized, software-managed on-chip memory hierarchy for a chip multiprocessor. Another important issue is to optimize the application code and data for such a customized on-chip memory hierarchy. This thesis proposes solutions to the problem of memory hierarchy design and data access management. First, we give an integer linear programming (ILP) based solution to the combined problem of memory hierarchy design and data allocation in the context of embedded chip multiprocessors. The proposed solution uses compiler analysis to extract the data access patterns of parallel processors and employs ILP to determine the optimal on-chip memory partitioning across processors and data allocations across memory components. Second, we present and evaluate a compiler-driven approach to data compression for reducing memory space occupancy. Our goal is to study how automated compiler support can help in deciding which data elements to compress/decompress and the points during execution at which these compressions/decompressions should be performed. The proposed compiler support achieves this by analyzing the source code of the application to be optimized and identifying the order in which the different data blocks are accessed. Based on this analysis, the compiler then inserts compression/decompression calls into the application code. The compression calls target the data blocks that are not expected to be used in the near future, whereas the decompression calls target those data blocks that are expected to be reused but are currently in compressed form.
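The placement policy for the compression/decompression calls can be illustrated with a toy planner. This is only a sketch of the idea, not the thesis's actual compiler pass: given the block access order recovered by compiler analysis, it schedules a compression call for a block whose next use lies beyond a (hypothetical) distance threshold, and a matching decompression call just before that next use; blocks with no further use are simply compressed.

```python
def plan_compression(trace, window=4):
    """Toy planner for compression/decompression call insertion.

    `trace` is the block access order extracted by compiler analysis;
    `window` is a hypothetical next-use-distance threshold.  Returns a
    list of (access_index, action, block) tuples: a "compress" is
    scheduled when a block's next use is farther than `window` accesses
    away (or the block is dead), and a "decompress" is scheduled at the
    access that reuses it.
    """
    plan = []
    for i, block in enumerate(trace):
        future = trace[i + 1:]
        if block in future:
            next_use = future.index(block) + 1  # distance to next reuse
            if next_use > window:
                plan.append((i, "compress", block))
                plan.append((i + next_use, "decompress", block))
        else:
            # No further use: safe to keep this block compressed.
            plan.append((i, "compress", block))
    return plan
```

For example, on the trace `["A", "B", "A", "C"]` with `window=1`, block A is compressed at access 0 and decompressed at access 2, while B and C are compressed at their last uses.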
Third, we present a constraint network based formulation of the integrated loop-data optimization problem to improve the locality of data accesses for a given application code. We present two alternative solutions to the data locality problem under our constraint network based formulation and discuss the pros and cons of each scheme. The first solution is purely backtracking based, whereas the second improves upon the first by employing three additional optimizations, including backjumping. Fourth, we extend our constraint network based approach to code parallelization for chip multiprocessors. Fifth, we explore how processor cores and storage blocks can be placed in a 3D architecture to minimize data access costs under temperature constraints. This topology exploration process is formulated using ILP. Finally, we present a scratch pad memory space management technique for chip multiprocessors. We focus on the management of a scratch pad space shared by multiple concurrently executing applications that can potentially share data. The proposed approach has three major components: a compiler analysis phase, a runtime space partitioner, and a local partitioning phase. This thesis also presents experimental evidence demonstrating the success of each of the proposed techniques, and compares our results with those obtained through previously proposed approaches.
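The first constraint network scheme can be sketched with a minimal chronological backtracking solver. The variable and domain names below are hypothetical (two loop tile sizes constrained to fit a cache budget), and this toy omits the thesis's three additional optimizations; backjumping, for instance, would return past assignment levels that did not contribute to the current dead end instead of undoing them one at a time.

```python
def consistent(var, value, assignment, constraints):
    """Check the binary constraints linking `var` to assigned variables."""
    for (a, b), pred in constraints.items():
        if a == var and b in assignment and not pred(value, assignment[b]):
            return False
        if b == var and a in assignment and not pred(assignment[a], value):
            return False
    return True

def backtrack(variables, domains, constraints, assignment=None):
    """Pure chronological backtracking over a binary constraint network.

    A toy analogue of the first solution scheme: assign variables in
    order, and on a dead end undo the most recent assignment and try
    the next value.  Returns a complete consistent assignment, or None.
    """
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return dict(assignment)  # all variables consistently assigned
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        if consistent(var, value, assignment, constraints):
            assignment[var] = value
            result = backtrack(variables, domains, constraints, assignment)
            if result is not None:
                return result
            del assignment[var]  # dead end: chronological backtrack
    return None
```

As a usage example with hypothetical tile-size variables `ti` and `tj` whose product must reach a working-set target, `backtrack(["ti", "tj"], {"ti": [16, 32, 64], "tj": [16, 32, 64]}, {("ti", "tj"): lambda a, b: a * b >= 2048})` first exhausts `ti = 16`, backtracks, and settles on `{"ti": 32, "tj": 64}`.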