PERFORMANCE ANALYSIS AND VISUALIZATION FOR HIGH PERFORMANCE PARALLEL I/O

Open Access
Author:
Kim, Seong Jo
Graduate Program:
Computer Science and Engineering
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
None
Committee Members:
  • Mahmut Taylan Kandemir, Thesis Advisor
  • Chitaranjan Das, Thesis Advisor
Keywords:
  • MPI-IO
  • PVFS
  • collective I/O
  • I/O stack
  • code instrumentation
  • parallel I/O visualization
Abstract:
Efficient execution of large-scale scientific applications requires high-performance computing systems designed to meet their I/O requirements. Most I/O- and data-intensive parallel applications access multiple layers of the I/O stack during their operations. Typical I/O requests from these applications may include accesses to high-level I/O libraries such as Parallel netCDF and HDF5, the MPI I/O library, and parallel file systems such as PVFS, which are in turn supported by native file systems in Linux. To design and implement parallel applications that exercise this I/O stack, one must understand the flow and interaction of I/O calls through the entire I/O stack. Such understanding helps identify I/O bottlenecks and thus exploit the potential performance in different layers of the storage hierarchy. To trace the execution of I/O calls and to understand the complex interactions among multiple user libraries and file systems, we implement a GUI-based integrated profiling and analysis environment, PAVIS. Our implementation automatically generates an instrumented I/O stack, runs applications, and visualizes detailed statistics in terms of user-specified metrics of interest.