Description
General-purpose graphics processing units (GPGPUs) have emerged as an important class of shared-memory parallel processing architectures, with widespread deployment in every computer class from high-end supercomputers to embedded mobile platforms. Relative to more traditional multicore systems of today, GPGPUs have distinctly higher degrees of hardware multithreading (hundreds of hardware thread contexts vs. tens), a return to wide vector units (several tens of lanes vs. 1-10), memory architectures that deliver higher peak memory bandwidth (hundreds of gigabytes per second vs. tens), and smaller caches/scratchpad memories (less than 1 megabyte vs. 1-10 megabytes). In this book, we provide a high-level overview of current GPGPU architectures and programming models. We review the principles developed for earlier shared-memory parallel platforms, focusing on recent results in both the theory and practice of parallel algorithms, and suggest how they connect to GPGPU platforms. We aim to help architects understand how algorithm characteristics map onto GPGPU hardware. We also provide detailed performance analysis and guide optimization efforts spanning high-level algorithmic choices down to low-level instruction-level tuning. As a case study, we use the fast multipole method (FMM), a technique for n-body particle simulation. We also briefly survey the state of the art in GPU performance analysis tools and techniques.

Table of Contents: GPU Design, Programming, and Trends / Performance Principles / From Principles to Practice: Analysis and Tuning / Using Detailed Performance Analysis to Guide Optimization
About the Authors
Hyesoon Kim is an Assistant Professor in the School of Computer Science at the Georgia Institute of Technology. Her research interests include high-performance, energy-efficient heterogeneous architectures; programmer-compiler-microarchitecture interaction; and developing tools to aid parallel programming. She received a B.A. in mechanical engineering from the Korea Advanced Institute of Science and Technology (KAIST), an M.S. in mechanical engineering from Seoul National University, and an M.S. and a Ph.D. in computer engineering from The University of Texas at Austin. She received an NSF CAREER Award in 2011.

Richard (Rich) Vuduc is an Assistant Professor in the School of Computational Science and Engineering at the Georgia Institute of Technology. His research lab, The HPC Garage, is interested in high-performance computing, with an emphasis on parallel algorithms, performance analysis, and performance tuning. His lab's work has been recognized by numerous best paper awards, and his lab was part of the team that won the 2010 Gordon Bell Prize, supercomputing's highest performance achievement award. He is a recipient of the National Science Foundation's CAREER Award (2010) and has served as a member of the Defense Advanced Research Projects Agency's Computer Science Study Group (2009). Rich received his Ph.D. from the University of California, Berkeley, and was a postdoctoral scholar at Lawrence Livermore National Laboratory.

Sara S. Baghsorkhi is a research scientist in the Programming System Lab at Intel, Santa Clara. She received her Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign. Her primary areas of research include auto-tuning and code generation for high-performance computer architectures, with a focus on wide vector SIMD designs. She has published 10 research papers and holds 7 patents.

Jee W. Choi is a fifth-year Ph.D. student in the School of Electrical and Computer Engineering at the Georgia Institute of Technology.
His research interests include performance and power modeling for multicore, accelerator, and heterogeneous systems. Jee received his B.S. and M.S. from the Georgia Institute of Technology.

Wen-mei W. Hwu is the Sanders-AMD Endowed Chair Professor in the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. His research interests are in the areas of architecture, compilation, and programming techniques for high-performance, energy-efficient computer systems. He is well known for developing the IMPACT compiler technology for predicated execution and speculative execution, which is widely used in DSP and GPU cores and compilers today. He is the chief scientist of the Parallel Computing Institute and a Co-PI of the $208M NSF Blue Waters supercomputer project. He is a co-founder and CTO of MulticoreWare. For his contributions to compiler optimization and computer architecture, he received the 1993 Eta Kappa Nu Outstanding Young Electrical Engineer Award, the 1994 University Scholar Award of the University of Illinois, the 1998 ACM SIGARCH Maurice Wilkes Award, the 1999 ACM Grace Murray Hopper Award, the ISCA Influential Paper Award, and the Distinguished Alumni Award in Computer Science of the University of California, Berkeley. Dr. Hwu has also been at the forefront of computer engineering education. He and David Kirk jointly created an undergraduate heterogeneous parallel programming course at the University of Illinois (ECE498AL - Programming Parallel Processors), which has become ECE408/CS483 - Applied Parallel Programming. The course has been adopted by many universities, and Hwu and Kirk have been offering summer-school versions of it worldwide. In 2010, Kirk and Hwu published the textbook for the course, "Programming Massively Parallel Processors - A Hands-on Approach," with Elsevier. As of 2012, more than 12,000 copies have been sold. For his teaching and contributions to education, he has received the 1997 Eta Kapp