Tutorial on Architecture and Compiler Support for Speculative Precomputation

12th Annual International Conference on Parallel Architectures and Compilation Techniques
New Orleans, Louisiana, Sept. 27-Oct. 1, 2003

Donald Yeung (UMCP), Dean Tullsen (UCSD), and Steve Shih-wei Liao (Intel)

· Intended Audience

This tutorial is intended for university and industry computer architects who are interested in recent research developments in the area of Speculative Precomputation.

· Abstract

Speculative Precomputation, or pre-execution, is a new latency tolerance technique that uses spare hardware contexts in a multithreaded processor to accelerate the execution of a single-threaded executable. In contrast to traditional parallelism, it does this without offloading any of the computation of the original program. Instead, code running in the spare contexts precomputes data that allows the main program to eliminate performance-degrading events, such as cache misses and branch mispredictions. This precomputation code is typically borrowed from the original program (using hand-coded, compiler-driven, or automatic techniques), which allows it to precompute or prefetch things that traditional pattern-based techniques for cache prefetching and branch prediction cannot. The increasing availability of multithreaded processors, coupled with the increasing importance of memory latencies and branch mispredictions to processor performance, makes these techniques relevant and important. Researchers from both academia and industry have recently proposed and evaluated various techniques in this area. This tutorial covers several topics related to such recent developments, including architecture support for executing speculative threads, hardware and compiler techniques for extracting effective precomputation code, and performance evaluation of Speculative Precomputation systems on both simulators and silicon. In addition to covering current techniques and performance, this tutorial will also discuss the impact of Speculative Precomputation on processor and compiler design developments in industry.
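
As a rough illustration of the idea in the abstract, the following Python sketch simulates a main thread walking a linked list while a helper slice, built from only the pointer-chasing part of the same loop, runs a few nodes ahead and warms a toy cache. Everything in it is an assumption made for illustration: the `Cache` model, `build_list`, `run`, the `lead` distance, and the cache capacity are hypothetical names and parameters, not taken from the tutorial or from any real hardware.

```python
import random
from collections import OrderedDict


class Cache:
    """Toy fully associative LRU cache (hypothetical model); counts demand misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.demand_misses = 0

    def _touch(self, addr):
        """Bring addr into the cache; return True if it was already present."""
        hit = addr in self.lines
        if hit:
            self.lines.move_to_end(addr)
        else:
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used
        return hit

    def load(self, addr):
        """Demand access by the main thread; a miss here would stall it."""
        if not self._touch(addr):
            self.demand_misses += 1

    def prefetch(self, addr):
        """Access by the helper thread; its misses are off the critical path."""
        self._touch(addr)


def build_list(n, seed=0):
    """Linked list with scattered addresses, so a stride predictor would not help."""
    addrs = random.Random(seed).sample(range(10 * n), n)
    next_ptr = {a: b for a, b in zip(addrs, addrs[1:])}
    next_ptr[addrs[-1]] = None
    return addrs[0], next_ptr


def run(next_ptr, head, use_helper, lead=4, capacity=8):
    """Traverse the list; optionally let a helper slice run `lead` nodes ahead."""
    cache = Cache(capacity)
    main, helper, ahead, total = head, head, 0, 0
    while main is not None:
        if use_helper:
            # Helper slice: only the pointer chase borrowed from the main loop.
            while helper is not None and ahead < lead:
                cache.prefetch(helper)
                helper = next_ptr[helper]
                ahead += 1
        cache.load(main)       # main thread's demand access
        total += main          # stand-in for the main thread's real work
        main = next_ptr[main]
        ahead -= 1
    return total, cache.demand_misses


head, next_ptr = build_list(64)
print("baseline   :", run(next_ptr, head, use_helper=False))
print("with helper:", run(next_ptr, head, use_helper=True))
```

In this toy model the baseline run takes a demand miss on every node, while the run with the helper slice takes none, and both runs compute the same sum because the helper never contributes to the program's result. A real system would also have to handle helper threads that run too far ahead or fall behind, which this sketch sidesteps by fixing `lead` below the cache capacity.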

· Outline

· Introduction (slides)

· Architectural Support for Speculative Precomputation (slides)

o Integrating precomputation information into the main thread
o Register integration
o Dynamic Speculative Precomputation
o Monitoring and controlling speculative threads
o Support for spawning speculative threads

· Compiler Support for Speculative Precomputation (slides)

o Strategies for compiler-based Speculative Precomputation
o Compiler optimizations for precomputation code
o Exploiting speculation to increase aggressiveness of compiler optimizations
o Compiler implementation tradeoffs

· Speculative Precomputation: an industrial perspective (slides)

· Presenters' Bios

Donald Yeung received his Ph.D. in 1998 from the Massachusetts Institute of Technology, where he was a member of the MIT Alewife Project. Currently, Dr. Yeung is an Assistant Professor in the Electrical and Computer Engineering Department at the University of Maryland at College Park, and co-directs the University of Maryland's Systems and Computer Architecture Laboratory. His research interests lie in the areas of computer architecture, performance evaluation of computer systems, and the interaction of architectures, compilers, and applications. Dr. Yeung is a recipient of an NSF Faculty Early Career Development Award.

Dean Tullsen received his Ph.D. from the University of Washington in 1996, where he did his dissertation on simultaneous multithreading. He is an Associate Professor in the Computer Science and Engineering Department at UCSD, where he co-directs the High-performance Processor Architecture and Compilation Lab. His research interests are in high-performance computer architecture, including multithreading architectures, memory and cache subsystems, and architecture-compiler interaction. He holds three patents in the area of multithreading architectures. Dr. Tullsen is a recipient of an NSF Faculty Early Career Development Award.

Steve Shih-wei Liao currently works in the Microprocessor Research Group at Intel Labs. His research interests are in advanced microarchitectures and compiler optimizations. He received his B.S. degree from National Taiwan University, and M.S. and Ph.D. degrees from Stanford University.

· Contact

Donald Yeung
1327 A. V. Williams
College Park, MD 20742
voice: (301) 405-3649
fax: (301) 314-9281