Programmable and Energy Efficient Extreme-Scale Processors
Overview:
Today, integrating 4-8 state-of-the-art cores or tens of smaller
cores on a single chip is commonplace. Since Moore's Law scaling is
expected to continue for the foreseeable future, processors with as
many as 1000 cores will become possible within a few processor
generations. This project is investigating programming and
architectural support for such extreme-scale processors. Recent areas
of research include reuse distance analysis for evaluating
extreme-scale processors, scalability of processors out to extreme
scale, cache management techniques, locality optimizations, implicit
synchronization, and techniques for power efficiency.
People:
Faculty
Students
Alumni
Publications:
- Earlier Related Tech Report:
Meng-Ju Wu and Donald Yeung. Memory Performance Analysis for
Parallel Programs Using Concurrent Reuse Distance.
University of Maryland Institute for Advanced Computer Studies
Technical Report, UMIACS-TR-2010-10. October 2010.
Eric Lau, Jason Miller, Inseok Choi, Donald Yeung, Saman
Amarasinghe, and Anant Agarwal. Multicore Performance Optimization
Using Partner Cores. In Proceedings of the 3rd USENIX
Workshop on Hot Topics in Parallelism (HotPar '11). Berkeley,
CA. May 2011.
Inseok Choi, Minshu Zhao, Xu Yang, and Donald Yeung.
Experience with Improving Distributed Shared Cache Performance on
Tilera's Tile Processor. IEEE Computer Architecture
Letters. Vol 10, No 2. July-December 2011.
- Earlier Related Workshop Paper:
Inseok Choi, Minshu Zhao, Xu Yang, and Donald Yeung. Early
Experience with Profiling and Optimizing Distributed Shared Cache
Performance on Tilera's Tile Processor. In Proceedings of
the 6th International Workshop on Unique Chips and Systems.
Atlanta, GA. December 2010. Selected as one of the 2 best papers
out of the 12 appearing in the workshop.
Wanli Liu and Donald Yeung. Using Aggressor Thread Information
to Improve Shared Cache Management for CMPs. In Proceedings
of the 18th International Conference on Parallel Architectures and
Compilation Techniques. Raleigh, NC. September 2009.
- Earlier Related Tech Report:
Wanli Liu and Donald Yeung. Probabilistic Replacement:
Enabling Flexible Use of Shared Caches for CMPs.
University of Maryland Institute for Advanced Computer
Studies Technical Report, UMIACS-TR-2008-13. July 2008.
ACM permission notice:
The documents contained in these directories are included by the
contributing authors as a means to ensure timely dissemination of
scholarly and technical work on a non-commercial basis. Copyright and
all rights therein are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their works here
electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each
author's copyright. These works may not be reposted without the
explicit permission of the copyright holder.
ACM copyright notice:
Copyright (c) 2000 by the Association for Computing Machinery,
Inc. Permission to make digital or hard copies of part or all of this
work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on
the first page. Copyrights for components of this work owned by
others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, or to
redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from Publications Dept, ACM Inc., fax +1
(212) 869-0481, or permissions@acm.org.
Funding:
Last updated: May 2012 by Donald Yeung (yeung@ece.umd.edu)