Algorithms
Data Structures and Algorithms for Maintaining Data Locality
PI: Michael A. Bender
As modern memory architectures grow in complexity, it is becoming increasingly important to design algorithms with high data locality.
Standard approaches parameterize algorithms by aspects of the memory hierarchy, such as the size and block size of each memory level. Unfortunately, this parameterization often leads to complex algorithms that are tuned to particular architectures. A promising new line of research is to develop memory-hierarchy-sensitive algorithms that avoid any memory-specific parameterization. Such platform-independent algorithms are said to be "cache-oblivious." If a cache-oblivious algorithm works optimally on a two-level hierarchy, then it works optimally on all levels of a multilevel memory hierarchy: cache-oblivious algorithms automatically tune to arbitrary memory architectures. This research involves maintaining data locality in irregular and dynamic settings, where the data flow is continually changing and unpredictable. The investigator will design cache-oblivious solutions for a variety of common problems in data manipulation. For example, the investigator has already designed the cache-oblivious B-tree. The cache oblivious B-tree simultaneously exploits locality at all granularities in a memory hierarchy, even when the hierarchy is dynamic and without any parameter tuning. He has already proved theoretically that a cache-oblivious B-tree has worst-case expected performance within roughly twice optimal for any block size. However, because the structure runs efficiently even with memory effects such as disk-prefetching, the empirical performance is even better, and cache-oblivious B-trees have been shown to outperform B-trees by over a factor of 2 in searches and an order of magnitude in range queries. (NSF)
Transactions EverywhereÂ
PIs: Michael A. Bender (with B. Kuszmaul, MIT and C. Leiserson, MIT)Â
Among the most basic problems inherent in the coordination of concurrent tasks is the enforcing of atomicity so that the partial results of one task do not inadvertently corrupt another task. Atomicity is typically enforced through locking protocols, but these protocols can introduce other complications, such as deadlock, unless restrictive methodologies in their use are adopted. This research proposes a novel approach, called transactions everywhere, which offers a way out of the rat's nest of concurrency protocols, allowing ordinary programmers to exploit parallelism freely. In 1993, Herlihy and Moss proposed transactional memory as an alternative mechanism for enforcing atomicity, since it allows the user to avoid many of the complications of locking. With transactional memory, a program can read and modify multiple, disparate memory locations as a single atomic operation, much as occurs within a database transaction. But, despite the innovative nature of Herlihy and Moss's proposal, hardware transactional memory (HTM) has never been implemented in a real system. Instead, the trend has been towards software transactional memory (STM), the overhead of which discourages the use of transactions.Â
This vertically integrated research project refocuses on Herlihy and Moss's original proposal of HTM, but adopting the point of view that transactions, rather than occurring infrequently in code, should be the rule, not the exception. The researchers contend that the transactions-everywhere approach can simplify parallel programming dramatically and that hardware support can make overheads negligible.Â
Transactions everywhere will make it far easier for all programmers, not just those who are specialists in today's arcane practices of parallel computing, to write correct high-performance multithreaded programs, thereby allowing concurrency to be employed in routine applications for ordinary users. This research project aims to confront and overcome the problems in developing transactions everywhere so that transactional memory may eventually become, like cache, an expected subsystem of any computer architecture. (NSF ITR)
Adversarial Contention Resolution
PI: Michael A. BenderÂ
Randomized backoff remains the method of choice for resolving contention in the use of multiple-access channels. The idea of backoff is that whenever a packet experiences a collision with another packet when attempting to use the channel, it retries, but with a diminished probability of transmission per subsequent time slot. If all packets cooperate in using this strategy and the channel is not oversubscribed, all packets are eventually transmitted without interference from other packets. Randomized backoff is perhaps best known in the context of the ethernet local-area network, although it is used in many different types of systems for resource sharing. Given the prominent role played by randomized backoff in computer systems, it is surprising that many aspects of backoff are not understood. To date, most analysis has focused on statistical arrival of packets on simple channels. The assumption of statistical arrival rates dangerously neglects the worst-case, however, because bursts and other pathological inputs are often the common case. Furthermore, many channels are not simple. This research develops a theory of the worst-case performance of backoff algorithms under various assumptions about the channel.
The research is guided by contention-resolution problems in transactional memory applications, which have multiple channels, have packets sizes that differ by orders of magnitude, can sometimes provide more feedback on collisions, and allow one packet to succeed in the case of a collision. The research provides a solid theoretical foundation and performance model for the many applications having multiple access channels. Without these performance models, implementations are likely to exhibit unpredictable performance and thus be buggy. For the target application of transactional programming in shared memory, a proper understanding of contention resolution is on the critical path for creating a viable system. Given the importance of contention resolution to many systems, this first attempt at understanding adversarial contention resolution will likely yield a rich bounty of theoretical problems whose solutions can have a major impact on the performance of many systems. (NSF)
Studies on Average ComplexityÂ
PI: Ker-I Ko
This project studies the average complexity of computational problems in three directions:Â
1) It studies the notion of average NP-completeness, particularly the notion of reductions for average NP-complete problems. It proposes a new tool for proving new reductions.Â
2) It investigates the relations between instance complexity and average time-complexity. It aims to use the notion of instance complexity to characterize the notion of pseudo randomness and apply this concept to define the notion of average hardness.Â
3) It studies the average complexity of numerical computation in the Turing machine-based theory of real-valued complexity theory. It proposes to use measure-theoretic concepts in this study.Â
The topics studied in this project are closely related to the design and analysis of algorithms, including the algorithms for the network design, data mining, and search algorithms, which are central in wireless and information technology. Advances in this research will also have a substantial effect on many areas of information

