Enabling Technologies for On-Chip Networks
Future large-scale chip multiprocessors will rely on sophisticated on-chip interconnection networks to provide efficient core-to-core communication. These networks must deliver high throughput and low latency while meeting stringent power and area constraints.
Our research focuses on developing enabling technologies for such on-chip networks. We investigate aspects ranging from the circuit level, where we design efficient implementations of basic building blocks such as channels, buffers, and crossbar switches, to high-level aspects such as network topologies, flow-control techniques, routing algorithms, and fairness considerations.
Because of their long physical length, channels are usually the dominant source of energy dissipation within a network-on-chip (NoC). Fortunately, their highly regular structure means they can benefit greatly from custom circuit design. Low-swing channels can yield a 2-4x improvement in energy efficiency over standard full-swing repeaters. However, worsening device mismatch is making it increasingly difficult to build low-swing channels with good reliability. We are investigating the design of a self-calibrating low-swing channel that can automatically detect and correct for input offset voltages in the receivers caused by device mismatch. This allows us to fully realize the energy benefits of low-swing channels without paying the price in reliability.
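As a rough illustration of where the energy savings of a low-swing channel come from, the sketch below uses the standard first-order model for charging a capacitive wire from a supply. The function names and all numeric values (wire capacitance, supply voltage, swing) are assumptions chosen for illustration, not measurements from this project, and the model ignores driver, receiver, and calibration overheads.

```python
def full_swing_energy(c_wire, vdd):
    """Energy drawn from the supply to charge the wire from 0 to vdd: C * Vdd^2."""
    return c_wire * vdd * vdd


def low_swing_energy(c_wire, v_swing, v_supply):
    """Energy drawn from a supply at v_supply to raise the wire by v_swing:
    charge moved (C * Vswing) times the supply voltage."""
    return c_wire * v_swing * v_supply


def energy_ratio(vdd, v_swing, v_supply):
    """Full-swing energy divided by low-swing energy (wire capacitance cancels)."""
    return full_swing_energy(1.0, vdd) / low_swing_energy(1.0, v_swing, v_supply)


if __name__ == "__main__":
    # Hypothetical operating point: 1.0 V logic supply, 0.25 V swing,
    # with the low-swing driver still powered from the 1.0 V supply.
    print(energy_ratio(vdd=1.0, v_swing=0.25, v_supply=1.0))
```

In this simplified model the saving scales with the ratio of supply voltage to swing when the driver is fed from the full supply, and with the square of that ratio when a dedicated low-voltage supply is used; real designs recover less because of the overheads the model leaves out.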
We will evaluate conventional packet switching against elastic buffering and circuit switching, choosing the best-suited topology for each. Both elastic buffering and circuit switching remove all buffering from the routers.
Evaluations of elastic buffering suggest that it can yield significant power savings, with latency that at best equals that of packet switching.
We are investigating various microarchitectural cost-performance tradeoffs for NoC routers. In particular, we are evaluating different allocator architectures in terms of their matching efficiency, their cost (delay, area, and power), and their impact on network-level performance, and we are developing improved allocation mechanisms based on the insights gained from this study. We are also evaluating design tradeoffs in input buffer organization and management; specifically, we are developing methods for avoiding the performance pathologies commonly associated with dynamically managed input buffers.
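To make the notion of matching efficiency concrete, the following sketch compares the number of grants produced by a simple separable input-first allocator with the size of a maximum matching for the same request matrix. It is a minimal illustration under stated assumptions, not the allocator designs studied in this project: the function names, the priority-pointer arbiters, and the example request matrix are all hypothetical.

```python
def separable_input_first(requests, in_ptr=None, out_ptr=None):
    """requests[i][j] is truthy if input i requests output j.
    Returns a dict mapping each granted input to its output."""
    n_in, n_out = len(requests), len(requests[0])
    in_ptr = in_ptr or [0] * n_in
    out_ptr = out_ptr or [0] * n_out

    # Stage 1: each input arbiter picks one requested output,
    # searching from its priority pointer.
    choice = {}
    for i in range(n_in):
        for k in range(n_out):
            j = (in_ptr[i] + k) % n_out
            if requests[i][j]:
                choice[i] = j
                break

    # Stage 2: each output arbiter picks one of the inputs that chose it.
    grants = {}
    for j in range(n_out):
        for k in range(n_in):
            i = (out_ptr[j] + k) % n_in
            if choice.get(i) == j:
                grants[i] = j
                break
    return grants


def maximum_matching_size(requests):
    """Size of a maximum bipartite matching, found with augmenting paths."""
    n_in, n_out = len(requests), len(requests[0])
    owner = [None] * n_out  # owner[j] is the input currently matched to output j

    def augment(i, seen):
        for j in range(n_out):
            if requests[i][j] and j not in seen:
                seen.add(j)
                if owner[j] is None or augment(owner[j], seen):
                    owner[j] = i
                    return True
        return False

    return sum(augment(i, set()) for i in range(n_in))


if __name__ == "__main__":
    # Hypothetical 3x3 request matrix: inputs 0 and 1 both request output 0.
    requests = [[1, 1, 0],
                [1, 0, 0],
                [0, 0, 1]]
    grants = separable_input_first(requests)
    print(len(grants), "grants out of a possible", maximum_matching_size(requests))
```

In this example both inputs 0 and 1 select output 0 in the first stage, so one of them loses in the second stage even though input 0 could have been granted output 1. The ratio of grants to the maximum matching size is one simple way to quantify matching efficiency.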
To facilitate detailed cost-benefit analysis of proposed microarchitectural enhancements, we have developed a highly parameterized, fully synthesizable RTL implementation of a state-of-the-art virtual-channel (VC) router. We are currently validating the design; once validation is complete, we plan to release the full Verilog source code to the research community as open source.
Most applications use simple topologies (e.g., a 2D mesh), which have a large network diameter and a high average hop count. A two-dimensional flattened butterfly, in which every router is directly connected to all other routers in its row and column, guarantees at most two hops to every destination under minimal routing. This reduces both latency and power.
We will also investigate duplicating links in every direction and dividing the network into sub-networks.
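The hop-count difference between the two topologies can be checked with a short calculation. The sketch below assumes a k x k network with one router per coordinate, minimal routing, and hop counts measured in router-to-router links; the function names and the choice of k = 4 are illustrative assumptions, and concentration (several cores sharing a router) is not modeled.

```python
from itertools import product


def mesh_hops(src, dst):
    """Minimal hops in a 2D mesh: Manhattan distance between routers."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def flattened_butterfly_hops(src, dst):
    """Minimal hops in a 2D flattened butterfly: one hop for each
    dimension in which the two router coordinates differ."""
    return int(src[0] != dst[0]) + int(src[1] != dst[1])


def diameter(k, hops):
    """Largest minimal hop count between any two routers in a k x k network."""
    routers = list(product(range(k), repeat=2))
    return max(hops(s, d) for s in routers for d in routers)


def average_hops(k, hops):
    """Average minimal hop count over all ordered pairs of distinct routers."""
    routers = list(product(range(k), repeat=2))
    total = sum(hops(s, d) for s in routers for d in routers if s != d)
    return total / (len(routers) * (len(routers) - 1))


if __name__ == "__main__":
    k = 4  # hypothetical 4 x 4 network
    for name, hops in (("mesh", mesh_hops),
                       ("flattened butterfly", flattened_butterfly_hops)):
        print(name, diameter(k, hops), round(average_hops(k, hops), 2))
```

For the flattened butterfly the diameter stays at two regardless of k, whereas the mesh diameter grows as 2(k - 1). The price is a larger number of longer links and higher-radix routers, which this sketch does not account for.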
James Chen
Nan 'Ted' Jiang