In this paper, we address a key challenge in designing flow-based traffic managers (TMs) for next-generation networks. One key function of a TM is to schedule the departure of packets on egress ports so that each flow stays within its allowed bandwidth quota. A TM handles policing, shaping, scheduling, and queuing. Queuing, the last of these, is a core function in traffic management and a bottleneck in high-speed network devices. Aiming at high throughput and low latency, we propose a single-instruction-multiple-data (SIMD) hardware priority queue (PQ) that sorts packets in real time and supports each of the three basic operations (enqueue, dequeue, and replace) independently in a single clock cycle. A proof of validity of the proposed hardware PQ data structure is presented. The PQ architecture is coded in C++, and Vivado high-level synthesis is used to generate synthesizable register transfer logic from the C++ model. The implementation on a ZC706 field-programmable gate array (FPGA) shows that the proposed solution scales across various queue depths with almost constant performance. It offers a 10× throughput improvement over prior works and supports links operating at 100 Gb/s.
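To make the three operations named above concrete, the following is a minimal software sketch of a bounded priority queue offering enqueue, dequeue, and replace. It is not the SIMD architecture proposed in the paper: it assumes a plain sorted array, a smaller priority value meaning earlier departure, and sequential shifting loops that a hardware design would presumably perform in parallel. The names `Element`, `SortedArrayPQ`, and `packet_id` are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical queue entry: a priority tag plus a reference to the packet.
// Assumption: a smaller priority value departs first.
struct Element {
    std::uint32_t priority;
    std::uint32_t packet_id;
};

// Bounded priority queue kept as an ascending sorted array (illustrative only).
template <std::size_t Depth>
class SortedArrayPQ {
public:
    bool empty() const { return size_ == 0; }
    bool full() const { return size_ == Depth; }
    std::size_t size() const { return size_; }

    // Enqueue: insert e while keeping ascending order; returns false if full.
    // Elements with equal priority keep their arrival order.
    bool enqueue(Element e) {
        if (full()) return false;
        std::size_t i = size_;
        while (i > 0 && data_[i - 1].priority > e.priority) {
            data_[i] = data_[i - 1];  // shift larger entries one slot right
            --i;
        }
        data_[i] = e;
        ++size_;
        return true;
    }

    // Dequeue: remove and return the head (smallest priority value).
    Element dequeue() {
        assert(!empty());
        Element head = data_[0];
        for (std::size_t i = 1; i < size_; ++i) {
            data_[i - 1] = data_[i];  // shift remaining entries one slot left
        }
        --size_;
        return head;
    }

    // Replace: return the head and insert e in the same operation.
    // The number of stored elements does not change.
    Element replace(Element e) {
        assert(!empty());
        Element head = data_[0];
        std::size_t i = 0;
        while (i + 1 < size_ && data_[i + 1].priority <= e.priority) {
            data_[i] = data_[i + 1];  // slide smaller-or-equal entries left
            ++i;
        }
        data_[i] = e;
        return head;
    }

private:
    std::array<Element, Depth> data_{};
    std::size_t size_ = 0;
};
```

In this sketch, replace avoids the two full shifts that a dequeue followed by an enqueue would cost, which is one reason to treat it as a separate basic operation.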
Benacer, I., Boyer, F. R., & Savaria, Y. (2018). A fast, single-instruction-multiple-data, scalable priority queue. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 26(10), 1939–1952. https://doi.org/10.1109/TVLSI.2018.2838044