Network-adaptive low-latency video communication over best-effort networks
- ISSN: 10518215
- DOI: 10.1109/TCSVT.2005.856919
Abstract
The quality-of-service limitations of today's best-effort networks pose a major challenge for low-latency video communication. To combat network losses for real-time and on-demand video communication, which exhibits stronger dependency across packets, a network-adaptive coding scheme is employed to dynamically manage the packet dependency using optimal reference picture selection. The selection of the reference is achieved within a rate-distortion optimization framework and is adapted to the varying network conditions. For network-adaptive streaming of prestored video, based on an accurate loss-distortion model, a prescient scheme that optimizes the dependency of a group of packets is proposed to achieve global optimality as well as improved rate-distortion performance. With the improved trade-off between compression efficiency and error resilience, the proposed system does not require retransmission of lost packets, which makes low-latency communication with less than one second of delay possible.
Yi J. Liang, Member, IEEE, and Bernd Girod, Fellow, IEEE
Index Terms—Error resilience, H.264, low latency, network-
adaptive video coding, rate-distortion optimization, reference
picture selection, video streaming.
I. INTRODUCTION
Since the introduction of the first commercial products in 1995,
Internet video communication has experienced phenomenal growth [1].
However, despite the rapid expansion of the underlying infrastructure,
technological challenges are still a major barrier to the wide
adoption of online streaming media today. Internet video communication
today is plagued by variability in throughput, packet loss, and delay,
due to network congestion and the heterogeneous infrastructure. To
mitigate these effects, media streaming systems typically employ a
large receiver buffer that introduces a latency of 5–15 s. This is
undesirable since the slow start-up is annoying and high latency
severely impairs interactive playback features, such as VCR
functionality.
In contrast, for IP-based speech communication, the end-to-end latency
can usually be kept on the order of a hundred milliseconds [2].
Nevertheless, low latency in video streaming is much more difficult to
achieve due to the sensitivity of the compressed video stream to
channel losses. For speech coding, the dependency across successive
data units is weak or there is no dependency at all. For typical
motion-compensated video coding, as is used in most of today’s codecs,
this dependency is much stronger. An inter-coded frame is predicted
from a reference picture with motion compensation, so that the
temporal redundancy across successive pictures is removed or reduced
to provide higher coding efficiency. However, proper decoding of such
inter-coded pictures depends on the error-free reception and
reconstruction of the reference pictures they use, which is not
guaranteed over lossy networks.
Manuscript received January 7, 2004; revised October 15, 2004. This work
was completed at the Department of Electrical Engineering, Stanford University,
and was supported in part by Hewlett Packard Laboratories and in part by the
Stanford Network Research Center (SNRC). This paper was recommended by
J. Arnold.
Y. J. Liang is with Qualcomm CDMA Technologies, San Diego, CA 92121
USA (e-mail: yiliang@stanfordalumni.org).
B. Girod is with the Department of Electrical Engineering, Stanford
University, CA 94305 USA.
Digital Object Identifier 10.1109/TCSVT.2005.856919
Assume a simplified scenario where an IP packet contains one
video frame. If a packet (frame) is lost, the proper reconstruc-
tion of all subsequent frames that depend on the lost frame is
affected. Hence, in a typical automatic repeat request (ARQ)-
based system, whenever a packet is lost, retransmission is re-
quired to guarantee the correct reception of each frame. How-
ever, the time for retransmission constitutes the major part of
the undesirable end-to-end delay.
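The error propagation described above can be illustrated with a minimal sketch. It assumes the simplified one-frame-per-packet scenario and that every frame references at most one earlier frame; the function name affected_frames and the refs list layout are hypothetical choices made for this illustration, not part of the paper.

```python
def affected_frames(refs, lost):
    """Return the sorted indices of frames that cannot be reconstructed correctly.

    refs[i] is the index of the reference frame used by frame i, or None if
    frame i is intra-coded. Assumes refs[i] < i, so one forward pass suffices.
    lost is the set of frame indices whose packets were dropped.
    """
    damaged = set()
    for i, ref in enumerate(refs):
        # A frame is damaged if it is lost itself or if its reference is damaged.
        if i in lost or (ref is not None and ref in damaged):
            damaged.add(i)
    return sorted(damaged)


# Frame 0 is intra-coded; each later frame predicts from its immediate predecessor.
chain = [None, 0, 1, 2, 3, 4]
print(affected_frames(chain, lost={2}))      # [2, 3, 4, 5]

# Same sequence, but frame 3 predicts from frame 1 instead of frame 2.
alternate = [None, 0, 1, 1, 3, 4]
print(affected_frames(alternate, lost={2}))  # [2]
```

In the first case a single loss damages every subsequent frame, which is why an ARQ-based system must retransmit. The second case hints at how changing the reference of a later frame can stop the propagation without retransmission, at some cost in coding efficiency.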
In this paper, we address the latency issue in real-time and streaming
video applications that demand very low communication delay. We do not
focus on video applications that can tolerate latency of up to a few
seconds, but aim to achieve latency of only hundreds of milliseconds.
For conversational real-time applications, low latency is extremely
desirable since excess delay impairs communication interactivity. Even
for on-demand streaming applications, in which delay requirements used
to be considered more relaxed, voice over IP (VoIP)-like low latency
of hundreds of milliseconds significantly improves interactive
playback features, such as random indexing, fast-forwarding, and
switching channels, and will hence fundamentally change the typical
user experience of today.
To address the various challenges for video communication,
research efforts in recent years have particularly been directed
toward communication efficiency, error-robustness, and low la-
tency [3]–[10]. Many of the recent algorithms use rate-distortion
(R-D) optimization techniques to improve the compression ef-
ficiency [11]–[13], as well as to increase the error-resilient per-
formance over lossy networks [14], [15]. The goal of these opti-
mization algorithms is to minimize the expected distortion due
to both compression and channel losses subject to the bit-rate
constraint.
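The constrained minimization described above is commonly handled in Lagrangian form. The notation below is an assumption introduced for illustration and is not taken from the paper: $\pi$ denotes the set of coding decisions, $E[D(\pi)]$ the expected distortion due to compression and channel losses, $R(\pi)$ the resulting bit rate, $R_{\max}$ the rate budget, and $\lambda \ge 0$ the Lagrange multiplier.

```latex
% Constrained form: minimize expected distortion under a rate budget
\min_{\pi} \; E[D(\pi)] \quad \text{subject to} \quad R(\pi) \le R_{\max}

% Unconstrained Lagrangian form, minimized for a chosen \lambda \ge 0
J(\pi) = E[D(\pi)] + \lambda \, R(\pi)
```

A larger $\lambda$ penalizes rate more heavily and therefore favors decisions that spend fewer bits at the cost of higher expected distortion.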
As mentioned above, ARQ techniques incorporate channel feedback and
employ the retransmission of erroneous data [16]–[20]. ARQ
intrinsically adapts to the varying channel conditions and tends to be
more efficient in transmission. However, for real-time communication
and low-latency streaming, the latency introduced by ARQ is a major
concern.
Examples of different error-resilience schemes that introduce
lower latency than ARQ include intra/inter-mode switching
1051-8215/$20.00 © 2006 IEEE


