Practical verification techniques for wide-area routing
- ISSN: 01464833
- DOI: 10.1145/972374.972390
Abstract
Protocol and system designers use verification techniques to analyze a system's correctness properties. Network operators need verification techniques to ensure the "correct" operation of BGP. BGP's distributed dependencies cause small configuration mistakes or oversights to spur complex errors, which sometimes have devastating effects on global connectivity. These errors are often difficult to debug because they are sometimes only exposed by a specific message arrival pattern or failure scenario. This paper presents an approach to BGP verification that is primarily based on static analysis of router configuration. We argue that: (1) because BGP's configuration affects its fundamental behavior, verification is a program analysis problem, (2) BGP's complex, dynamic interactions are difficult to abstract and impossible to enumerate, which precludes existing verification techniques, (3) because of BGP's flexible, policy-based configuration, some aspects of BGP configuration must be checked against a higher-level specification of intended policy, and (4) although static analysis can catch many configuration errors, simulation and emulation are also necessary to determine the precise scenarios that could expose errors at runtime. Based on these observations, we propose the design of a BGP verification tool, discuss how it could be applied in practice, and describe future research challenges.
Practical verification techniques for wide-area routing
Nick Feamster
MIT Computer Science and Artificial Intelligence Laboratory
200 Technology Square, Cambridge, MA 02139
feamster@lcs.mit.edu
Abstract
Protocol and system designers use verification techniques to an-
alyze a system’s correctness properties. Network operators need
verification techniques to ensure the “correct” operation of BGP.
BGP’s distributed dependencies cause small configuration mistakes
or oversights to spur complex errors, which sometimes have devas-
tating effects on global connectivity. These errors are often difficult
to debug because they are sometimes only exposed by a specific
message arrival pattern or failure scenario.
This paper presents an approach to BGP verification that is pri-
marily based on static analysis of router configuration. We ar-
gue that: (1) because BGP’s configuration affects its fundamen-
tal behavior, verification is a program analysis problem, (2) BGP’s
complex, dynamic interactions are difficult to abstract and impossi-
ble to enumerate, which precludes existing verification techniques,
(3) because of BGP’s flexible, policy-based configuration, some as-
pects of BGP configuration must be checked against a higher-level
specification of intended policy, and (4) although static analysis
can catch many configuration errors, simulation and emulation are
also necessary to determine the precise scenarios that could expose
errors at runtime. Based on these observations, we propose the de-
sign of a BGP verification tool, discuss how it could be applied in
practice, and describe future research challenges.
1. Overview and Motivation
Networks establish global reachability by exchanging informa-
tion using the Border Gateway Protocol v4 (BGP) [18]. Operators
need BGP’s configuration flexibility to achieve tasks in a wide va-
riety of network settings, but this flexibility also implies that mis-
configuration can cause the protocol to operate incorrectly. Cor-
rect routing depends on the consistent configuration of routers both
within an AS and across ASes; thus, minor localized errors can
cause BGP’s behavior to violate properties that are fundamental to
globally correct routing.
Most large ISPs contain several hundred routers, and the latest
Cisco IOS release contains over 7,000 configuration commands,
many of which can be configured with arbitrary parameters. Given
this complexity, mistakes are frequent and expected. While existing
tools facilitate some degree of configuration management [3, 9],
operators must frequently perform configuration tasks manually in
low-level, vendor-specific configuration languages [19].
As a result, misconfiguration is an all-too-frequent occurrence
that often disrupts connectivity [15]. Even minor aspects of local
configuration can influence global Internet connectivity. For exam-
ple, in 1997, a small ISP blackholed a significant portion of Internet
destinations when a misconfigured router announced a large por-
tion of prefixes in the Internet’s routing table as destinations inside
its own network [4]. This incident, now notorious as “AS 7007”,
resulted from a configuration that “redistributed” routes from that
AS’s interior routing protocol into BGP; as a result, much of the
Internet became unreachable. One week in July 2003 alone saw
several cases of local misconfiguration that caused global events,
including: (1) a misconfigured router in a small ISP caused it to be-
come a transit network for two tier-1 ISPs [17], and (2) a small Eu-
ropean ISP was assigning IP addresses to routers from unassigned
address space [2]. The ability to verify BGP configuration before
deployment most likely would have prevented these mishaps.
Today, network operators, protocol designers, and researchers
typically reason about BGP’s behavior by observing the effects of
a particular configuration on a live network. The state of the art
for router configuration typically involves logging configuration
changes and rolling back to a previous version when a problem
arises [3, 9]. This approach is not effective. First, the symptoms of
a configuration error may appear long after the configuration error
was introduced. For example, a network operator may not notice
a misconfigured route filter until a route advertisement that should
have been filtered is advertised. Therefore, operators may find re-
verting the configuration difficult, since the appearance of an error
is often not correlated with the configuration change that caused it.
Second, although reverting to a previous configuration may solve
the problem, it does not explain why the configuration was wrong
in the first place; this means that an operator may eventually rein-
troduce the error, especially if the error is conceptual.
The lack of a formal reasoning framework means that router con-
figuration is time-consuming and ad hoc. Furthermore, today’s
routing configuration is based on the manipulation of low-level
mechanisms (e.g., access control lists, import and export filters,
etc.), which makes routing configuration tedious, error-prone, and
difficult to reason about. Network operators need tools based on
systematic verification techniques to ensure that BGP’s operational
behavior is consistent with the intended behavior (i.e., that the
network is operating “correctly”). We propose a verification tool
that helps operators and protocol designers reason about high-level
properties of routing protocols.
We believe that wide-area routing will always need verification
tools, even as configuration languages improve. Although some
configuration errors are caused by simple typographical errors,
many configuration errors are conceptual. The policy-based nature
of wide-area routing configuration allows operators to accommo-
date a wide variety of conditions and network engineering tasks,
but this flexibility also enables operators to make catastrophic mis-
takes. As long as this possibility exists, operators will need mech-
anisms to check the correctness of routing configuration.
Verification implies a notion of correctness; for traditional pro-
grams and protocols, correctness simply requires adherence to a
specification. Our routing logic—a set of rules that facilitates rea-
soning about whether a routing protocol satisfies a particular prop-
ness for wide-area routing [6]. The logic classifies “correctness” in
terms of five properties: validity (the existence of a route implies
the existence of a corresponding path), visibility (the existence of
a path implies the existence of a corresponding route), safety (the
existence of a stable, unique path assignment), determinism (best
route selection is independent of message ordering and the presence
of sub-optimal routes), and information-flow control (the protocol
conforms to a specified information flow policy; that is, it does not
“leak” information). Although many aspects of BGP configura-
tion are clearly correct or incorrect, correctness is often determined
by how well the actual configuration conforms to the intended be-
havior. For example, an operator might specify that routes heard
from one AS should not be readvertised to another; a verifier should
check that access control lists, import and export policies, and com-
munities across multiple routers correctly implement this policy.
BGP’s flexible configuration makes it much more like a dis-
tributed program than a traditional protocol. BGP’s configuration
determines which (and whether) routing messages are distributed;
therefore, BGP verification must analyze the configuration.
We believe that a combination of static analysis and offline simu-
lation or emulation, which we call a sandbox, is the right approach
to BGP verification. Static analysis is useful for verifying systems
that are inherently difficult to model, which makes it appealing
for verifying BGP. Applying static analysis to BGP verification in-
volves examining the static configuration of all routers within an
AS and applying a set of rules to check for inconsistencies. This
approach poses several challenges. We must identify the configura-
tion aspects that can potentially affect correctness and design meth-
ods for expressing and testing correctness rules. In cases where the
notion of “correctness” is not hard and fast, static analysis should
at least verify that BGP’s configuration conforms to the intended
behavior. These challenges are exacerbated by the fact that BGP
configuration is distributed across many routers. Furthermore, be-
cause BGP’s behavior depends on the route advertisements it re-
ceives as well as failure scenarios, static analysis alone often cannot
determine whether a BGP configuration will cause incorrect behav-
ior. We believe that a sandbox can use inputs suggested by static
analysis to help operators identify problematic scenarios and ask
“what-if” questions about BGP configuration.
In this paper, we propose the design of a BGP verifier that a
network operator could use to test possible configurations offline,
prior to deploying them on a live network. We primarily focus on
unintentional configuration errors that result from configuration in-
consistencies within a single AS because, in practice, network oper-
ators typically only have access to their own configuration. Verifi-
cation could use external information (e.g., from a database [10] or
policy registry [11]) to find data inconsistencies or inter-AS policy
conflicts, but we focus on relatively language-independent config-
uration aspects that can be verified using configuration from one
AS with no additional information. Additionally, we discuss how
aspects of our work may provide insights that help improve future
configuration languages.
This paper presents the case for new, domain-specific BGP ver-
ification techniques (Section 2), identifies and classifies how these
techniques can check configuration aspects that affect BGP’s cor-
rectness (Section 3), and proposes an approach to verifying BGP
configuration using static analysis and sandboxing (Section 4).
2. Practical BGP Verification Techniques
In this section, we explain why we pursue static analysis, rather
than model checking, to identify errors. Since BGP’s behavior de-
pends on dynamic conditions (e.g., link failures, advertisements
from neighboring ASes, etc.), static analysis alone will not catch
all configuration errors. We argue that static analysis should be
combined with a BGP sandbox—simulation or emulation that de-
scribes BGP’s behavior under various dynamic conditions.
2.1 The case for static analysis
A BGP verifier should efficiently identify cases where BGP vi-
olates some correctness property. Although BGP operates much
like a distributed program, most program verification techniques
are not directly applicable. Theorem proving facilitates reasoning
about high-level system properties but is inefficient and often man-
ual; verification must be automatic to provide timely feedback to
network operators. Model checking has been popular for exam-
ining the correctness of traditional communications protocols with
well-defined state machines (e.g., TCP), so it might seem like a
reasonable approach. Unfortunately, applying model checking to
wide-area routing presents two serious difficulties. First, it requires
state space exhaustion, which is tedious—all possible message or-
derings, prefix combinations, AS paths, and so on—and BGP’s
complexity makes it difficult to collapse these states. More im-
portantly, the policy-based nature of wide-area routing makes it
impossible to ever know every possible state.
Figure 1 helps explain why model checking BGP is hard to get
right. In this example, AS 0 wants to load balance traffic between
its two links to AS 1. The router s sends traffic via the default exit
route unless it hears a route advertisement from the shaded router.
The shaded router only advertises routes to s that match the AS
path regular expression “1 [0-9]+ 3”; furthermore, it is misconfig-
ured and readvertising invalid routes to s. Let’s assume that it is
incorrectly setting the next-hop when readvertising routes. In this
situation, the shaded router only advertises invalid routes if AS 1
switches over to a backup AS. Since AS path “1 2 3” is a backup
path for AS 1, it would not be visible to AS 0 under normal cir-
cumstances. Thus, model checking AS 0’s configuration may fail
to explore the very states that expose bugs. Model checking might
be able to able to verify some aspects of BGP correctness under cer-
tain simplifying assumptions, but subtleties make model checking
BGP almost as complex as BGP itself.
Static analysis, which tests configurations against constraints
that correspond closely to the configuration itself, is a simpler ap-
proach to BGP verification. By analyzing only the configurations,
static analysis could identify that the shaded router was readvertis-
ing routes with unreachable next-hop values. In Section 3.1, we
discuss the configuration details that cause the incorrect behavior
in Figure 1 and describe how static analysis can catch this error.
2.2 The case for a BGP sandbox
Static analysis can efficiently identify configurations that may
cause BGP’s correctness to be violated, but BGP’s behavior also
depends on dynamics, which may expose emergent incorrect be-
havior from a seemingly correct configuration. For example, a
network operator may want to verify that the network operates
“correctly” during link failures (or more serious disasters), fluctu-
ations in traffic volumes, planned maintenance, or instances where
a neighboring AS suddenly begins advertising new routes. A BGP
sandbox (i.e., a simulation or emulation tool that models BGP’s be-
havior, given its configuration [8]) explores these interactions and
can help operators perform higher-level tasks (e.g., traffic engineer-
ing) correctly.
Obviously, verifying BGP configuration with a sandbox alone is
not tractable because the space of possible inputs is too large. Static
analysis can prune the input space to include only inputs that are
most likely to expose incorrect behavior. Additionally, to simplify
Sign up today - FREE
Mendeley saves you time finding and organizing research. Learn more
- All your research in one place
- Add and import papers easily
- Access it anywhere, anytime



