Analysis of a continuous-time model of structural balance

by Seth A Marvel, Jon M Kleinberg, Robert D Kleinberg, Steven H Strogatz
Writing (2010)

Abstract

It is not uncommon for certain social networks to divide into two opposing camps in response to stress. This happens, for example, in networks of political parties during winner-takes-all elections, in networks of companies competing to establish technical standards, and in networks of nations faced with mounting threats of war. A simple model for these two-sided separations is the dynamical system $dX/dt = X^2$, where X is a matrix of the friendliness or unfriendliness between pairs of nodes in the network. Previous simulations suggested that only two types of behavior were possible for this system: either all relationships become friendly, or two hostile factions emerge. Here we prove that for generic initial conditions, these are indeed the only possible outcomes. Our analysis yields a closed-form expression for faction membership as a function of the initial conditions, and implies that the initial amount of friendliness in large social networks (started from random initial conditions) determines whether they will end up in intractable conflict or global harmony.

arXiv:1010.1814v1 [nlin.AO] 9 Oct 2010
Seth A. Marvel†, Jon M. Kleinberg‡,∗ Robert D. Kleinberg‡, and Steven H. Strogatz†
†Center for Applied Mathematics, Cornell University, Ithaca, New York 14853
‡Department of Computer Science, Cornell University, Ithaca, New York 14853
INTRODUCTION
The mathematical model that we want to study is best
understood as an outgrowth of a theory from social psy-
chology known as structural balance [1]. So let’s begin
with a brief explanation of what this theory says.
Consider three individuals: Anna, Bill and Carl, and
suppose that Bill and Carl are friends with Anna, but are
unfriendly with each other. If the sentiment in the rela-
tionships is strong enough, Bill may try to strengthen his
friendship with Anna by encouraging her to turn against
Carl, and Carl might likewise try to convince Anna to
terminate her friendship with Bill. Anna, for her own
part, may try to bring Bill and Carl together so they can
reconcile and become friends. In abstract terms, rela-
tionship triangles containing exactly two friendships are
prone to transition to triangles with either one or three
friendships.
Alternately, suppose that Anna, Bill and Carl all view
each other as rivals. In many such situations, there are
incentives for the two people in the weakest rivalry to co-
operate and form a working friendship or alliance against
the third. In these cases, a single friendship may be
prone to appear in a relationship triangle that initially
has none.
These two thought experiments suggest a notion of sta-
bility, or balance, that can be traced back to the work of
Heider [2]. Heider’s theory was expanded into a graph-
theoretic framework by Cartwright and Harary [3], who
considered graphs on n nodes (representing people, coun-
tries or corporations) with edges signed either positive
(+) to denote friendship or negative (−) to denote rivalry.
If a social network feels the proper social stresses (those
felt by Anna, Bill and Carl in the examples above), then
Cartwright and Harary’s theory predicts that in steady
state the triangles in the graph should contain an odd
number of positive edges—in other words, three positive
edges or one positive edge and two negative edges. We
refer to such triangles as balanced, and triangles with an
even number of positive edges as unbalanced. Finally, we
call a graph complete if it contains edges between all pairs
of nodes, and we say that a complete graph with signs
on its edges is balanced if all its triangles are balanced.
(All graphs in our discussion will be complete.)
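The triangle rule above is easy to check mechanically. The following is a minimal sketch, assuming the signed complete graph is stored as a symmetric NumPy array of +1 and -1 entries; the function name `is_balanced` and the two example matrices are ours, introduced only for illustration.

```python
from itertools import combinations

import numpy as np


def is_balanced(signs: np.ndarray) -> bool:
    """Return True if every triangle has an odd number of positive edges.

    `signs` is a symmetric matrix with entries +1 / -1 off the diagonal;
    the diagonal is ignored.
    """
    n = signs.shape[0]
    for i, j, k in combinations(range(n), 3):
        # The product of the three edge signs is +1 exactly when the
        # triangle has three positive edges or one positive edge.
        if signs[i, j] * signs[j, k] * signs[i, k] < 0:
            return False
    return True


# Anna (0), Bill (1), Carl (2): Bill and Carl are friends with Anna
# but unfriendly with each other, so the triangle is unbalanced.
anna_bill_carl = np.array([[0, 1, 1],
                           [1, 0, -1],
                           [1, -1, 0]])

# Bill sides with Anna against Carl: one positive edge, balanced.
resolved = np.array([[0, 1, -1],
                     [1, 0, -1],
                     [-1, -1, 0]])

print(is_balanced(anna_bill_carl), is_balanced(resolved))
```

The check enumerates all triangles, so it costs on the order of n^3 operations; that is fine for small illustrative graphs.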
As it turns out, these local notions of balance theory
are closely related to the global structure of two opposing
factions. In particular, suppose that the nodes of a com-
plete graph are partitioned into two factions such that
all edges inside each faction are positive and all edges
between nodes in opposite factions are negative. (One
of these factions may be empty, in which case the other
faction includes all the nodes in the graph, and conse-
quently all edges of the network are positive.) Note that
this network must be balanced, since each triangle either
has all three members in the same faction (yielding three
positive edges) or has two members in one faction and
the third member in the other faction (yielding one pos-
itive edge and two negative ones). In fact, a stronger
and less obvious statement is true: any balanced graph
can be partitioned into two factions in this way, with one
faction possibly empty [3]. As a result, when we speak of
balanced graphs, we can equivalently speak of networks
with this type of two-faction structure.
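The equivalence between balance and a two-faction structure suggests a simple way to recover the factions from a sign matrix: place every friend of node 0 in node 0's faction and everyone else in the other, then verify that all remaining edges agree. The sketch below assumes the same +1/-1 matrix representation as before; `two_factions` is a hypothetical helper name, not something defined in Ref. [3].

```python
import numpy as np


def two_factions(signs: np.ndarray):
    """Split a balanced signed complete graph into its two factions.

    Returns (faction_of_node_0, other_faction) as lists of node indices,
    or None if the sign pattern is not balanced.
    """
    n = signs.shape[0]
    # Friends of node 0 share its faction; node 0 itself is added by hand
    # because the diagonal carries no sign.
    side = np.where(signs[0] > 0, 1, -1)
    side[0] = 1
    # In a two-faction graph the sign of edge (i, j) is side[i] * side[j].
    expected = np.outer(side, side)
    off_diagonal = ~np.eye(n, dtype=bool)
    if not np.array_equal(np.sign(signs)[off_diagonal], expected[off_diagonal]):
        return None
    return (np.flatnonzero(side == 1).tolist(),
            np.flatnonzero(side == -1).tolist())


# Build a balanced graph from a known assignment of nodes to camps.
camp = np.array([1, 1, -1, 1, -1])
balanced_signs = np.outer(camp, camp)
np.fill_diagonal(balanced_signs, 0)

print(two_factions(balanced_signs))
```

If the second list comes back empty, the graph is the all-positive case in which one faction contains every node.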
MODEL
Structural balance is a static theory—it posits what
a “stable” signing of a social network should look like.
However its underlying motivation is dynamic, based on
how unbalanced triangles ought to resolve to balanced
ones. This situation has led naturally to a search for
a full dynamic theory of structural balance. Yet find-
ing systems that reliably guide networks to balance has
proved a challenge in itself.
A first exploration of this issue was conducted by Antal
et al. [4], who considered a family of discrete-time models.
In one of the main models of this family, an edge of the
graph is examined in each time step, and its sign is flipped
if this produces more balanced triangles than unbalanced
ones. While a balanced graph is a stable point for these
discrete dynamics, it turns out that many unbalanced
graphs called jammed states are as well [4, 5].
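To make the discrete-time rule concrete, here is a minimal sketch of one edge-flip sweep as described in the sentence above. It is our reading of that one-sentence description and is not claimed to reproduce the exact update rule of Ref. [4]; the names `balanced_triangles_on_edge` and `sweep` are illustrative.

```python
import numpy as np


def balanced_triangles_on_edge(signs, i, j):
    """Count the balanced triangles that contain edge (i, j)."""
    n = signs.shape[0]
    return sum(
        1
        for k in range(n)
        if k != i and k != j and signs[i, j] * signs[i, k] * signs[j, k] > 0
    )


def sweep(signs, rng):
    """Visit every edge once in random order, flipping where it helps.

    Modifies `signs` in place and returns the number of flips made.
    """
    n = signs.shape[0]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    flips = 0
    for index in rng.permutation(len(edges)):
        i, j = edges[index]
        balanced_now = balanced_triangles_on_edge(signs, i, j)
        # Flipping an edge toggles the balance of each of the n - 2
        # triangles that contain it.
        balanced_after_flip = (n - 2) - balanced_now
        if balanced_after_flip > balanced_now:
            signs[i, j] = signs[j, i] = -signs[i, j]
            flips += 1
    return flips


# Three mutual rivals: one flip creates an alliance of two against one.
rng = np.random.default_rng(0)
rivals = np.array([[0, -1, -1],
                   [-1, 0, -1],
                   [-1, -1, 0]])
flips_made = sweep(rivals, rng)
print(flips_made, rivals[0, 1] * rivals[0, 2] * rivals[1, 2])
```

A balanced graph is a fixed point of this sweep, since no flip can increase the number of balanced triangles on any edge; jammed states are the unbalanced graphs that share this property.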
Thus, the natural problem became to identify and rig-
orously analyze a simple system that could progress to
balanced graphs from generic initial configurations. A
novel approach to this problem was taken by Kułakowski,
Gawroński, and Gronek [6], who proposed a continuous-time
model for structural balance. They represented the
state of a completely connected social network using a
real symmetric n×n matrix X whose entry xij represents
the strength of the friendliness or unfriendliness between
nodes i and j (a positive value denotes a friendly relation-
ship and a negative value an unfriendly one). Note that
for a given X , there is a signed complete graph with edge
signs equal to the signs of the corresponding elements xij
in X . We will call X balanced if this associated signed
complete graph is balanced.
Kułakowski et al. considered variations on the following
basic differential equation, which they proposed as
a dynamical system governing the evolution of the
relationships over time:
\[ \frac{dX}{dt} = X^2. \tag{1} \]
Remarkably, simulations showed that for essentially any
initial X(0), the system reached a balanced pattern of
edge signs in finite time.
Writing Eq. 1 directly in terms of the entries xij gives
a sense for why this differential equation should promote
balance:
\[ \frac{dx_{ij}}{dt} = \sum_{k} x_{ik} x_{kj}. \tag{2} \]
Notice that xij is being pushed in a positive or negative
direction based on the relationships that i and j have
with k: if xik and xkj have the same sign, their product
guides the value of xij in the positive direction, while if
xik and xkj have opposite signs, their product guides the
value of xij in the negative direction. In each case, this
is the direction required to balance the triangle {i, j, k}.
Note also that Eq. 2 applies for the case that i = j. While
this case is harder to interpret, the monotonic increase
of xii implied by Eq. 2 might be viewed in psychological
terms as an increase of self-approval or self-confidence as
i becomes more resolute in its opinions about others in
the network.
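The dynamics can be explored numerically with any standard integrator. The sketch below uses a hand-written fourth-order Runge-Kutta step, which is our choice for illustration and not necessarily the scheme used in the simulations cited above. The starting matrix, a weighted version of the Anna, Bill and Carl example, is likewise an assumption.

```python
import numpy as np


def rhs(X):
    """Right-hand side of Eq. 1 in matrix form."""
    return X @ X


def rhs_entrywise(X):
    """Right-hand side of Eq. 2, written as an explicit sum over k."""
    n = X.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = sum(X[i, k] * X[k, j] for k in range(n))
    return out


def integrate(X0, t_end, steps=1000):
    """Integrate dX/dt = X^2 from 0 to t_end with classical RK4."""
    X = np.array(X0, dtype=float)
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(X)
        k2 = rhs(X + 0.5 * h * k1)
        k3 = rhs(X + 0.5 * h * k2)
        k4 = rhs(X + h * k3)
        X = X + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return X


# Anna (0) is friendly with Bill (1) and Carl (2), who mildly dislike
# each other. The weights are illustrative.
X0 = np.array([[0.0, 1.0, 1.0],
               [1.0, 0.0, -0.5],
               [1.0, -0.5, 0.0]])

# t_end is chosen below the blow-up time, which is about 0.84 here.
X_late = integrate(X0, t_end=0.8)
print(np.sign(X_late))
```

In this example the two strong friendships outweigh the mild rivalry, and the Bill-Carl entry turns positive before the blow-up. The step count must grow as `t_end` approaches the blow-up time, because the entries diverge there.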
For a network with just three nodes, it can be eas-
ily proved that a variant of these dynamics generically
balances the single triangle in this network; such a three-node
analysis has been given by Kułakowski et al. [6],
and we describe a short proof in the Supporting Information.
What is much less clear, however, is how the system
should behave with a larger number of nodes, when the
effects governing any one edge {i, j} are summed over all
nodes k to produce a single aggregate effect on xij .
It has therefore been an open problem to prove that
Eq. 2 or any of the related systems studied by Kułakowski
et al. will bring a generic initial matrix X(0) to a balanced
state. It has also been an open problem to characterize
the structure of the balanced state that arises as a
function of the starting state X(0).
RESULTS
In this paper, we resolve these two open problems.
We first show that for a random initial matrix (drawn
from any absolutely continuous distribution), the system
reaches a balanced matrix in finite time with a probabil-
ity converging to 1 in the number of nodes n. In addition,
we provide a closed-form expression for this balanced ma-
trix in terms of the initial one; essentially, we discover
that the system of differential equations serves to “col-
lapse” the starting matrix to a nearby rank-one matrix.
We also characterize additional aspects of the process,
giving for example a description of an “exceptional” set
of matrices of probability measure converging to 0 in n
for which the dynamics are not necessarily guaranteed to
produce a balanced state.
We then analyze the solutions of the system for classes
of random matrices in the large-n limit—in particular,
we consider the case in which each unique matrix entry
is drawn independently from a distribution with bounded
support that is symmetric about a number µ (the mean
value of the initial friendliness among the nodes). In this
case, we find a transition in the solution as µ varies: when
µ > 0, the system evolves to an all-positive sign pattern,
whereas when µ ≤ 0, the system evolves to a state in
which the network is divided evenly into two all-positive
cliques connected entirely by negative edges. We end by
discussing some implications of the model and the associ-
ated transition between harmony and conflict, including
an evaluation of the model on empirical data and some
potential connections to research on reconciliation in so-
cial psychology.
Behavior of Model: Evolution to a Balanced State
Suppose we randomly select the xij(0)’s from a contin-
uous distribution on the real line. Then the xij(t)’s found
by numerical integration generally sort themselves in fi-
nite time into the sign pattern of two feuding factions. To
reformulate this observation as a precise statement and
explain why the behavior holds so pervasively, we now
solve Eq. 1 explicitly.
Solution to model. The initial matrix X(0) is real
and symmetric by assumption, so we can write it as
$Q D(0) Q^T$, where D(0) is the diagonal matrix with the
eigenvalues of X(0), denoted $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, as
diagonal entries ordered from largest to smallest, and Q
is the orthogonal matrix with the corresponding eigenvectors
of X(0), denoted $\omega_1, \omega_2, \ldots, \omega_n$, as columns. The
superscript T signifies transposition.
The differential equation Eq. 1 is a special case of a
general family of equations known as matrix Riccati equa-
tions [7]. The analysis of the full family is complicated
and not fully resolved, but we now show that the special
case of concern to us, Eq. 1, has an explicit solution with
a form that exposes its connections to structural balance.
We proceed as follows. First, we observe that by
separation of variables, the solution of the single-variable
differential equation $\dot{x} = x^2$ (overdot representing
differentiation with respect to time) with initial condition $x(0) = \lambda_k$ is
\[ \ell_k(t) = \frac{\lambda_k}{1 - \lambda_k t}. \tag{3} \]
Therefore the diagonal matrix $D(t) = \mathrm{diag}(\ell_1(t), \ell_2(t), \ldots, \ell_n(t))$
is the solution of Eq. 1 for the initial
condition $X(0) = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.
Moreover $Y(t) = Q D(t) Q^T$ is also a solution of Eq. 1
since $\dot{Y} = Q \dot{D} Q^T = Q D^2 Q^T = (Q D Q^T)^2 = Y^2$. But
Y(t) has the same initial condition as X(t) in our original
problem: $Y(0) = Q D(0) Q^T = X(0)$. So by uniqueness,
$Y(t) = Q D(t) Q^T$ must be the solution we seek.
Our solution X(t) can also be written in a different way
to mimic the solution of the one-dimensional equation
$\dot{x} = x^2$. Since $x_{ij}(t) = \sum_{k=1}^{n} q_{ik} \ell_k(t) q_{jk}$, where $q_{ij}$ is
the (i, j)th entry of Q, we can expand the denominators
of the $\ell_k(t)$ functions in powers of t to rewrite X(t) as
$X(0) + X(0)^2 t + X(0)^3 t^2 + \cdots$, or more concisely,
\[ X(t) = X(0)\,[I - tX(0)]^{-1}. \tag{4} \]
(Note that the matrices X(0) and $[I - tX(0)]^{-1}$ commute.)
The power series converges only when t is smaller than
$1/|\lambda_k|$ for every k, but the closed form in Eq. 4 remains
valid as long as $I - tX(0)$ is invertible, that is, for all
$0 \le t < 1/\lambda_1$ (assuming $\lambda_1 > 0$).
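As a sanity check on the two forms of the solution, the following sketch evaluates both the eigendecomposition form and the resolvent form of Eq. 4 for a small random symmetric matrix and confirms that they agree. The function names, the random seed and the matrix size are illustrative assumptions.

```python
import numpy as np


def solve_eig(X0, t):
    """X(t) = Q D(t) Q^T with D(t) built from Eq. 3."""
    lam, Q = np.linalg.eigh(X0)
    ell = lam / (1.0 - lam * t)
    # Multiplying Q by ell scales its columns, giving Q diag(ell).
    return (Q * ell) @ Q.T


def solve_resolvent(X0, t):
    """X(t) = X(0) [I - t X(0)]^{-1}, Eq. 4."""
    n = X0.shape[0]
    # X(0) commutes with the inverse, so we may solve the linear system
    # (I - t X(0)) X(t) = X(0) instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - t * X0, X0)


rng = np.random.default_rng(3)
G = rng.normal(size=(5, 5))
X0 = (G + G.T) / 2.0

# Stay well inside the interval on which every ell_k is finite.
t = 0.5 / np.abs(np.linalg.eigvalsh(X0)).max()

X_eig = solve_eig(X0, t)
X_res = solve_resolvent(X0, t)
print(np.allclose(X_eig, X_res))
```

Solving the linear system rather than inverting the matrix is a numerical convenience; either route evaluates the same expression.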
Finally we note that the above method of solving
Eq. 1 contains a reduction of the number of dynamical
variables of the system from $\binom{n+1}{2}$ to n. The $\binom{n}{2}$
constants of motion generated by this reduction are
just the off-diagonal elements of $Q^T X(t) Q = D(t)$, or
$\sum_{k=1}^{n} \sum_{\ell=1}^{n} q_{ki} x_{k\ell}(t) q_{\ell j} = 0$ for all $1 \le i < j \le n$.
Furthermore, the procedure for reducing X(t) can be easily
generalized to any system of the form $\dot{X} = f(X)$ where
f is a polynomial of X.
Behavior of solution. Let’s now examine the behav-
ior of our solution X(t) to see why in the typical case it
splits into two factions in finite time. It turns out that
this is the guaranteed outcome if the following three con-
ditions hold (and as we will see below, they hold with
probability converging to 1 as n goes to infinity):
1. λ1 > 0,
2. λ1 ≠ λ2 (and hence λ1 > λ2), and
3. all components of ω1 are nonzero.
To see why these conditions imply a split into two factions,
observe from Eq. 3 that each $\ell_k(t)$ diverges to infinity
at $t = 1/\lambda_k$. Since $x_{ij}(t) = \sum_{k=1}^{n} q_{ik} \ell_k(t) q_{jk}$, all
$x_{ij}$'s diverge to infinity when the $\ell_k$ with the smallest
positive $1/\lambda_k$ does. Under the first and second conditions,
this $\ell_k$ is $\ell_1$, so the blow-up time t∗ of Eq. 1 must
be $1/\lambda_1$. To show that the nodes are partitioned into two
factions as t approaches t∗, let $\bar{X}(t) = X(t)/\|X(t)\|$
on the half-open interval [0, t∗), where $\|X(t)\|$ denotes
the Frobenius norm of X(t). The matrix $\bar{X}(t)$ has the sign
pattern of X(t), and as t approaches t∗ it converges to
the rank-one matrix
\[ \bar{X}^* = Q\,\mathrm{diag}(1, 0, 0, \ldots, 0)\,Q^T = \omega_1 \omega_1^T. \tag{5} \]
Now let ω1k denote the value of the kth coordinate of
ω1, and let S = {k : ω1k > 0} and T = {k : ω1k < 0}.
Then S and T partition the node indices 1, 2, . . . , n by our
condition that ω1 has no zero components. From Eq. 5,
this partition must correspond to two cliques of friends
joined by a complete bipartite graph of unfriendly ties.
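The closed-form prediction can be turned into a few lines of code: compute the top eigenpair of X(0), read the blow-up time off the eigenvalue and the factions off the signs of the eigenvector. This is a minimal sketch under the three conditions above; the helper name `predict_factions` and the small example matrix are illustrative assumptions.

```python
import numpy as np


def predict_factions(X0):
    """Return (blow-up time, faction S, faction T) for dX/dt = X^2."""
    lam, Q = np.linalg.eigh(X0)          # eigenvalues in ascending order
    lambda_1, omega_1 = lam[-1], Q[:, -1]
    if lambda_1 <= 0:
        raise ValueError("lambda_1 <= 0: no finite-time blow-up")
    # An eigenvector is defined only up to sign; fix it so that node 0
    # always lands in S.
    if omega_1[0] < 0:
        omega_1 = -omega_1
    S = np.flatnonzero(omega_1 > 0)
    T = np.flatnonzero(omega_1 < 0)
    return 1.0 / lambda_1, S, T


# A nearly rank-one starting matrix with a small diagonal perturbation.
v = np.array([1.0, 2.0, -1.0, -3.0])
X0 = np.outer(v, v) / 10.0 + np.diag([0.01, 0.02, 0.03, 0.04])

t_star, S, T = predict_factions(X0)

# Compare with the sign pattern of the exact solution just before t*.
side = np.ones(4)
side[T] = -1.0
X_near = np.linalg.solve(np.eye(4) - 0.999 * t_star * X0, X0)
same_pattern = np.array_equal(np.sign(X_near), np.outer(side, side))
print(S, T, same_pattern)
```

Only the top eigenpair is needed, so for large networks an iterative eigensolver could replace the full decomposition used here.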
The three conditions. We now return to the three
conditions above. We first show that the second and third
hold with probability 1. We then show that the first con-
dition holds with probability converging to 1 as n goes to
infinity. Lastly, we analyze the behavior of the system in
the unlikely event that the first condition does not hold.
The fact that the conjunction of all three conditions holds
with probability converging to 1 as n grows large justifies
our earlier claim that the behavior described above holds
for almost all choices of initial conditions.
First we show why the second and third conditions
hold with probability 1 so long as the (joint) distribution
from which X(0) is drawn is absolutely continuous with
respect to Lebesgue measure—in other words, assigns
probability zero to any set of matrices whose Lebesgue
measure is zero. Our arguments below make use of the
following two basic facts:
i. the set of zeros of a nontrivial multivariate polyno-
mial has Lebesgue measure zero, and
ii. the existence of a common root of two univariate
polynomials P and Q is equivalent to the vanishing of
a multivariate polynomial in the coefficients of P and
Q (specifically, it is equivalent to the vanishing of the
determinant of the Sylvester matrix of P and Q, also
called the resultant of P and Q).
To show that λ1 ≠ λ2 with probability 1, let P denote
the characteristic polynomial of X(0), and let Q denote
the derivative of P . Then X(0) has a repeated eigenvalue
if and only if P has a repeated root, which it does if and
only if P and Q have a common root. This condition
is equivalent to the vanishing of the resultant of P and
Q, which is a multivariate polynomial in the entries of
X(0). The polynomial cannot be zero everywhere, be-
cause there is at least one symmetric matrix that does
not have a repeated eigenvalue. So the set of matrices
having a repeated eigenvalue has Lebesgue measure zero.
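As a worked illustration of this argument (an example we add here, not one taken from the references), consider the smallest nontrivial case n = 2 with X(0) having diagonal entries a and d and off-diagonal entry b. The resultant can then be computed by hand from the Sylvester matrix:

```latex
\[
P(\lambda) = \lambda^2 - (a + d)\lambda + (ad - b^2),
\qquad
Q(\lambda) = P'(\lambda) = 2\lambda - (a + d),
\]
\[
\operatorname{Res}(P, Q)
= \det\begin{pmatrix}
1 & -(a + d) & ad - b^2 \\
2 & -(a + d) & 0 \\
0 & 2 & -(a + d)
\end{pmatrix}
= -(a + d)^2 + 4(ad - b^2)
= -\left[(a - d)^2 + 4b^2\right].
\]
```

The resultant is a nontrivial polynomial in (a, b, d) that vanishes only on the line a = d, b = 0, which has Lebesgue measure zero in the three-dimensional space of symmetric 2 × 2 matrices.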
Similarly, to show that all components of ω1 are
nonzero, let P denote the characteristic polynomial of
X(0) and Pi the characteristic polynomial of the (n − 1) × (n − 1)
submatrix Xi(0) obtained by deleting the ith
row and ith column of X(0). It is easy to check that if
any eigenvector of X(0) has a zero in its ith component,
then the vector obtained by deleting that component is
an eigenvector of Xi(0) with the same eigenvalue. Con-
sequently, P and Pi must have a common root, implying
that the resultant of P and Pi vanishes. This resultant
is once again a multivariate polynomial in the entries
of X(0), and once again it must be nonzero somewhere
because there is at least one symmetric matrix whose
eigenvectors all have nonzero entries. Hence, the set of
matrices having an eigenvector with zero in its ith com-
ponent has Lebesgue measure zero.
Finally, to determine the likelihood of the first condi-
tion, we first must say a bit more about the way that
X(0) is selected. Suppose that the off-diagonal xij(0)’s
are drawn randomly from a common distribution F and
the on-diagonal xii(0)’s are drawn randomly from a com-
mon distribution G. All selections are independent for
i ≤ j. (For i > j, we let xij(0) = xji(0), so that X(0)
is symmetric.) For this construction of X(0), Arnold [9]
has shown that with the remarkably weak additional as-
sumption that F has a finite second moment, Wigner’s
semicircle law holds in probability as n grows to infinity.
This in turn implies that λ1 > 0 in probability in the
same limit.
Moreover, suppose we are in the low-probability case
that λ1 ≤ 0. In this case, the analysis above shows that
all the functions ℓi(t) converge to 0 as t → ∞. Thus,
$\lim_{t\to\infty} D(t) = 0$, and since $X(t) = Q D(t) Q^T$, we also
have $\lim_{t\to\infty} X(t) = 0$.
Although the entries of X(t) converge to zero when
λ1 ≤ 0, one might still want to know if the sign pattern
of X(t) is eventually constant (i.e., remains unchanged
for all t above some threshold value) and, if so, what
determines this sign pattern. It is possible to answer
this question, again assuming the second and third conditions.
By expanding the function $\ell_i(t) = \lambda_i/(1 - \lambda_i t)$
in powers of u = 1/t, we obtain the asymptotic series
\[ \ell_i(t) = -u - u^2 \lambda_i^{-1} - O(u^3), \tag{6} \]
which implies
\[ X(t) = Q D(t) Q^T = -uI - u^2 X(0)^{-1} - O(u^3). \tag{7} \]
In the limit of small u, the leading-order term of the
diagonal entries of X(t) is the linear term, which has
negative sign. For the off-diagonal entries of X(t), the
leading-order term as u tends to zero is the quadratic
term, whose sign matches the sign of the corresponding
off-diagonal entry of the matrix $-X(0)^{-1}$.
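The late-time prediction can be checked on a small example in which every eigenvalue is negative. The sketch below uses a tridiagonal matrix chosen by us for illustration; its eigenvalues are −2 and −2 ± √2, so λ1 < 0 and the solution exists for all t ≥ 0.

```python
import numpy as np

# A negative-definite starting matrix (illustrative choice). Nodes 0 and
# 2 start out neutral toward each other, while neighbors are unfriendly.
X0 = -np.array([[2.0, 1.0, 0.0],
                [1.0, 2.0, 1.0],
                [0.0, 1.0, 2.0]])
assert np.linalg.eigvalsh(X0).max() < 0

# Exact solution X(t) = X(0) [I - t X(0)]^{-1} at a late time.
t = 100.0
Xt = np.linalg.solve(np.eye(3) - t * X0, X0)

# Eq. 7 predicts that off-diagonal signs match those of -X(0)^{-1}.
predicted = -np.linalg.inv(X0)

off_diagonal = ~np.eye(3, dtype=bool)
signs_agree = np.array_equal(np.sign(Xt[off_diagonal]),
                             np.sign(predicted[off_diagonal]))
diagonal_negative = bool(np.all(np.diag(Xt) < 0))
print(signs_agree, diagonal_negative)
```

In this example the initially neutral pair (0, 2) ends up with a positive sign, as the corresponding entry of $-X(0)^{-1}$ dictates, even though every entry of X(t) decays to zero.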
Behavior of Model: From Factions to Unification
The analysis in the previous section tells us how to find
both the blow-up time t∗ and final sign configuration of
a network if we know its initial state X(0). However we
might also want to know whether we can characterize the
behavior of X(t) in the large-n limit in terms of statistical
parameters of X(0). This could, for example, help us
forecast the behavior of large populations when collecting
complete relationship-level data is not feasible.
In this section, we show that there is a transition from
final states consisting of two factions to final states con-
sisting of all positive relations as the “mean friendliness”
of X(0) (the mean of the distributions used to gener-
ate the off-diagonal entries of X(0)) is increased from
negative to positive values. This is consistent with the
numerical simulations shown in Fig. 1.
Before discussing the details though, we should de-
scribe how X(0) is selected in this section. We start by
adopting the procedure of Füredi and Komlós [8]: the elements
xij(0) are drawn independently from distributions
Fij with zero mass outside of [−K, K]. The off-diagonal
Fij’s have a common expectation µ and finite variance
$\sigma^2$, while the on-diagonal Fii’s have a common expectation
ν and variance $\tau^2$. In addition, we require that
each off-diagonal distribution Fij be symmetric about µ.
Now let’s consider the three cases of positive, zero and
negative µ.
Case 1: µ > 0. The results of Füredi and Komlós [8]
show that when µ > 0, the deviation of ω1 from
(1, 1, . . . , 1)/√n vanishes in probability in the large-n
limit. Hence the final state of the system consists of
one large clique of friends containing all but at most a
vanishing fraction of the nodes. Moreover, by assuming
a bound on σ we can strengthen this statement further:
if σ < µ/2, then the findings of Füredi and Komlós imply
that the final state consists of a single clique of friends,
with no negative edges. These observations are consistent
with the representative numerical trial shown in Fig. 1A.
Moreover, Füredi and Komlós show that λ1 grows like
µn + O(1), and hence the blow-up time scales like 1/(µn).
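These statements are straightforward to probe numerically. The sketch below draws one random symmetric matrix with µ > 0 and σ < µ/2 and inspects its top eigenpair. The uniform entry distribution, the choice of drawing the diagonal from the same distribution (so that ν = µ), the seed and the size n are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, mu, half_width = 200, 1.0, 0.5
sigma = half_width / np.sqrt(3.0)        # std of a uniform distribution

# Entries for i <= j are uniform on [mu - half_width, mu + half_width];
# the lower triangle mirrors the upper one so that X0 is symmetric.
A = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
X0 = np.triu(A) + np.triu(A, 1).T

eigenvalues, eigenvectors = np.linalg.eigh(X0)   # ascending order
lambda_1 = eigenvalues[-1]
omega_1 = eigenvectors[:, -1]
omega_1 = omega_1 * np.sign(omega_1.sum())       # fix the overall sign

print("lambda_1 / (mu * n):", lambda_1 / (mu * n))
print("blow-up time times mu * n:", mu * n / lambda_1)
print("all components positive:", bool(np.all(omega_1 > 0)))
```

With these parameters every entry of X(0) is positive, so the positivity of ω1 also follows from the Perron-Frobenius theorem; smaller values of µ relative to σ give a less trivial test of the σ < µ/2 bound.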
[Figure 1: three panels (A, B, C), each plotting the trajectories xij against time.]
FIG. 1: Representative large-n plots of the model for (A) µ > 0 (µ = 3/10 in the plot shown), (B) µ = 0, and (C) µ < 0 (µ = −3
in the plot shown). For all three plots, σ = 1 and n = 90. To reduce image complexity, only one randomly sampled fifth of the
trajectories is included. In the second plot, t∗ denotes the time at which the system diverges, and ε denotes a sufficiently small
displacement. The white curves superimposed on the three plots are the large-n trajectories $x_{ij}(t) = x_{ij}(0) - \mu + \mu/(1 - \mu n c t)$
for $x_{ij}(0) = \mu, \mu \pm 3\sigma/2$, where c represents a rescaling of time. Since we want to fix the blow-up time t∗ near 1 and since
ct∗ = 1/λ1 as found in the text, we choose $c = 1/(\mu n + \nu - \mu + \sigma^2/\mu)$ for (A) and $c = 1/(2\sigma\sqrt{n})$ for (B) and (C) using estimates
of λ1 taken from Ref. [8]. The black dotted lines mark the blow-up times t∗ = 1/(cλ1).
We can gain insight into the behavior of the system
for small t using an informal Taylor series calculation:
if we rescale time in Eq. 1 by inserting a 1/n before the
summation, compute the Taylor expansion of xij(t) term-by-term
and then take the expectation of each term, we
obtain the geometric series $x(t) = \mu + \mu^2 t + \mu^3 t^2 + \cdots$,
or
\[ x(t) = \frac{\mu}{1 - \mu t}. \tag{8} \]
With significantly more work, it can be proved that every
trajectory xij(t) has this time dependence on [0, 1/K) in
the large-n limit with probability 1 (see the Supporting
Information), so we may write
\[ \lim_{n\to\infty} x_{ij}(t) = x_{ij}(0) - \mu + \frac{\mu}{1 - \mu t} \quad \text{with prob. 1} \tag{9} \]
for all t in [0, 1/K). Observe that this limit has a blow-up
time t∗ of 1/µ. Since our rescaling of time represents a
zooming in or magnification of time by a factor of n, this
t∗ corresponds to a blow-up time asymptotic to 1/(µn)
for the unrescaled system, consistent with the results of
Füredi and Komlós.
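The large-n trajectory of Eq. 9 can be compared with the exact solution of the rescaled system at finite n. The following sketch does this for the off-diagonal entries of one random matrix. The uniform distribution on [0, 1] (so that K = 1), the value n = 400 and the seed are illustrative assumptions, and at finite n the agreement is only approximate.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, half_width = 400, 0.5, 0.5        # entries lie in [0, 1], so K = 1

A = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
X0 = np.triu(A) + np.triu(A, 1).T

t = 0.5                                  # inside the interval [0, 1/K)

# Exact solution of the rescaled system dX/dt = X^2 / n, obtained from
# Eq. 4 with t replaced by t / n.
Xt = np.linalg.solve(np.eye(n) - (t / n) * X0, X0)

# Large-n prediction of Eq. 9, applied entry by entry.
predicted = X0 - mu + mu / (1.0 - mu * t)

off_diagonal = ~np.eye(n, dtype=bool)
mean_abs_error = np.abs(Xt - predicted)[off_diagonal].mean()
mean_shift = (Xt - X0)[off_diagonal].mean()

print("mean |exact - Eq. 9| off the diagonal:", mean_abs_error)
print("mean shift:", mean_shift, "predicted:", mu / (1.0 - mu * t) - mu)
```

The comparison is restricted to off-diagonal entries, which are the relationship strengths that determine the factions.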
Case 2: µ = 0. In the event that the network starts
from a mean friendliness of zero, numerical experiments
indicate that the system ends up with two factions of
equal size in the large-n limit (Fig. 1B). We now prove
this to be the case. For the remainder of this discussion,
we will abbreviate X(0) as A and xij(0) as aij .
Since the off-diagonal entries of A have symmetric dis-
tributions by assumption, we have for any off-diagonal
aij and any interval Sij on the real line that P (aij ∈
Sij) = P (−aij ∈ Sij). Now let D be a diagonal ma-
trix with some sequence of +1 and −1 along its diagonal
(where the ith diagonal entry is denoted by di). Then
the random matrices A and B = DAD are identically
distributed, as we will now show.
To say that A and B are identically distributed means
that for every Borel set of matrices S, P (A ∈ S) = P (B ∈
S). To prove this, it suffices to consider the case in which
S is a product of intervals Sij , since these product sets
generate the Borel sigma-algebra. The entries of A are
independent, so P (A ∈ S) = Πi≤jP (aij ∈ Sij). Simi-
larly, P (B ∈ S) = Πi≤jP (diaijdj ∈ Sij). By the symme-
try of the off-diagonal distributions, Πi≤jP (aij ∈ Sij) =
Πi≤jP (diaijdj ∈ Sij), which gives us P (A ∈ S) = P (B ∈
S) as desired. (Note that when i = j, the factor didj is 1
so the on-diagonal distributions need not be symmetric.)
Now consider the set S of matrices with an ω1 consist-
ing of all positive components. The above demonstra-
tion implies that the probability of choosing an A in this
set is the same as choosing an A such that B is in this
set. Regarding the latter event, A(Dωi) = λi(Dωi) implies
Bωi = λiωi, so the λ1 eigenvector of the A used
to compute B is Dω1. This demonstrates that all sign
patterns for the components of ω1 are equally likely. In
other words, the distribution of the number of positive
components in ω1 is the binomial distribution B(n, 1/2)
and the fraction of positive components in ω1 converges
(in several senses) to 1/2 as n grows large.
Additionally, we can consider how λ1 varies with n in
the case that µ = 0 to determine when the blow-up will
occur. Füredi and Komlós [8] found for this case that
$\lambda_1 \in 2\sigma\sqrt{n} + O(n^{1/3}\log n)$ with probability tending to 1,
so with probability tending to 1 the blow-up time shrinks
to zero like $1/\sqrt{n}$, an order of $\sqrt{n}$ slower than in the µ > 0
case.
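A quick numerical experiment illustrates both claims for µ = 0. The sketch below samples one matrix with entries uniform on [−1, 1], an illustrative choice that satisfies the symmetry and bounded-support assumptions, and reports the fraction of positive components of ω1 together with the ratio of λ1 to 2σ√n.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
sigma = 1.0 / np.sqrt(3.0)               # std of uniform on [-1, 1]

A = rng.uniform(-1.0, 1.0, size=(n, n))
X0 = np.triu(A) + np.triu(A, 1).T        # symmetric, mean friendliness 0

eigenvalues, eigenvectors = np.linalg.eigh(X0)   # ascending order
lambda_1 = eigenvalues[-1]
omega_1 = eigenvectors[:, -1]

fraction_positive = float(np.mean(omega_1 > 0))
ratio = lambda_1 / (2.0 * sigma * np.sqrt(n))

print("fraction of nodes in faction S:", fraction_positive)
print("lambda_1 / (2 sigma sqrt(n)):", ratio)
print("estimated blow-up time:", 1.0 / lambda_1)
```

A single sample only illustrates the result; repeating the draw over many seeds would show the faction fraction concentrating around 1/2 as n grows.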
Case 3: µ < 0. For this final case, Füredi and
Komlós [8] found that $\lambda_1 < 2\sigma\sqrt{n} + O(n^{1/3}\log n)$ with
probability tending to 1. The semicircle law gives a lower
bound: $\lambda_1 > 2\sigma\sqrt{n} + o(\sqrt{n})$ in probability. So the
blow-up time goes to zero like $1/\sqrt{n}$ in the unrescaled system.
Note also that if we define a new matrix C = −A where
A is now the initial matrix X(0) of Case 3, then C sat-
isfies the condition of Case 1, µ > 0. Thus the distance
between the top eigenvector of C and (1, 1, . . . , 1)/√n
declines to zero in probability just as in Case 1. Further-
more, every other eigenvector of C is orthogonal to the
largest one. Hence if σ < |µ|/2, then with probability
tending to 1, every other eigenvector acquires a mixture
of positive and negative components in the large-n limit,
including the bottom eigenvector of C, which is the top
eigenvector of A. This establishes that in the case that
µ < 0 and σ < |µ|/2, the system ends up in a state with
two factions with probability converging to 1 for all finite
n.
Numerical simulations of the case that µ < 0 suggest
the conjecture that the two factions are approximately
equal in size for large n. Furthermore, the derivation
of Eq. 9 is in fact valid for all µ, so each trajectory
rapidly decays from xij(0) toward xij(0)− µ on [0, 1/K)
(Fig. 1C). This transient decay appears to extend be-
yond t = 1/K in numerical simulations. So, for example,
if time is rescaled by 1/√n instead of 1/n, we would hy-
pothesize that (i) each trajectory makes a complete jump
from xij(0) to xij(0) − µ in the large-n limit, and that
(ii) from this point onward, the system behaves like an
initial configuration of the µ = 0 case and so separates
into two equal factions en route to its blow-up at 1/(2σ).
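The µ < 0 case can be probed in the same way. In the sketch below every entry of X(0) is negative (uniform on [−1.5, −0.5], an illustrative choice with σ < |µ|/2), so C = −X(0) has a strictly positive top eigenvector and the top eigenvector of X(0), being orthogonal to it, must contain both signs. The faction sizes printed at the end bear only on the conjecture above and prove nothing.

```python
import numpy as np

rng = np.random.default_rng(5)
n, mu, half_width = 300, -1.0, 0.5
sigma = half_width / np.sqrt(3.0)        # std of a uniform distribution

A = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
X0 = np.triu(A) + np.triu(A, 1).T        # symmetric, all entries negative

eigenvalues, eigenvectors = np.linalg.eigh(X0)   # ascending order
lambda_1 = eigenvalues[-1]
omega_1 = eigenvectors[:, -1]

size_S = int(np.sum(omega_1 > 0))
size_T = int(np.sum(omega_1 < 0))
ratio = lambda_1 / (2.0 * sigma * np.sqrt(n))

print("faction sizes:", size_S, size_T)
print("lambda_1 / (2 sigma sqrt(n)):", ratio)
```

Here λ1 sits at the edge of the bulk of the spectrum rather than near |µ|n, which is why the blow-up time scales like 1/√n instead of 1/n.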
DISCUSSION
In this final section, we review our results and their sig-
nificance relative to previous work in structural balance
theory. We then compare the predictions of the model
with data, discuss potential criticisms of the model, and
finish with some intriguing connections between the be-
havior of the model and recent social-psychological work
on neutralizing two-sided conflicts.
Our first result is a demonstration that the model
forms two factions in finite time across a broad set of
initial conditions. As noted at the outset, similar demon-
strations have not been possible for dynamic models of
structural balance in earlier literature because these mod-
els contained so-called jammed states that could trap a
social network before it reached a two-faction configuration
[4, 5]. The model of Kułakowski et al. by contrast
has no such jammed states for generic initial conditions
and hence provides a robust means for a social network
to balance itself.
The second result of the paper is the discovery and
characterization of a transition from global polarization
to global harmony as the initial mean friendliness of the
network crosses from nonpositive to positive values. Sim-
ilar transitions have been observed in other models of
structural balance but so far none has been character-
ized at a quantitative level. For example, Antal et al. [4]
found a nonlinear transition from two cliques of equal
size to a single unified clique as the fraction of positively
signed edges at t = 0 was increased from 0 to 1 (see Fig.
5 of Ref. [4]). The authors provided a qualitative argu-
ment for this transition, but left open the problem of its
quantitative detail. Our results both confirm the gen-
erality of their observations and provide a quantitative
account of a transition analogous to theirs.
To complement the theoretical nature of our work and
get a better sense of how the model behaves in practice,
we can numerically integrate it for several cases of empir-
ical social network data where the real-life outcomes of
the time-evolution are known. Our first example is based
on a study by Zachary [10] who witnessed the break-up
of a karate club into two smaller clubs. Prior to the sep-
aration, Zachary collected counts of the number of social
contexts in which each pair of individuals interacted out-
side of the karate club, with the idea being that the more
social contexts they shared, the greater the likelihood
for information exchange. These counts, or capacities as
Zachary called them, can be converted to estimates of
friendliness and rivalry in many different ways. For a
large class of such conversions, Eq. 1 predicts the same
division that Zachary’s method found, which misclassi-
fied only 1 of the 34 club members (Fig. 2A,B).
A second example can be constructed from the data of
a study by Axelrod and Bennett [11] regarding the aggre-
gation of Allied and Axis powers during World War II. If
we simply take the entries of their propensity(i, j)·size(i)·
size(j) matrix to be proportional to the friendliness felt
between the various pairs of countries in the war, then
running the model gives the correct Allied-Axis split for
all countries except Denmark and Portugal (Fig. 2C).
Despite these modest successes, the model could still
be criticized as “a simplification and an idealization, and
consequently a falsification” [12]. Clearly, human behav-
ior is more complicated than what is captured by Eq. 1.
However, deliberate simplicity is a common feature of
many foundational mathematical models of basic social
phenomena, which are often designed to isolate and study
the effect of a single social force. Such models can be par-
ticularly appropriate in extreme settings where this single
force plays a dominant role, making human choices more
constrained and thus perhaps more predictable. In the
present case, the Kułakowski et al. model is designed
to ignore all other social behaviors besides the urge to
make one’s friendships and rivalries consistent. In this
[Figure 2: panels A, B, and C; axes: time (horizontal), xij (vertical).]
FIG. 2: Tests of the model of Kułakowski et al. (Eq. 1) against two existing data sets. (A) The evolution of the model
starting from Zachary’s capacity matrix with the capacity of each relationship reduced by 0.58. This is the minimal downward
displacement necessary (to two significant figures) for the resulting separation to be correct for all but 1 of the 34 club members.
For reasons described by Zachary [10], this is basically the best separation we can expect. (B) The evolution of the model
from Zachary’s capacity matrix with the capacity of zero between the two club leaders replaced by −11; the resulting factions
are identical to those in (A). Substituted values less than −11 yield the same two factions, while greater values produce less
accurate divisions. (C) The evolution of the model starting from Axelrod and Bennett’s 1939 propensity(i, j) · size(i) · size(j)
matrix for the 17 countries involved in World War II (by Axelrod and Bennett’s definition). The model finds the correct split
into Allied and Axis powers with the exceptions of Denmark and Portugal. Axelrod and Bennett’s own landscape theory of
aggregation does slightly better—its only misclassification is Portugal.
respect it is a bit like problems in classical physics in-
volving frictionless surfaces and massless springs; it is
a mathematical cartoon of a single aspect of our social
experience. It may give mechanistic insight but is not
designed for quantitative prediction.
A more specific objection might be raised regarding the
divergence to infinity in finite time. However, since the
purpose of the model is to study the pattern of signs
that emerges, our main conclusion from the model is
that the sign pattern eventually stabilizes at a point be-
fore the divergence. This stabilization of the sign pat-
tern is our primary focus, and one could interpret the
subsequent singularity as simply the straightforward and
unimpeded “ramping up” of values caused by the system
once all inconsistencies have been worked out of the social
relations—the divergence itself can be viewed as taking
place beyond the window of time over which the system
corresponds to anything real. Alternatively, one can imagine
that as the community completes its separation into
two groups, other social processes take over. For exam-
ple, individuals with differing ideological views or social
preferences may self-segregate, breaking the all-to-all as-
sumption of the model. In other cases, mounting tensions
may erupt into violence, reflecting a sort of bound on the
relationship intensity achievable for pairs of nodes in the
network.
Lastly we can ask, rather speculatively, whether the
model provides any hints on how to guide divided com-
munities toward reconciliation (in the cases where this is
a sensible goal). The work presented here implies that
the mean friendliness of the social network should be
an important target for modulation. This suggests one
potential strategy: (i) direct the attention of the social
network away from its divided status, (ii) encourage the
formation of friendships across the divide, and then (iii)
bring the network back to the task of managing the issue
that originally divided it, with the hope that the increase
in mean friendliness will push the network toward the
all-friends configuration. Remarkably, Pettigrew, a social
psychologist, has recently proposed a similar hypothesis
with respect to overcoming prejudice, recommending the
longitudinal process of (i) diverting attention away from
ingroup-outgroup distinctions, (ii) allowing strong inter-
group friendships to form, and then (iii) refocusing the
community on social categorization until a single group
category emerges [13, 14]. Considering the differences in
discipline and methodology, the similarity between Pet-
tigrew’s sequence of steps and ours is striking, and the
combined lesson is clear: given the right combination of
diversion and bonding exercises, it may be possible to get
a fractured social network to resolve its differences and
begin to heal.
Acknowledgments. Research supported in part by
the John D. and Catherine T. MacArthur Foundation,
a Google Research Grant, a Yahoo! Research Alliance
Grant, an Alfred P. Sloan Foundation Fellowship, a Mi-
crosoft Research New Faculty Fellowship, a grant from
the Air Force Office of Scientific Research, and NSF
grants CCF-0325453, BCS-0537606, CCF-0643934, IIS-
0705774, and CISE-0835706. We would also like to thank
Nick Trefethen for pointers to the literature on matrix
Riccati equations.
∗ Electronic address: kleinber@cs.cornell.edu
[1] S. Wasserman and K. Faust, Social Network Analysis:
Methods and Applications, Structural Analysis in the So-
cial Sciences (Cambridge University Press, New York,
1994), pp. 220-248.
[2] F. Heider, The Journal of Psychology 21, 107 (1946).
[3] D. Cartwright and F. Harary, The Psychological Review
63, 277 (1956).
[4] T. Antal, P. L. Krapivsky, and S. Redner, Physical Re-
view E 72, 036121 (2005).
[5] S. A. Marvel, S. H. Strogatz, and J. M. Kleinberg, Phys-
ical Review Letters 103, 198701 (2009).
[6] K. Kułakowski, P. Gawroński, and P. Gronek, International
Journal of Modern Physics C 16, 707 (2005).
[7] H. Abou-Kandil, G. Freiling, V. Ionescu, and G. Jank,
Matrix Riccati Equations in Control and Systems Theory
(Birkhäuser, Basel, 2003), p. 21.
[8] Z. Füredi and J. Komlós, Combinatorica 1, 233 (1981).
[9] L. Arnold, Probability Theory and Related Fields 19,
191 (1971).
[10] W. W. Zachary, Journal of Anthropological Research 33,
452 (1977).
[11] R. Axelrod and D. S. Bennett, British Journal of Political
Science 23, 211 (1993).
[12] A. M. Turing, Philosophical Transactions of the Royal
Society of London, Series B, Biological Sciences 237, 37
(1952).
[13] T. F. Pettigrew, Annual Review of Psychology 49, 65
(1998).
[14] T. F. Pettigrew and L. R. Tropp, Journal of Personality
and Social Psychology 90, 751 (2006).
Supporting Text
Seth A. Marvel†, Jon M. Kleinberg‡,∗ Robert D. Kleinberg‡, and Steven H. Strogatz†
†Center for Applied Mathematics, Cornell University, Ithaca, New York 14853
‡Department of Computer Science, Cornell University, Ithaca, New York 14853
REFERENCED RESULTS
In the introduction of our paper, we assert that a variant
of the dynamics proposed by Kułakowski et al. generically
balances an isolated triangle. We explain what we
mean here.
Theorem 1. The system $\dot{x}_{12} = x_{13}x_{23}$, $\dot{x}_{13} = x_{12}x_{23}$,
$\dot{x}_{23} = x_{12}x_{13}$ achieves balance when the initial values
$x_{12}(0)$, $x_{13}(0)$ and $x_{23}(0)$ are all unequal.
Proof. Multiplying each $\dot{x}_{ij}$ by $x_{ij}$ yields
$x_{12}\dot{x}_{12} = x_{13}\dot{x}_{13} = x_{23}\dot{x}_{23}$. Integrating these equalities gives the
constraints $x_{12}^2 - x_{13}^2 = C_1$ and $x_{12}^2 - x_{23}^2 = C_2$, which partition
the three-dimensional space of $(x_{12}, x_{13}, x_{23})$ into
trajectories (with the direction of flow given by the original
dynamical system). Examination of this flow reveals
that each initial condition $(x_{12}(0), x_{13}(0), x_{23}(0))$ with
distinct coordinates flows into one of the four octants on
which Heider balance holds, that is, where $x_{12}x_{13}x_{23} > 0$.
Furthermore, these octants each act as separate trapping
regions: once a trajectory enters, it cannot leave. Hence,
the theorem follows. ∎
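The behavior described in Theorem 1 can be checked numerically on a single triangle. The sketch below is only an illustration: the initial values, the step size, and the use of a fixed-step Runge-Kutta integrator are our own assumptions. It starts from an unbalanced triangle (negative product of the three values), integrates for one time unit, and confirms that the two conserved quantities from the proof stay fixed while the product becomes positive.

```python
def triangle_rhs(x):
    """Right-hand side of the three-edge system in Theorem 1."""
    x12, x13, x23 = x
    return (x13 * x23, x12 * x23, x12 * x13)


def rk4(x, dt, steps):
    """Integrate the triangle system with the classical Runge-Kutta scheme."""
    for _ in range(steps):
        k1 = triangle_rhs(x)
        k2 = triangle_rhs(tuple(a + dt / 2 * b for a, b in zip(x, k1)))
        k3 = triangle_rhs(tuple(a + dt / 2 * b for a, b in zip(x, k2)))
        k4 = triangle_rhs(tuple(a + dt * b for a, b in zip(x, k3)))
        x = tuple(a + dt / 6 * (p + 2 * q + 2 * r + s)
                  for a, p, q, r, s in zip(x, k1, k2, k3, k4))
    return x


# Hypothetical unbalanced start: one negative edge, so the product is negative.
x0 = (1.0, -0.5, 0.2)
x1 = rk4(x0, dt=1e-3, steps=1000)  # integrate up to t = 1

product_start = x0[0] * x0[1] * x0[2]
product_end = x1[0] * x1[1] * x1[2]
c1_end = x1[0] ** 2 - x1[1] ** 2  # should stay near 1.0 - 0.25 = 0.75
c2_end = x1[0] ** 2 - x1[2] ** 2  # should stay near 1.0 - 0.04 = 0.96

print(product_start, product_end, c1_end, c2_end)
```

Here the weakest edge, $x_{23}$, changes sign, after which the triangle has one positive and two negative edges and so satisfies Heider balance.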
The next theorem regards the main system of the paper
with a rescaling of time: $\frac{d}{dt}X = n^{-1}X^2$, where $X$ is
a real symmetric $n \times n$ matrix. Recall that $x_{ij}(t)$ denotes
the $(i, j)$th element of the solution matrix $X(t)$ subject
to the initial condition $X(0)$. In the following, we will
abbreviate $X(0)$ as $A$ and $x_{ij}(0)$ as $a_{ij}$. Suppose that the
$a_{ij}$, $i \le j$, are drawn independently from distributions
$F_{ij}$ with zero mass outside $[-K, K]$, and the off-diagonal
distributions $F_{ij}$ have common expectation $\mu$ and
variance $\sigma^2$.
Theorem 2. $\lim_{n\to\infty} x_{ij}(t) = a_{ij} - \mu + \mu/(1 - \mu t)$
with probability 1 for $t \in [0, 1/K)$.
Proof. Regard each step of the limit $n \to \infty$
as a selection and concatenation of elements
$\{a_{in}\}_{1 \le i \le n-1}$, $\{a_{nj}\}_{1 \le j \le n-1}$, $a_{nn}$ to the elements
$\{a_{ij}\}_{1 \le i,j \le n-1}$ selected in preceding steps. Now consider
the partial sum of the Taylor series expansion of $x_{ij}(t)$:
$$x_{ijnN}(t) = \sum_{k=0}^{N} \alpha_{kn} t^k \quad \text{where} \quad \alpha_{kn} = \frac{1}{k!} \left. \frac{d^k x_{ij}}{dt^k} \right|_{t=0} \tag{1}$$
The first step of the proof of Theorem 2 consists of proving
that $\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}(t)$ converges to
$a_{ij} - \mu + \mu/(1 - \mu t)$ with probability 1 on $[0, 1/|\mu|)$ (see Lemma
1). The second step of the proof consists of proving that
$\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}(t) = \lim_{n\to\infty} x_{ij}(t)$ with
probability 1 on $[0, 1/K)$ (see Lemma 2). Since we can
write $\lim_{n\to\infty} x_{ij}(t)$ as $\lim_{n\to\infty} \lim_{N\to\infty} x_{ijnN}(t)$, this
amounts to showing that the two limits can be exchanged
on $[0, 1/K)$. The above theorem then follows trivially by
a union bound. ∎
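As a quick consistency check on the formula in Theorem 2, consider the degenerate special case in which every entry of $A$ equals $\mu$, so that $A = \mu J$ with $J$ the all-ones matrix. This case and the symbols $J$ and $x(t)$ are our own illustration, not part of the original argument. Because $J^2 = nJ$, the ansatz $X(t) = x(t)J$ reduces the rescaled matrix equation to a scalar one:

```latex
\frac{d}{dt}\bigl(xJ\bigr) = n^{-1}(xJ)^2 = n^{-1}x^2\,nJ = x^2 J
\;\Longrightarrow\;
\dot{x} = x^2,\quad x(0) = \mu
\;\Longrightarrow\;
x(t) = \frac{\mu}{1 - \mu t}.
```

Setting $a_{ij} = \mu$ in Theorem 2 gives $a_{ij} - \mu + \mu/(1 - \mu t) = \mu/(1 - \mu t)$, in agreement with this exact solution.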
Lemma 1. Under the assumptions of Theorem 2,
$\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN} = a_{ij} - \mu + \mu/(1 - \mu t)$ with
probability 1 for $t \in [0, 1/|\mu|)$.
Proof. For the sake of generality, we present a proof
with milder assumptions than those of the rest of
the paper: we only require that the moments of the $F_{ij}$
distributions be finite (and that the off-diagonal distributions
have mean $\mu$), not that the $a_{ij}$ values be bounded by $K$
with probability 1.
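Define $\alpha_{k\infty} = \lim_{n\to\infty} \alpha_{kn}$ (merely shorthand; we do
not assume the limit exists). By a union bound, we have
$$\Pr\left( \bigcap_{k=1}^{\infty} [\alpha_{k\infty} = \mu^{k+1}] \right) \ge 1 - \sum_{k=1}^{\infty} \left[ 1 - \Pr(\alpha_{k\infty} = \mu^{k+1}) \right] \tag{2}$$
So if we can show that $\Pr(\alpha_{k\infty} = \mu^{k+1}) = 1$ for all
$k \ge 1$, then $\Pr(\bigcap_{k=1}^{\infty} [\alpha_{k\infty} = \mu^{k+1}]) = 1$. In this case,
$\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}$ has the convergent Taylor series
$a_{ij} + \sum_{k=1}^{\infty} \mu^{k+1} t^k$ on $[0, 1/|\mu|)$ with probability 1, which
proves the lemma.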
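For completeness, the Taylor series above sums to the closed form stated in Lemma 1 by the geometric series; the steps are spelled out here as a worked check:

```latex
a_{ij} + \sum_{k=1}^{\infty} \mu^{k+1} t^k
  = a_{ij} + \mu \sum_{k=1}^{\infty} (\mu t)^k
  = a_{ij} + \mu \cdot \frac{\mu t}{1 - \mu t}
  = a_{ij} - \mu + \frac{\mu}{1 - \mu t},
\qquad |\mu t| < 1.
```

The last equality uses $-\mu + \mu/(1 - \mu t) = \mu^2 t/(1 - \mu t)$, and the condition $|\mu t| < 1$ is exactly the interval $[0, 1/|\mu|)$ for $t \ge 0$.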
So our task reduces to showing that $\Pr(\alpha_{k\infty} = \mu^{k+1}) = 1$
for each $k \ge 1$. In order to do this, we need to compute
the leading behavior of $\alpha_{kn}$ in $n$. To calculate the $k$
time derivatives of $x_{ij}$ in the formula for $\alpha_{kn}$ (see Eq. 1),
we alternate between applying the chain rule of differential
calculus and substituting in the right-hand side of
$\dot{x}_{ij} = n^{-1} \sum_k x_{ik} x_{kj}$ (our system $\dot{X} = n^{-1}X^2$ written in
element-wise fashion). This gives
$$\alpha_{kn} = n^{-k} \sum_{m_1=1}^{n} \sum_{m_2=1}^{n} \cdots \sum_{m_k=1}^{n} a_{i m_1} a_{m_1 m_2} \cdots a_{m_k j} \tag{3}$$
where the factor $n^{-k}$ comes from the $k$ factors of $n^{-1}$
introduced by the $k$ derivatives, and the factor $1/k!$ in
the formula for $\alpha_{kn}$ cancels with a factor $k!$ that arises
from repeated applications of the chain rule. In Eq. 3,
the dominant term is a sum of the edge value products
of all simple length-$(k + 1)$ paths between $i$ and $j$. This
sum contains $(n-2)!/(n-2-k)!$ terms. All other paths
include fewer intermediate nodes and thus have at least a
factor of $n$ fewer terms in their sums.
Our goal then for the remainder of the proof is to show
that the first term of Eq. 3 is the only term that remains
after taking $n$ to infinity, and that it converges to $\mu^{k+1}$
with probability 1. To simplify notation, let $\ell$ denote the
product of the edge values $a_{ij}$ along a particular path of
length $k + 1$ (not necessarily simple) from $i$ to $j$, and let
$L$ denote the set of all such products on paths with the
same configuration, or pattern of connectivity. Denote
the set of all $L$ by $\{L\}$, and let $S$ denote the one $L$ in
$\{L\}$ consisting of simple paths of length $k + 1$ from $i$ to
$j$.
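Now observe that
$\bigcap_{L \in \{L\} \setminus S} [\lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell = 0] \cap [\lim_{n\to\infty} n^{-k} \sum_{\ell \in S} \ell = \mu^{k+1}] \subset [\alpha_{k\infty} = \mu^{k+1}]$.
So by another union bound,
$$\Pr(\alpha_{k\infty} = \mu^{k+1}) \ge 1 - \Pr\left( \lim_{n\to\infty} n^{-k} \sum_{\ell \in S} \ell \ne \mu^{k+1} \right) - \sum_{\{L\} \setminus S} \Pr\left( \lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell \ne 0 \right) \tag{4}$$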
Hence, we are done if we can show that (i)
$\Pr(\lim_{n\to\infty} n^{-k} \sum_{\ell \in S} \ell = \mu^{k+1}) = 1$ for $S$ and (ii)
$\Pr(\lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell = 0) = 1$ for all other $L$.
Although $\sum_{\ell \in L} \ell$ is in general a sum of correlated random
variables, it is possible to adapt a standard proof of the
strong law of large numbers for uncorrelated random variables
to prove both items. We do this next.
Let's prove (ii) first. For brevity, let $S_n = \sum_{\ell \in L} \ell$
and choose $v$ to denote the number of nodes in the
path configuration of $L$. For any positive $\epsilon$ and
$r = 1, 2, \ldots$, Markov's inequality gives
$\Pr(|S_n| \ge (n\epsilon)^k) \le E(|S_n|^r)/(n\epsilon)^{kr}$. So if we can find an $r$ such that
$E(|S_n|^r)/(n\epsilon)^{kr} \le C/n^2$ for some constant $C$ (dependent
on $\epsilon$), then $\sum_n \Pr(|S_n| \ge (n\epsilon)^k)$ converges, and by the
first Borel-Cantelli lemma, $\Pr(|n^{-k} \sum_{\ell \in L} \ell| \ge \epsilon \text{ i.o.}) = 0$
for all $\epsilon > 0$ (where i.o. stands for infinitely often).
Careful reflection reveals that $\bigcup_{\epsilon} [|n^{-k} \sum_{\ell \in L} \ell| \ge \epsilon \text{ i.o.}]$
(for, say, all rational $\epsilon$) is the complementary event of
$[\lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell = 0]$, and so we have arrived at the
desired result (ii).
Hence, in order to actually show (ii), we need to find
an $r$ such that $E(|S_n|^r)/(n\epsilon)^{kr} \le C/n^2$. Consider $r = 2$:
$E(S_n^2) = \sum E(\ell_x \ell_y)$, where each index of the sum ranges
independently over $L$. There are $(n-2)!/(n-2-v)!$ paths
$\ell$ in $L$, so there are fewer than $n^{2v}$ terms in $\sum E(\ell_x \ell_y)$,
and $E(S_n^2) \le D n^{2v}$ for some constant $D$. Since $v < k$
for all $L$ other than $S$, we have
$E(|S_n|^2)/(n\epsilon)^{2k} \le D n^{2v}/(n\epsilon)^{2k} \le C/n^2$ where $C = D\epsilon^{-2k}$, and the proof
of (ii) is complete.
Finally, to prove (i), start by replacing each factor $a_{xy}$
in $\ell$ with $b_{xy} + \mu$, where $b_{xy} = a_{xy} - \mu$. Now expand
the result and cancel $\mu^{k+1}$ from both sides of
$n^{-k} S_n = \mu^{k+1}$ to obtain $n^{-k} S'_n = 0$, where $S'_n$ is a sum over $S$ of
a polynomial $Q$ with $2^{k+1} - 1$ terms, each of the form
$\mu\mu \cdots \mu\, b_{uv} b_{wx} \cdots b_{yz}$ where at least one of the factors is
a $b_{xy}$ and the total number of factors in the term is $k+1$.
Note that each place of $Q$ corresponds to a particular set
of $b_{xy}$'s from the original simple path, e.g. the 14th place
of $Q$ might have $b_{xy}$'s corresponding to the 1st, 4th, 5th,
and 7th edges of the path, and $\mu$'s for the other edges.
Now let $m_q$ denote the number of vertices (excluding $i$
and $j$) among the subscripts of the $b_{xy}$'s in a given term.
The remaining $k - m_q$ nodes of the path not found in
the term (supplanted by the $\mu$'s) can take any of
$(n - 2 - m_q)!/(n - 2 - k)!$ permutations. Hence, there are no
more than $n^{k - m_q}$ identical copies of any one term in $S'_n$
from the same place in $Q$.
Now consider one of the $(2^{k+1} - 1)^4$ ways that terms in
the $2^{k+1} - 1$ places of $Q$ can be multiplied together in $S_n'^4$.
Note that this can produce no more than $n^{4k - \sum_{q=1}^{4} m_q}$
identical copies of the same term. Second, since the $b_{xy}$'s
each have expectation zero, every $b_{xy}$ in the final term
must appear to at least a power of two or the whole term
has expectation zero. This implies that for each nonvanishing
term, there must be some pattern of matching
between the $b_{xy}$'s. The number of possible matchings
is clearly a function of $k$ and not $n$ (it certainly is not
more than the number of partitions of $4(k + 1)$ edges),
so consider one of these possible matchings. Now observe
that if, as we stated above, we consider only one of
the $(2^{k+1} - 1)^4$ ways that terms in the $2^{k+1} - 1$ places
of $Q$ can be multiplied together in $S_n'^4$, then no more
than $n^{\sum_{q=1}^{4} m_q/2}$ distinct nonvanishing terms can be
constructed per matching for any such way of combining
terms. This holds because each $b_{xy}$ needs at least one
match, and so the number of free nodes cannot exceed
half the total number of $b_{xy}$'s in the final term. Thus we
have shown the highest order of $n$ possible for $E(S_n'^4)$ is
given by the maximum value of $n^{\sum_{q=1}^{4} m_q/2}\, n^{4k - \sum_{q=1}^{4} m_q}$.
Since $m_q \ge 1$ for each $q = 1, \ldots, 4$, this can at most be
$n^{4k-2}$, which by the above reasoning completes the proof
of (i) and hence the full theorem. ∎
Lemma 2. Under the assumptions of Theorem 2,
$\lim_{n\to\infty} \lim_{N\to\infty} x_{ijnN} = \lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}$ with
probability 1 for $t \in [0, 1/K)$.
Proof. We need three ingredients for this proof. We
will first describe the three ingredients and then show
how they together prove Lemma 2. Throughout the
following, all statements hold with probability 1 unless
stated otherwise.
As we found in the course of the proof of Lemma 1,
the limits $\lim_{n\to\infty} \alpha_{kn}$ exist for all $k$ and are $\mu^{k+1}$ on
$[0, 1/|\mu|)$, so $\lim_{n\to\infty} \sum_{k=0}^{N} \alpha_{kn} t^k$ exists under the same
conditions, and we call it $x_{ij\infty N}(t)$. This gives us the
first ingredient: (i) $\lim_{n\to\infty} x_{ijnN}(t) = x_{ij\infty N}(t)$ for
$t \in [0, 1/|\mu|)$ and any $N$. Additionally, from Lemma
1 we know that $\lim_{N\to\infty} x_{ij\infty N}(t)$ exists and is
$a_{ij} - \mu + \mu/(1 - \mu t)$ on $[0, 1/|\mu|)$. We call this limit $x_{ij\infty\infty}(t)$, and
write the second ingredient as (ii) $\lim_{N\to\infty} x_{ij\infty N}(t) = x_{ij\infty\infty}(t)$
for $t \in [0, 1/|\mu|)$.
Finally, as we saw in the proof of Lemma 1,
$\alpha_{kn} = n^{-k} \sum a_{i m_1} a_{m_1 m_2} \cdots a_{m_k j}$ (by definition, not just with
probability 1), where the $k$ indices $m_x$ each range
independently from 1 to $n$. Since each $|a_{ij}| < K$, we
must have that $|\alpha_{kn}| \le K^{k+1}$, which implies
$|x_{ijn\infty}(t) - x_{ijnN}(t)| \le K(Kt)^{N+1}/(1 - Kt)$. So if $|Kt| < 1$, then
for any $\epsilon > 0$, there is a sufficiently large $N_1$ independent
of $n$ such that $|x_{ijn\infty}(t) - x_{ijnN}(t)| \le \epsilon$ for all
$N \ge N_1$. This constitutes our third ingredient, that
$x_{ijnN}(t)$ converges uniformly to $x_{ijn\infty}(t)$: (iii) for every
$\epsilon > 0$, there exists an $N_1$ such that for all $N \ge N_1$
and all $n$, $|x_{ijn\infty}(t) - x_{ijnN}(t)| < \epsilon$.
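The uniform tail bound used for ingredient (iii) follows from the coefficient bound $|\alpha_{kn}| \le K^{k+1}$ and a geometric series. The steps are written out below as a worked check, using only quantities already defined in the text:

```latex
|x_{ijn\infty}(t) - x_{ijnN}(t)|
  = \Bigl| \sum_{k=N+1}^{\infty} \alpha_{kn} t^k \Bigr|
  \le \sum_{k=N+1}^{\infty} K^{k+1} t^k
  = K \sum_{k=N+1}^{\infty} (Kt)^k
  = \frac{K (Kt)^{N+1}}{1 - Kt},
\qquad 0 \le Kt < 1.
```

Because the right-hand side does not involve $n$, the same $N_1$ works for every $n$, which is what makes the convergence uniform in $n$.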
To complete the proof of Lemma 2, we need to show
that $\lim_{n\to\infty} x_{ijn\infty}(t)$ exists and is just $x_{ij\infty\infty}(t)$ on
$[0, 1/K)$. Start by picking an $\epsilon > 0$. Then by (iii),
there exists an $N_1$ such that if $N > N_1$ then
$|x_{ijn\infty}(t) - x_{ijnN}(t)| < \epsilon$ for all $n$. Similarly, (ii) implies that there
exists an $N_2$ such that if $N \ge N_2$, then
$|x_{ij\infty\infty}(t) - x_{ij\infty N}(t)| < \epsilon$. Finally, let $N_3 = \max\{N_1, N_2\}$. Then
by (i), we may choose an $n_1$ such that if $n \ge n_1$, then
$|x_{ij\infty N_3}(t) - x_{ijnN_3}(t)| < \epsilon$. Now define the following
events:
$$\begin{aligned}
E_1 &= [\,|x_{ij\infty\infty}(t) - x_{ij\infty N_3}(t)| < \epsilon\,] \\
E_2 &= [\,|x_{ij\infty N_3}(t) - x_{ijnN_3}(t)| < \epsilon\,] \\
E_3 &= [\,|x_{ijnN_3}(t) - x_{ijn\infty}(t)| < \epsilon\,] \\
E_4 &= [\,|x_{ij\infty\infty}(t) - x_{ijn\infty}(t)| < 3\epsilon\,]
\end{aligned} \tag{5}$$
Observe that, in similar form to Eq. 4, $(E_1 \cap E_2 \cap E_3) \subset E_4$,
so $\Pr(E_4) \ge \Pr(E_1 \cap E_2 \cap E_3) = 1 - \Pr(E'_1 \cup E'_2 \cup E'_3)
\ge 1 - \Pr(E'_1) - \Pr(E'_2) - \Pr(E'_3) = 1$ for all $n \ge n_1$.
Thus, $|x_{ij\infty\infty}(t) - x_{ijn\infty}(t)| < 3\epsilon$ for all $n \ge n_1$ and
$t \in [0, 1/K)$. ∎
UNREFERENCED RESULTS
In the main report, we establish that $\lambda_1 > 0$ in probability
by way of Wigner's semicircle law. However, we
can also show that $\lambda_1 > 0$ with high probability under
a different set of assumptions about how $A$ is selected.
Loosely speaking, the selection of the diagonal entries is
more constrained in this alternative approach while the
selection of the off-diagonal entries is less so.
Suppose that all $a_{ij}$, $|i - j| = 1$, are chosen from the
same distribution $F$ and all $a_{ii}$ are chosen from the same
distribution $G$. All selections are independent for $i \le j$.
In addition, assume that the density of $a_{11} - a_{12}$ is not
entirely confined to negative values. Then we have the
following theorem.
Theorem 3. $\Pr(\lambda_1 \le 0)$ is exponentially small in $n$.
Proof. If $\lambda_1 \le 0$, then the corresponding matrix $A$
is negative semi-definite. Therefore $v^T A v \le 0$ for every
vector $v$ ($T$ denotes transposition). Let $v_k$ denote the
vector with $+1$ and $-1$ in its $(2k-1)$th and $2k$th components,
respectively, and 0 in all other components. Then
the event that $v_k^T A v_k \le 0$ is equivalent to the event that
$a_{(2k-1)(2k-1)} - a_{(2k-1)2k} - a_{2k(2k-1)} + a_{2k2k} \le 0$. Note
that the left-hand side of this final inequality has at least
a constant probability of being positive. Now as $k$ ranges
from 1 to $n/2$, we encounter $n/2$ independent events, each
having at least a constant probability of failure. Hence
the probability that $A$ is negative semi-definite is
exponentially small in $n$. ∎
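The test vectors in the proof of Theorem 3 are easy to evaluate directly. The sketch below computes the quadratic forms $v_k^T A v_k$ for the disjoint index pairs; the function name `pair_forms` and the 4-by-4 integer matrix are hypothetical illustrations of ours. A single positive value rules out negative semi-definiteness, and by the Rayleigh quotient (with $v_k^T v_k = 2$) it also gives the lower bound $\lambda_1 \ge v_k^T A v_k / 2$.

```python
def pair_forms(A):
    """Return v_k^T A v_k for v_k = e_i - e_j over the pairs (0,1), (2,3), ...

    With 0-based indices, the pair (2k, 2k+1) corresponds to the
    (2k-1)th and 2kth components in the 1-based numbering of the text.
    """
    forms = []
    for k in range(len(A) // 2):
        i, j = 2 * k, 2 * k + 1
        forms.append(A[i][i] - A[i][j] - A[j][i] + A[j][j])
    return forms


# Hypothetical symmetric matrix used only to illustrate the computation.
A = [[1, -2, 0, 3],
     [-2, 0, 1, 1],
     [0, 1, -1, 2],
     [3, 1, 2, -4]]

forms = pair_forms(A)
not_negative_semidefinite = any(f > 0 for f in forms)
print(forms, not_negative_semidefinite)
```

Each pair uses a disjoint set of matrix entries, which is why the corresponding events in the proof are independent.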
Very roughly, the final result of this supporting text
says that in the case of negative $\mu$, the distribution of
the sum of the components of $\omega_1 = (v_1, \ldots, v_n)$ retains
no more than constant width in the large-$n$ limit. So, for
example, the mean of the components of $\omega_1$ must shrink
to zero at least as fast as $1/n$. (Note that this convergence
is faster than the $1/\sqrt{n}$ convergence of means for
independent and identically distributed random variables
with finite mean and variance.) This result is consistent
with the picture that when $\mu$ is negative, the system is
destined for two-sided conflict in the large-$n$ limit.
To make the proof less cumbersome, suppose that the
diagonal $F_{ii}$ have common expectation $\mu$ and variance $\sigma^2$
just like the off-diagonal $F_{ij}$.
Theorem 4. If $\mu < 0$, then
$\lim_{n\to\infty} \Pr(|n^{-\delta} \sum_{i=1}^{n} v_i| < \epsilon) = 1$ for all $\delta, \epsilon > 0$.
Proof. Consider the eigenvalue equation
$\sum_{j=1}^{n} a_{ij} v_j = \lambda_1 v_i$. Sum both sides over $i$, and
then rearrange and switch index labels to obtain
$$\sum_{i=1}^{n} v_i \sum_{j=1}^{n} a_{ij} = \lambda_1 \sum_{i=1}^{n} v_i. \tag{6}$$
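We anticipate from standard results of random matrix
theory that the left side of Eq. 6 is asymptotic to
$\mu n \sum_{i=1}^{n} v_i$ and the right side to $2\sigma\sqrt{n} \sum_{i=1}^{n} v_i$. So let's
define the two events
$$\begin{aligned}
E_1 &= \left[\, \left| n^{-1-\delta} \sum_{i=1}^{n} v_i \sum_{j=1}^{n} a_{ij} - \frac{\mu}{n^{\delta}} \sum_{i=1}^{n} v_i \right| \le \epsilon \,\right] \\
E_2 &= \left[\, \left| \frac{\lambda_1}{n} \sum_{i=1}^{n} v_i - \frac{2\sigma}{\sqrt{n}} \sum_{i=1}^{n} v_i \right| < \epsilon \,\right]
\end{aligned} \tag{7}$$
where $[\cdot]$ denotes an event. Now, $\Pr(E_1 \cap E_2) = 1 - \Pr(E'_1 \cup E'_2) \ge 1 - \Pr(E'_1) - \Pr(E'_2)$,
where we have written $E'_x$ for the complementary event of $E_x$. So if we can
show that $\lim_{n\to\infty} \Pr(E_1) = 1$ and $\lim_{n\to\infty} \Pr(E_2) = 1$
for all $\delta, \epsilon > 0$, it follows that
$\lim_{n\to\infty} \Pr(|\mu n^{-\delta} \sum_{i=1}^{n} v_i - 2\sigma n^{-1/2-\delta} \sum_{i=1}^{n} v_i| < \epsilon + \epsilon n^{-\delta}) = 1$,
which implies Theorem 4.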
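Hence our task reduces to showing that
$\lim_{n\to\infty} \Pr(E_1) = \lim_{n\to\infty} \Pr(E_2) = 1$. First consider
$\Pr(E_1)$. By Jensen's inequality and normalization
of $\omega_1$, $(n^{-1} \sum_{i=1}^{n} |v_i|)^2 \le n^{-1} \sum_{i=1}^{n} v_i^2 = n^{-1}$, so
$$\left| \sum_{i=1}^{n} v_i \right| \le \sum_{i=1}^{n} |v_i| \le \sqrt{n}. \tag{8}$$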
Now define the additional event:
$$E_3 = \left[\, \max_{1 \le i \le n} \left| n^{-1/2-\delta} \sum_{j=1}^{n} (a_{ij} - \mu) \right| \le \epsilon \,\right]. \tag{9}$$
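By Eq. 8, we have
$E_3 \subset [\max_{1 \le i \le n} |n^{-1/2-\delta} \sum_{j=1}^{n} (a_{ij} - \mu)|\, n^{-1/2} \sum_{i=1}^{n} |v_i| \le \epsilon]
\subset [n^{-1-\delta} \sum_{i=1}^{n} |v_i| \, |\sum_{j=1}^{n} (a_{ij} - \mu)| \le \epsilon] \subset E_1$.
The Bernstein inequality and a union
bound together imply that $\lim_{n\to\infty} \Pr(E_3) = 1$, since
they give that
$$\Pr(E'_3) \le 2n \exp\left( - \frac{n^{2\delta}\epsilon^2/2}{\sigma^2 + (K + \mu)\, n^{-1/2+\delta}\epsilon/3} \right). \tag{10}$$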
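One way to see where the form of Eq. 10 comes from is sketched below. We assume the following common statement of the Bernstein inequality, in which the symbols $Y_j$, $M$, and $s$ are our own notation rather than the paper's: for independent zero-mean random variables $Y_1, \ldots, Y_n$ with $|Y_j| \le M$ and common variance $\sigma^2$, the first line holds. Substituting $Y_j = a_{ij} - \mu$ and $s = n^{1/2+\delta}\epsilon$, then dividing numerator and denominator by $n$, gives the second line.

```latex
\Pr\Bigl( \Bigl| \sum_{j=1}^{n} Y_j \Bigr| \ge s \Bigr)
  \le 2 \exp\!\left( - \frac{s^2/2}{n\sigma^2 + M s/3} \right),
\qquad
\Pr\Bigl( \Bigl| n^{-1/2-\delta} \sum_{j=1}^{n} (a_{ij} - \mu) \Bigr| \ge \epsilon \Bigr)
  \le 2 \exp\!\left( - \frac{n^{2\delta}\epsilon^2/2}{\sigma^2 + M n^{-1/2+\delta}\epsilon/3} \right).
```

A union bound over the $n$ rows $i$ then supplies the prefactor $n$, and the bound $M$ on $|a_{ij} - \mu|$ plays the role of the factor written as $K + \mu$ in Eq. 10.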
So, $\lim_{n\to\infty} \Pr(E_1) = 1$ as desired.
Regarding $\Pr(E_2)$, we know that $\lambda_1 \in 2\sigma\sqrt{n} + o(\sqrt{n})$ in
probability [1], which implies that
$\lim_{n\to\infty} \Pr(|\lambda_1/\sqrt{n} - 2\sigma| < \epsilon) = 1$ and therefore
$\lim_{n\to\infty} \Pr(E_2) = 1$ for all
$\epsilon > 0$ by Eq. 8. This completes the proof. ∎
ADDITIONAL REFERENCES
Examples of two-sided social conflicts appear in a wide
range of settings, including political party coalitions [2–
6], inter-state wars [7, 8], corporate standard-setting [9],
intertribal feuding [10] and even experiments in which
group membership is based on arbitrary criteria [11–15].
∗ Electronic address: kleinber@cs.cornell.edu
[1] Füredi, Z & Komlós, J. (1981) The eigenvalues of random
symmetric matrices. Combinatorica 1, 233–241.
[2] Duverger, M. (1954) Political Parties: Their Organi-
zation and Activity in the Modern State. (Wiley, New
York). See p. 217.
[3] Riker, W. H. (1962) The Theory of Political Coalitions.
(Yale University Press, New Haven). See pp. 174-189.
[4] Riker, W. H. (1982) The two-party system and Du-
verger’s law: an essay on the history of political science.
The American Political Science Review 76, 753.
[5] Benoit, K. (2007) Electoral laws as political conse-
quences: explaining the origins and change of electoral
institutions. Annual Review of Political Science 10, 363.
[6] Sagar, D. J, ed. (2009) Political Parties of the World.
(John Harper, London).
[7] Moore, M. (1979) Structural balance and international
relations. European Journal of Social Psychology 9, 323.
[8] Altfeld, M. F & de Mesquita, B. B. (1979) Choosing sides
in war. International Studies Quarterly 23, 87.
[9] Axelrod, R, Mitchell, W, Thomas, R. E, Bennett, D. S,
& Bruderer, E. (1995) Coalition formation in standard-
setting alliances. Management Science 41, 1493.
[10] Chagnon, N. A. (1988) Life histories, blood revenge, and
warfare in a tribal population. Science 239, 985.
[11] Sherif, M. (1966) In Common Predicament: So-
cial Psychology of Intergroup Conflict and Cooperation.
(Houghton Mifflin, Boston).
[12] Tajfel, H & Turner, J. C. (1979) in The Social Psychology
of Intergroup Relations, eds. Austin, W. G & Worchel, S.
(Brooks/Cole, Monterey, CA), p. 33. See pp. 38-40.
[13] Brewer, M. B. (1979) In-group bias in the minimal inter-
group situation: a cognitive-motivational analysis. Psy-
chological Bulletin 86, 307.
[14] Fiske, S. T. (2002) What we know now about bias and
intergroup conflict, the problem of the century. Current
Directions in Psychological Science 11, 123.
[15] Wildschut, T, Pinter, B, Vevea, J. L, Insko, C. A, &
Schopler, J. (2003) Beyond the group mind: a quantita-
tive review of the interindividual-intergroup discontinuity
effect. Psychological Bulletin 129, 698.