Timing Analysis of Keystrokes and Timing Attacks on SSH
Dawn Xiaodong Song David Wagner Xuqing Tian
University of California, Berkeley
Abstract
SSH is designed to provide a secure channel between two hosts. Despite the encryption and authentication mechanisms it uses, SSH has two weaknesses. First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use), which reveals the approximate size of the original data. Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed, which leaks the inter-keystroke timing information of users’ typing. In this paper, we show how these seemingly minor weaknesses result in serious security risks.
First, we show that even very simple statistical techniques suffice to reveal sensitive information such as the length of users’ passwords or even root passwords. More importantly, we further show that by using more advanced statistical techniques on timing information collected from the network, the eavesdropper can learn significant information about what users type in SSH sessions. In particular, we perform a statistical study of users’ typing patterns and show that these patterns reveal information about the keys typed. By developing a Hidden Markov Model and our key sequence prediction algorithm, we can predict key sequences from the inter-keystroke timings. We further develop an attacker system, Herbivore, which tries to learn users’ passwords by monitoring SSH sessions. By collecting timing information on the network, Herbivore can speed up exhaustive search for passwords by a factor of 50. We also propose some countermeasures.
In general, our results apply not only to SSH, but also to a general class of protocols for encrypting interactive traffic. We show that timing leaks open a new set of security risks, and hence caution must be taken when designing this type of protocol.
This research was supported in part by the Defense Advanced Research Projects Agency under DARPA contract N6601-99-28913 (under supervision of the Space and Naval Warfare Systems Center San Diego) and by the National Science Foundation under grants FD99-79852 and CCR-0093337.
1 Introduction
Just a few years ago, people commonly used astonishingly insecure networking applications such as telnet, rlogin, or ftp, which simply pass all confidential information, including users’ passwords, in the clear over the network. This situation was aggravated by the broadcast-based networks in common use (e.g., Ethernet), which allowed a malicious user to eavesdrop on the network and to collect all communicated information [CB94, GS96].
Fortunately, many users and system administrators have become aware of this issue and have taken countermeasures. To curb eavesdroppers, security researchers designed the Secure Shell (SSH), which offers an encrypted channel between the two hosts and strong authentication of both the remote host and the user [Ylö96, SSL01, YKS+00b]. Today, SSH is quite popular, and it has largely replaced telnet and rlogin.
Many users believe that they are secure against eavesdroppers if they use SSH. Unfortunately, in this paper we show that despite state-of-the-art encryption techniques and advanced password authentication protocols [YKS+00a], SSH connections can still leak significant information about sensitive data such as users’ passwords. This problem is particularly serious because it means users may have a false sense of security when they use SSH.
In particular, we identify two seemingly minor weaknesses of SSH that lead to serious security risks. First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use). Therefore an eavesdropper can easily learn the approximate length of the original data. Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed (except for some meta keys such as Shift or Ctrl). We show in the paper that this property can enable the eavesdropper to learn the exact length of users’ passwords. More importantly, as we have verified, the time it takes the operating system to send out the packet after the key press is in general negligible compared to the inter-keystroke timing. Hence an eavesdropper can learn the precise inter-keystroke timings of users’ typing from the arrival times of packets.
Experience shows that users’ typing follows stable patterns¹. Many researchers have proposed using the duration of keystrokes and the latencies between keystrokes as a biometric for user authentication [GLPS80, UW85, LW88, LWU89, JG90, BSH90, MR97, RLCM98, MRW99]. A more challenging question, which has not yet been addressed in the literature, is whether we can use timing information about keystrokes to infer the key sequences being typed. If we can, can we estimate quantitatively how many bits of information are revealed by the timing information? Experience seems to indicate that the timing information of keystrokes reveals some information about the key sequences being typed. For example, we may all have experienced that the elapsed time between typing the two letters “er” can be much smaller than between typing “qz”. This observation is particularly relevant to security. As we show, the attacker can get precise inter-keystroke timings of users’ typing in an SSH session by recording the packet arrival times. Hence, if the attacker can infer what users type from the inter-keystroke timings, then he can learn what users type in an SSH session from the packet arrival times.
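The step from packet arrival times to inter-keystroke timings can be illustrated with a minimal sketch. It assumes the eavesdropper has already isolated the client-to-server packets that each carry one keystroke and has recorded their arrival times in seconds; the function name and the sample timestamps are hypothetical and are not taken from the paper.

```python
def inter_keystroke_latencies(arrival_times):
    """Return the gaps, in milliseconds, between consecutive packet arrivals.

    arrival_times: arrival timestamps in seconds, one per keystroke packet,
    in the order the packets were observed (an assumption of this sketch).
    """
    return [(later - earlier) * 1000.0
            for earlier, later in zip(arrival_times, arrival_times[1:])]


# Hypothetical capture: four keystroke packets of one typed word.
timestamps = [10.000, 10.120, 10.310, 10.395]
latencies = inter_keystroke_latencies(timestamps)

# One packet per keystroke also means the packet count gives the number of keys typed.
keys_typed = len(timestamps)
```

A sequence of k keystroke packets yields k - 1 latencies, one per keystroke pair, which is the unit the rest of the analysis works with.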
In this paper we study users’ keyboard dynamics and show that the timing information of keystrokes does leak information about the key sequences typed. Through more detailed analysis we show that the timing information leaks about 1 bit of information about the content per keystroke pair. Because the entropy of passwords is only 4–8 bits per character, this 1 bit per keystroke pair can reveal significant information about the content typed. In order to use inter-keystroke timings to infer keystroke sequences, we build a Hidden Markov Model and develop an n-Viterbi algorithm for the keystroke sequence inference. To evaluate the effectiveness of the attack, we further build an attacker system, Herbivore, which monitors the network and collects timing information about keystrokes of users’ passwords. Herbivore then uses our key sequence prediction algorithm for password prediction. Our experiments show that, for passwords that are chosen uniformly at random with a length of 7 to 8 characters, Herbivore can reduce the cost of password cracking by a factor of 50 and hence speed up exhaustive search dramatically. We also propose some countermeasures to mitigate the problem.
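To make the idea of ranking candidate key sequences from latencies concrete, here is a minimal sketch of an n-best Viterbi search. It is not the authors' implementation: it assumes that each latency is drawn from a Gaussian whose mean and standard deviation depend only on the character pair typed, that all characters are equally likely a priori, and it uses made-up statistics for a two-letter alphabet. The names `log_gauss`, `n_best_sequences`, and `pair_stats` are hypothetical.

```python
import heapq
import math


def log_gauss(x, mean, std):
    """Log density of a Gaussian with the given mean and standard deviation at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def n_best_sequences(latencies, alphabet, pair_stats, n):
    """Return the n most likely character sequences as (log_score, sequence) pairs.

    latencies:  observed inter-keystroke latencies (ms), one per keystroke pair.
    pair_stats: maps (previous_char, current_char) to an assumed (mean, std).
    For every possible last character we keep the n best partial sequences;
    this is sufficient because each latency depends only on the adjacent pair.
    """
    beams = {c: [(0.0, c)] for c in alphabet}  # uniform prior over first character
    for y in latencies:
        new_beams = {}
        for cur in alphabet:
            candidates = []
            for prev in alphabet:
                mean, std = pair_stats[(prev, cur)]
                step = log_gauss(y, mean, std)
                candidates.extend((score + step, seq + cur)
                                  for score, seq in beams[prev])
            new_beams[cur] = heapq.nlargest(n, candidates)
        beams = new_beams
    return heapq.nlargest(n, [c for cands in beams.values() for c in cands])


# Made-up latency statistics (mean ms, std ms) for a toy alphabet.
pair_stats = {
    ("a", "a"): (250.0, 20.0), ("a", "b"): (100.0, 20.0),
    ("b", "a"): (150.0, 20.0), ("b", "b"): (300.0, 20.0),
}
ranked = n_best_sequences([100.0, 150.0], "ab", pair_stats, n=2)
```

An attacker would then try the candidates in ranked order, so the true sequence only needs to appear near the top of the list for the search to be shortened.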
We emphasize that the attacks described in this paper are a general issue for any protocol that encrypts interactive traffic. For concreteness, we study primarily SSH, but these issues affect not only SSH 1 and SSH 2, but also any other protocol for encrypting typed data.
¹ In this paper we only consider users who are familiar with keyboard typing and use touch typing.
The outline of this paper is as follows. In Section 2 we discuss in more detail the vulnerabilities of SSH and various simple techniques an attacker can use to learn sensitive information such as the length of users’ passwords and the inter-keystroke timings of the passwords typed. In Section 3 we present our statistical study of users’ typing patterns and show that inter-keystroke timings reveal about 1 bit of information per keystroke pair. In Section 4 we describe how we can infer key sequences using a Hidden Markov Model and an n-Viterbi algorithm. In Section 5 we describe the design, development and evaluation of an attacker system, Herbivore, which learns users’ passwords by monitoring SSH sessions. We propose countermeasures to prevent these attacks in Section 7, and conclude in Section 8.
2 Eavesdropping SSH
The Secure Shell (SSH) [SSL01, YKS+00b] is used to encrypt the communication link between a local host and a remote machine. Despite the use of strong cryptographic algorithms, SSH still leaks information in two ways:
First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use), which leaks the approximate size of the original data.
Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed (except for some meta keys such as Shift or Ctrl). Because the time it takes the operating system to send out the packet after the key press is in general negligible compared to the inter-keystroke timing (as we have verified), this also enables an eavesdropper to learn the precise inter-keystroke timings of users’ typing from the arrival times of packets.
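The effect of padding to an eight-byte boundary can be sketched as follows. The sketch assumes an SSH 1 style password packet consisting of a 1-byte packet type, a 4-byte string length prefix, the password itself, and a 4-byte CRC, padded with 1 to 8 bytes up to a multiple of eight. These field sizes are an assumption of the sketch and are not given in the text above; the function names are hypothetical.

```python
BLOCK = 8  # padding boundary named in the text


def ssh1_encrypted_size(password_len):
    """Size in bytes of the encrypted part of a password packet.

    Assumed layout: type (1) + string length prefix (4) + password + CRC (4),
    preceded by 1..8 bytes of padding so the total is a multiple of BLOCK.
    """
    length = 1 + 4 + password_len + 4
    padding = BLOCK - (length % BLOCK)  # always between 1 and 8 bytes
    return length + padding


def indistinguishable_lengths(observed_size, max_len=32):
    """All password lengths up to max_len that produce the observed size."""
    return [n for n in range(max_len + 1)
            if ssh1_encrypted_size(n) == observed_size]


short_bucket = indistinguishable_lengths(16)  # lengths 0 through 6
next_bucket = indistinguishable_lengths(24)   # lengths 7 through 14
```

Under these assumptions an observer who sees a 16-byte encrypted payload knows the password has at most 6 characters, while a larger payload means 7 or more, so the packet size alone narrows the set of candidate lengths.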
The first weakness poses some obvious security risks. For example, when one logs into a remote site R via SSH, all the characters of the initial login password are batched up, padded to an eight-byte boundary if a block cipher is in use, encrypted, and transmitted to R. Due to the way padding is done, an eavesdropper can learn one bit of information about the initial login password, namely, whether it is at least 7 characters long or not. The second weakness can lead to some potential anonymity risks since, as many researchers have found previously, inter-keystroke timings can reveal the iden-


