How to build a research system in your spare time
ACM SIGCOMM Computer Communication Review (2010)
- ISSN: 01464833
- DOI: 10.1145/1764873.1764884
How to Build a Research System in Your Spare Time
Ratul Mahajan
Microsoft Research
This article is an editorial note submitted to CCR. It has NOT been peer reviewed.
The author takes full responsibility for this article’s content. Comments can be posted through CCR Online.
Abstract– This paper is based on a talk that I gave at CoNEXT
2009. Inspired by Hal Varian’s paper on building economic mod-
els,1 it describes a research method for building computer systems.
I find this method useful in my work and hope that some readers
will find it helpful as well.
Categories and Subject Descriptors
C.2.m [Miscellaneous]
General Terms
Design, Experimentation
Keywords
Research method, computer systems
1. INTRODUCTION
Last year, I had the (mis)fortune of serving on several conference
and workshop program committees, and I noticed that a common
set of complaints keeps cropping up for papers that describe research
systems, such as:
• Do we need another paper on ....?
• Is this problem important?
• Does this solution work?
• What is new here?
• Why not solve the problem this other (simpler) way?
The authors of many of these papers had clearly put in a lot of
work, and the papers usually contained some good ideas. As an
aside, my papers too are not immune to these criticisms.
These complaints made me wonder if their root cause is poor
communication or poor research process. That is, is it the case that
we as authors are not communicating our results well or do the
complaints represent a flaw in the research process itself?
This question is hard to answer in general. As authors, we often
feel that concerns like the above can be addressed by better writing
because “the reviewers did not get it.” While good writing should
always be a goal, ultimately writing only reflects thinking and ex-
periences that are the results of the research process. Thus, the
research process appears to be at least a major contributory factor.
So then, what should an ideal research process be and can it
be articulated? A research process is not merely a set of dos and
don’ts. It is a more systematic, step-by-step description of the re-
search activity. The sequence is important because the initial steps
can save you from making poor choices and pursuing less promis-
ing avenues, which you may be forced to paper over later with the
help of writing.
1 “How to Build an Economic Model in Your Spare Time,” Passion
and Craft: Economists at Work, Univ. of Michigan Press, 1997.
Naively, I decided to present in this paper a method for building
research systems. This method is based on what I have learned
from my collaborators and colleagues as well as my past mistakes.
I focus on building systems because much of my work falls in that
category, but I expect that some of what follows applies to other
types of networking research as well.
Note that adhering to a research method does not mean that there
is no role for intuition, creativity, and hard work. On the contrary,
these factors are absolutely necessary for success. The role of the
process is to help focus, avoid common mistakes, and proceed with
due haste. Think of it as best practices.
I do not intend to build consensus on the “right” way to do net-
work systems research. My hope is simply that some readers will
find the method below useful in their work, as I do in mine. There
are undoubtedly other productive methods. I would love to hear
from other researchers about important aspects that I have missed
and aspects with which they disagree. Better yet, I invite them to
articulate their method, in an editorial such as this one or elsewhere.
2. BUILDING A RESEARCH SYSTEM
In my work, I have found the following method to work well:
1. Pick the domain carefully. The first step is to pick a domain
or an area that you want to investigate. You may not have to go
through this step if you already have one in mind. But I tend to
switch domains on a regular basis because I run out of ideas. Thus,
for me, this step is a conscious exercise.
2. Know the problem well before you start building. By the
time you pick a domain, you should have an inkling of what you
want to do. But before rushing to design and build the system, iden-
tify a technical problem, solving which would represent a signifi-
cant improvement in the state of the world. For instance, it would
reduce cost, improve performance, or enable new functionality. If
you already know the problem, the purpose of this step is to con-
firm that the problem is real and important. Without this step, you
run the risk of solving a non-problem.
3. Debate several solution ideas and have a core idea behind
what you build. Even if you know the problem that you want to
solve, it is still not time to start building the system. First consider
ACM SIGCOMM Computer Communication Review 60 Volume 40, Number 2, April 2010
and debate several solution ideas to gain clarity on the design space
and identify the core idea that will underlie your system. It is im-
portant to articulate this idea clearly and concisely because research
systems are intended to validate a hypothesis such as “this idea can
solve this problem.” If you build a system without articulating the
central idea, it has little lasting research value.
4. When building, start small and then embellish. It is now
time to start building the system. It helps to start small by imple-
menting your core idea and adding complexity only as needed. This
incremental approach ensures that no unnecessary complexity is
added and that the system is easier to adapt if needed. When evaluating
the system, in addition to showing that it works, show why it works and
justify its complexities and simplifying assumptions.
5. Make it real. Finally, make your research go beyond the paper.
This step will often require you to step outside systems research or
even research, but it is critical for your work to have an impact.
By necessity, the steps above are a simplification of reality, but
they do capture the essence. For maximum effectiveness, the steps
must be executed in the order specified, though there can be some
overlap between them and some back-and-forth. Skipping a step
altogether reduces research quality. The steps that authors tend to
forget are 2, 3, and 5.
The later steps usually take more time, but the earlier steps are
equally – if not more – important because the success of the later
ones hinges on them. Steps 2 and 3 are not necessarily pure thought
experiments, and they often involve an experimental component.
The rest of this paper describes these steps in detail. I will draw
on my experiences and use anecdotes from my work, not because
it is exemplary, but because I was there when it was done.
3. PICKING A DOMAIN
When picking a domain, the most important criterion is that you
find it fascinating. If inter-domain routing is not your cup of tea,
do not pick that research area (even if you are a graduate student
whose adviser’s tenure depends on it).
But the question is: what beyond that criterion? In trying to find
a domain, I am wary of hot trends. Currently, these trends appear
to be social networks and data centers. It is not that hot areas are
unimportant. But if enough smart people are already working in an
area, unless you have a novel insight or perspective before diving
in, your time and energy are probably best spent elsewhere. If you
are a graduate student, another downside to picking a hot area is
that you will graduate with several others who have worked in the
same area, which will make it harder for you to stand out. This
happened to me, and in retrospect, my choice of thesis topic was
not that inspired.
A more promising strategy is to observe the world for big changes.
The fault lines created by such changes represent promising av-
enues for research on accommodating those changes. These changes
could be in workloads, technology trends, or even new concerns
such as energy consumption. Some very successful research
projects have been driven by such observations. For example, two
of the three award papers at the SOSP 2009 conference fall in this
category. As another example, my recent work on vehicular net-
works is driven by the observation of increasing demand for con-
nectivity from moving vehicles. I wanted to understand how to
enable that connectivity in a cheap and reliable manner.
Another kind of change to look for is adoption or availability of
new technologies. For instance, in the wireless domain, the avail-
ability of cheap software-defined radios led to significant, exciting
research. A similar phenomenon is occurring today with MIMO
(e.g., 802.11n) and programmable directional antennae.
Yet another kind of change, which researchers tend to overlook,
is change in government regulation. Current examples of expected
regulatory changes include net neutrality, privacy, and white spaces.
A second strategy for picking domains is to prefer those that are
underexplored. A domain can be underexplored because it is not on
other researchers’ radars, because folks have not figured out how
to do systems research in it, or because people presume that the
problems in that domain have been solved by solutions in related
domains. It is worthwhile to question such premises.
One of my recent projects focused on diagnosing faults in small
enterprise networks. My collaborators and I observed that these
networks had not been studied before, likely because of the pre-
sumption that solutions designed for large enterprises also work for
small enterprises. On a closer look, we found that presumption to
be false. Studying small enterprise networks gave us a very differ-
ent perspective on the design of diagnostic systems, a perspective
that we found later was useful for large enterprises as well.
A third strategy for picking domains is to look for unique oppor-
tunities. This opportunity could be data that yields novel insights
into the workings of a real system. Alternatively, if you encounter a
new technique or tool, ask what else it might be useful for. This
question also applies to tools that you develop because they could
be useful in a different context. The starting point for Rocketfuel,
an ISP topology discovery system that I built with Neil Spring, was
a tool that I had built to understand routing misconfiguration.
4. IDENTIFYING THE PROBLEM
Picking a domain does not mean that you have also identified
a technical problem to solve. But you should have at least some
notion of the problem. If all you know is that you want to work on
data centers, you did not do a good job of picking a domain. Go
back to the previous step.
Before solving what you think is a potential problem, frame it
more concretely. Problem framing involves identifying the exact
weakness in the status quo that you want to address and estimating
the benefits of addressing that weakness. If you want to work on
scaling data centers, you must first understand the scalability bot-
tlenecks, the benefits of scaling, and any characteristics that you
can leverage to scale.
Framing the problem is an exercise that you must do for yourself.
Papers by other researchers are not a good source. If a paper de-
scribes a problem in detail but does not solve it, chances are that the
problem is either unimportant or very hard. If it is very hard, you need
insights and perspectives that cannot be obtained only by reading
that paper.
Instead, you must “scrutinize” carefully using measurements,
data analysis, surveys, etc. Your goal is to establish a concrete
understanding of the real issues. Use your imagination to guide
what to scrutinize and how. Never let imagination alone or hearsay
frame the problem for you. I take this process seriously because I
have been surprised many times. Issues that I think are problems
turn out to be non-problems after careful scrutiny.
ACM SIGCOMM Computer Communication Review 61 Volume 40, Number 2, April 2010


