Head in the clouds: Re-imagining the experimental laboratory record for the web-based networked world.
- PubMed: 20098590
Abstract
The means we use to record the process of carrying out research remains tied to the concept of a paginated paper notebook despite the advances over the past decade in web-based communication and publication tools. The development of these tools offers an opportunity to re-imagine what the laboratory record would look like if it were re-built in a web-native form. In this paper I describe a distributed approach to the laboratory record which uses the most appropriate tool available to house and publish each specific object created during the research process, whether it be a physical sample, a digital data object, or the record of how one was created from another. I propose that the web-native laboratory record would act as a feed of relationships between these items. This approach can be seen as complementary to, rather than competitive with, integrative approaches that aim to aggregate relevant objects together to describe knowledge. The potential for the recent announcement of the Google Wave protocol to have a significant impact on realizing this vision is discussed along with the issues of security and provenance that are raised by such an approach.
Automated Experimentation (BioMed Central)
Open Access Commentary
Cameron Neylon
Address: STFC Rutherford Appleton Laboratory, Harwell Science and Innovation Campus, Didcot, UK
Email: Cameron Neylon - cameron.neylon@stfc.ac.uk
Introduction
Automated experimentation brings the promise of a much
improved record of the research process. Where experi-
ments are sufficiently well defined that they can be carried
out by automated instrumentation or computational
resources it is to be expected that an excellent record of
process can and will be created. In "Big Science" projects
from particle physics [1] to genome sequencing [2] the
sharing of records about samples and objects, experimen-
tal conditions and outputs, and the processing of data is a
central part of planning and infrastructure, and often a
central part of justifying the investment of resources. As
some segments of biological science have become industrialized,
with greater emphasis on high throughput analysis
and the generation of large quantities of data, sophisticated
systems have been developed to track the experimental
process and to describe and codify the results of
experiments through controlled vocabularies, minimal
description standards [3], and ontologies [4].
Published: 29 October 2009
Automated Experimentation 2009, 1:3 doi:10.1186/1759-4499-1-3
Received: 11 June 2009
Accepted: 29 October 2009
This article is available from: http://www.aejournal.net/content/1/1/3
© 2009 Neylon; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
None of this has had a major impact on the recording
process applied to the vast majority of research experiments,
which are still carried out by single people or small
teams in relative isolation from other research groups. The
vast majority of academic research is still recorded in
paper notebooks and even in industry the adoption of
electronic recording systems is relatively recent and
remains patchy. A paper notebook remains a means of
planning and recording experiments that is flexible,
comfortable to use, and has a long history of successful
use. However, it is starting to fail as an effective means of
recording, collating, and sharing data due to the increasing
volume and changing nature of the data that researchers
are generating. The majority of data generated today is
born digital. The proportion of global data generated in
2002 that was recorded on hard disks was estimated at
over 90% of a total of around five exabytes with print
accounting for less than 0.05% of the total [5]. In the case
of small laboratory data some printouts make it into
bound notebooks. In most cases, however, data remains
distributed on a collection of laboratory and personal
hard disks. The record of data analysis, the conversion of
that digital data into new digital objects and finally into
scientific conclusions is, in most cases, poorly recorded.
The question of reproducibility lies at the heart of scien-
tific method and there are serious concerns that much cur-
rently published science is of limited value due to poor
record keeping. Data sharing mandates from research
funders are driven, at least in part, by a concern about
reproducibility. Opposition to those mandates is driven
to a significant extent by concerns about the value of sharing
data that cannot be placed in context due to inadequate
recording of its production. With digital instrumentation,
more complex experiments, and increasing data volumes,
a paper-based record is no longer capable of providing
the necessary context.
It is noteworthy in this context that a number of groups
have felt it necessary to take an active advocacy position in
trying to persuade the wider community that the reproducibility
of data analysis is a requirement, and not an
added bonus [6,7]. The promise of digital recording of the
research process is that it can create a reliable record that
would support automated reproduction and critical anal-
ysis of research results. The challenge is that the tools for
generating these digital records must outperform a paper
notebook while simultaneously providing enough
advanced and novel functionality to convince users of the
value of switching.
At the same time, the current low level of adoption means
that the field is wide open for a radical re-imagining of
how the record of research can be created and used. It lets
us think deeply about what value the different elements of
that record have for use and re-use and to take inspiration
from the wide variety of web-based data and object man-
agement tools that have been developed for the mass con-
sumer market. This paper will describe a new way of
thinking about the research record that is rooted in the
way that the World Wide Web works and consider the
design patterns that will most effectively utilize existing
and future infrastructure to provide a useful and effective
record.
The distinction between capturing process and describing
an experiment
In discussing tools and services for recording the process
of research there is a crucial distinction to be made
between capturing a record of the process as it happens and
describing an experiment after the event. There is an important
distinction between data, the raw material produced by an
experiment, including the record of that experiment;
information, which places that data in a context that
allows inferences and deductions to be made; and the
knowledge that those inferences create. A large part of the
tension between researchers who develop systems for describing knowledge in
structured form and research scientists who need a record
of the processes carried out in the laboratory derives from
a misunderstanding about whether data, information, or
knowledge is being recorded. The best way to maximise
success in recording the important details of a research
process is to capture the data and record as they are gener-
ated, or in the case of plans, before they are generated.
However, most controlled vocabularies and description
systems are built, whether explicitly or implicitly, with the
intention of describing the knowledge that is inferred
from a set of experiments, after the results have been considered.
This is seen most clearly in ontologies that place
a hypothesis at the core of the descriptive structure or
assume that the "experiment" is a clearly defined entity
before it has been carried out.
These approaches work well for the highly controlled,
indeed, industrialised studies that they were generally
designed around. However they tend to fail when applied
to small scale and individual research, and particularly in
the situations where someone is "trying something out".
Most of the efforts to provide structured descriptions of
the research process start with the concept of an "experiment"
that is designed to test a "hypothesis" (see e.g.
[8,9]). However, in the laboratory the concept of "the
hypothesis" very often doesn't usefully apply to the detail
of the experimental steps that need to be recorded. And
the details of where a specific experiment starts and fin-
ishes are often dependent on the viewer, the state of the
research, or the choices made in how to publish and
present that research after the fact. Products or processes
may be part of multiple projects or may be later used in
multiple projects. A story will be constructed later, out of
these elements, to write a paper or submit a database entry
but at the time the elements of this story are captured the
framework may be vague or non-existent. Unexpected
results clearly do not fit into an existing framework but
can be the launching point for a whole new programme.
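The idea of capturing elements first and constructing the story later can be illustrated with a minimal sketch. It is not a system described in this paper: the `Item` and `Record` classes, their fields, and the project names are all hypothetical, chosen only to show items being logged as they are generated and assigned to one or more projects after the fact.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Item:
    """One captured object: a sample, a data file, or a procedure record (hypothetical model)."""
    item_id: str
    kind: str                                           # e.g. "sample", "data", "procedure"
    note: str
    captured_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    derived_from: list = field(default_factory=list)    # ids of the items this one came from
    projects: set = field(default_factory=set)          # filled in later, may hold several


class Record:
    """Log of items in capture order; grouping into projects happens afterwards."""

    def __init__(self):
        self.items = {}                                  # dicts preserve insertion order

    def capture(self, item_id, kind, note, derived_from=()):
        # Capture needs no hypothesis and no experiment boundary, only the item itself.
        item = Item(item_id, kind, note, derived_from=list(derived_from))
        self.items[item_id] = item
        return item

    def tag(self, project, item_ids):
        # The same item can be claimed by any number of projects.
        for item_id in item_ids:
            self.items[item_id].projects.add(project)

    def story(self, project):
        """Return the items of one project, ordered by capture time."""
        chosen = [i for i in self.items.values() if project in i.projects]
        return sorted(chosen, key=lambda i: i.captured_at)


record = Record()
record.capture("s1", "sample", "protein prep, batch A")
record.capture("d1", "data", "gel image", derived_from=["s1"])
record.capture("d2", "data", "unexpected extra band, re-imaged", derived_from=["s1"])

# Only later is a framework imposed: one sample ends up in two different stories.
record.tag("purification-paper", ["s1", "d1"])
record.tag("new-contaminant-study", ["s1", "d2"])
```

In this sketch the unexpected result `d2` is captured in exactly the same way as the expected one, and the sample `s1` belongs to both projects without being duplicated.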
The challenge therefore is to capture the elements of the
research process in such a way that the sophisticated and
powerful tools developed for structured description of
knowledge can be readily applied once the story starts to
take form. That is, the data and record should be captured



