Standards in Computational Systems Biology

Edda Klipp¹, Wolfram Liebermeister¹, Anselm Helbig¹, Axel Kowald¹,², Jörg Schaber¹

¹Max Planck Institute for Molecular Genetics, Berlin, Germany
²Ruhr University Bochum, Bochum, Germany

Abstract

Systems biology aims at modeling and quantitative simulation of complex biological systems. This endeavor demands close collaboration and communication between modelers and experimenters, which can be facilitated by standards concerning workflows and data formats. We have conducted a survey to find out what the community thinks about the requirements and benefits of standardization efforts; we also evaluated which modeling methods and software tools are being used in the systems biology community. Free availability and flexibility are the top criteria that govern the choice of software tools. SBML is widely recognized as a standard format for systems biology models. Eighty percent of the 125 respondents favor standardization, but there is also consensus that standards should not be enforced at all costs. The most significant demands are standardized formats for experimental data and mathematical models, standardized names for metabolites, reactions and enzymes, and standardized graphical representation of networks.

1 Introduction

1.1 Standards in Systems Biology

Standards are currently the subject of intense debate in the systems biology community. A series of articles has discussed various aspects of standardization, such as minimal requirements in the annotation of biochemical models (MIRIAM [1]), the compatibility of tools for kinetic modeling [2], or the emerging standards in high-throughput technologies and related ways to establish other standards in biology [3]. Standards are agreements between people that enhance information exchange and mutual understanding.
They may concern state-of-the-art methods and workflows for experiments and modeling, data formats for experimental results and mathematical models, or agreed nomenclature and graphical representation for biochemical systems. Standards can arise as informal agreements between researchers, they can result from the use of software tools, or they may be enforced by journals and funding organizations. Standardization also plays a major role at the level of research politics: almost all systems biology projects that are currently funded by the European Commission promise to develop or to define some standards. The current enthusiasm for standards stems from two facts: the field is quite new and needs to be structured, and it involves expertise from diverse scientific backgrounds. On the one hand, the research object is life itself with all its complexity and diversity, so the definition of standards is not straightforward; on the other hand, standardization efforts in other fields have shown that standards can greatly help to avoid misunderstanding and duplication of work.
To start this process in systems biology, we should know which de facto standards are already established in the community. We should also know whether the existing tools and methods fulfill the researchers' needs and whether scientists would appreciate the development or enforcement of further standards for modeling, data exchange, and model publication.

1.2 The Questionnaire

To answer these questions, we developed a questionnaire [4] that asked interested colleagues about their modeling habits, the software tools they use for modeling, and their opinions about different aspects of standardization. It was originally addressed to the members of the EU-funded Yeast Systems Biology Network (ysbn.org), but because the feedback indicated that these issues are of more general interest, we distributed it as widely as possible, using a variety of formal and informal means. The questionnaire consisted of multiple choice questions and fields for free-text comments. Details about the questionnaire and the statistical evaluation are given in the methods section. In total, 125 people completed the questionnaire by the deadline (August 29, 2006). The respondents cover all areas of systems biology and describe themselves as modelers (75%), experimentalists (4%), or both (21%). Their research areas include
- modeling of individual pathways such as glycolysis or specific signaling pathways,
- investigation of complex processes such as aging, cell cycle regulation, cancer, robustness, and disease dynamics,
- development and application of methods such as metabolic engineering, statistics, model identification, or machine learning, and
- development of software tools for modeling and simulation.

The studied model organisms range from various prokaryotes, yeast, and worms to mammals such as mice and humans.
Judging by the endings of their email addresses (at, au, be, ch, cn, de, es, fi, fr, gr, hu, in, it, jp, nl, pl, pt, ro, ru, uk, se, su, tr, za as well as com, edu, gov, net), the respondents come from many countries and all continents.

The responses indicate to what extent standardization in systems biology, and especially in modeling, is desired by the researchers, which standards are already common, and for which aspects standards are requested. In this article, we summarize the comments about general aspects of standardization, i.e. (i) whether standards are considered necessary at all, (ii) which standards have already emerged, and (iii) where further standardization is requested (section 2.1). We also give an overview of the modeling approaches used to study different biological problems (section 2.2) and of the usage of software tools (section 2.3). All percentages refer to a total of 125 responses to the questionnaire.

Different lessons can be drawn from the results of our survey: (1) experimentalists who are diving into theory will learn about modeling and analysis methods that are suitable for
particular biological problems, (2) modelers are provided with a list of exchange formats and of the advantages and disadvantages of the various tools, and (3) tool developers can learn about the exchange formats and functionality requested by the users.

2 Results

2.1 Opinions about Standards

In the following, we summarize the opinions and statements of the respondents; we assume that they express a consensus shared by the majority of modelers in systems biology.

Figure 1: Opinions about standards. Percentages of researchers (from a total number of 125) who marked the respective response in the online questionnaire.

2.1.1 Do we need Standards at all?

About eighty percent of the respondents stated that they consider the creation of standards necessary or desirable (see Figure 1). Although about 20% disagreed, this is a clear vote for standards in general. Most arguments for standardization were connected to the problems that appear if standards are missing. It was mentioned that many weak models are published. Respondents observed that it is often difficult to reproduce and to check the simulation results from computational models: important modeling details are hidden in the paper or supplement or are not mentioned at all. Standards are expected to improve model reuse, expandability, and integration. Respondents expected that the adoption of standards enables modelers to reproduce each other's results, to collaborate more productively, and