ORIGINAL PAPER

Web-based Multi-center Data Management System for Clinical Neuroscience Research

Alexander Pozamantir · Hedok Lee · Joab Chapman · Isak Prohovnik

Received: 29 May 2008 / Accepted: 2 September 2008 / Published online: 17 September 2008
© Springer Science + Business Media, LLC 2008

J Med Syst (2010) 34:25–33
DOI 10.1007/s10916-008-9212-2

A. Pozamantir · H. Lee · I. Prohovnik (*)
Department of Psychiatry, Mount Sinai School of Medicine, New York, NY, USA
e-mail: Isak.Prohovnik@mssm.edu

I. Prohovnik
Department of Radiology, Mount Sinai School of Medicine, New York, NY, USA

J. Chapman
Department of Neurology, Sheba Medical Center, Tel Hashomer, Israel

Abstract Modern clinical research often involves multi-center studies, large and heterogeneous data flux, and intensive demands of collaboration, security and quality assurance. In the absence of commercial or academic management systems, we designed an open-source system to meet these requirements. Based on the Apache-PHP-MySQL platform on a Linux server, the system allows multiple users to access the database from any location on the internet using a web browser, and requires no specialized computer skills. A multi-level security system is implemented to safeguard the protected health information and to allow partial or full access to the data by individual or class privilege. The system stores and manipulates various types of data, including images, scanned documents, laboratory data and clinical ratings. Built-in functionality allows for various search, quality control and analytic data operations, as well as visit scheduling and visit reminders. This approach offers a solution to a growing need for management of large multi-center clinical studies.

Keywords Management system · E-health · E-collaboration · MySQL · PHP · Database · WWW

Introduction

Modern medical research places severe demands on data management systems. While basic research often involves small, focused datasets, clinical studies often require volumes of data acquired from many subjects and sources. These large patient samples, often comprising hundreds of subjects, usually cannot be found in a single hospital, so multi-center, and often multi-country, studies are required.

Even when sampling requirements can be satisfied in a single hospital, rigorous studies often involve the participation of experts from other institutions, as well as auditing and quality assurance measures by external groups. Thus, almost inevitably, data are shared across institutional and national boundaries. This, in turn, creates extraordinary demands for control and coordination of procedures and data management. Further, the current legal atmosphere and the arduous struggle to satisfy ever-increasing expectations of data privacy, as well as the requirements of blind experimental designs, demand that access to the data be strictly limited to individuals with proper privilege.

In this article, we describe a web-based data management system developed for a longitudinal, multi-center clinical study of Creutzfeldt–Jakob Disease (CJD). CJD is the most notable of the human prion diseases, a group of severe and fatal neurodegenerative diseases. Prions are a new and poorly understood group of pathogens that present significant scientific and public health challenges. They are thought to consist of proteins, devoid of the usual DNA/RNA genetic apparatus, which gain their pathogenic potential, as well as infectious properties and resistance to standard sterilization, solely from conformational changes. In other words, prions are chemically identical to normal bodily proteins, but become abnormal, infectious and toxic by misfolding and achieving an aberrant three-dimensional
structure. Etiology can be infectious, hereditary, sporadic, or iatrogenic, and symptoms primarily include movement disorders and rapid cognitive decline [1]. The most publicized infectious form is vCJD, mostly encountered in the UK due to consumption of beef contaminated by Bovine Spongiform Encephalopathy, or "Mad-Cow disease". Sporadic CJD (sCJD) is the most common subtype of CJD (85–90%), with an incidence of about 1 per 1,000,000 per year, while hereditary CJD (fCJD) accounts for about 10% of cases worldwide and is caused by mutations of the gene encoding the normal form of the prion protein. The most common of these mutations occurs in codon 200 (E200K). There is currently no treatment for any of the prion diseases.

This study was designed to investigate a genetic form of the disease, and in particular to examine the transition from the preclinical to the clinical stage in mutation carriers. All of the participants of the study were recruited and examined in Israel, which has the world's largest cluster of families affected by the E200K mutation known to cause the disease. The data collected in Israel included medical history, biochemical and genetic tests, structured neurological examinations, neuropsychological tests, and numerous other types of data, as well as extensive MRI imaging data. Participants comprise three groups: healthy mutation carriers, healthy noncarriers, and symptomatic subjects, and all undergo longitudinal follow-up with periodic examinations. Healthy subjects are examined annually, unless a clinical change or our system indicates the need for more frequent exams, and the CJD patients are examined monthly. The data are inspected for quality assurance, organized and analyzed at the Mount Sinai Medical Center in New York, with the participation of several consultants and collaborators in other US hospitals. All of these investigators require access to the data, at varying degrees of authority and blindness.
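The examination schedule described above (annual for healthy subjects, monthly for CJD patients, more frequent after a clinical change) can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the system's PHP code. The group labels, the 365-day and 30-day intervals, the 14-day reminder lead time, and the rule that a flagged clinical change moves a subject onto the monthly schedule are all assumptions made for this example.

```python
from datetime import date, timedelta

# Hypothetical group labels and intervals; the paper states only
# "annually" for healthy subjects and "monthly" for CJD patients.
EXAM_INTERVAL_DAYS = {
    "healthy_carrier": 365,
    "healthy_noncarrier": 365,
    "symptomatic": 30,
}


def next_visit(group, last_visit, clinical_change=False):
    """Return the date of the next examination under the assumed intervals."""
    interval = EXAM_INTERVAL_DAYS[group]
    if clinical_change:
        # Assumption: a flagged clinical change shortens the interval
        # to the monthly schedule used for symptomatic subjects.
        interval = min(interval, EXAM_INTERVAL_DAYS["symptomatic"])
    return last_visit + timedelta(days=interval)


def visits_due(subjects, today, lead_days=14):
    """Return IDs of subjects whose next visit is overdue or within lead_days."""
    horizon = today + timedelta(days=lead_days)
    due = []
    for subject in subjects:
        when = next_visit(
            subject["group"],
            subject["last_visit"],
            subject.get("clinical_change", False),
        )
        if when <= horizon:
            due.append(subject["id"])
    return due
```

A reminder job could call `visits_due` once a day with the current date and notify the investigators responsible for the listed subjects.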
This article describes the structure, function and capabilities of the system developed to manage these data.

Methods and system description

The planning and specifications for the system were primarily performed by a small group consisting of the overall project principal investigator, the Israeli site principal investigator, the coordinating site system administrator and the system engineer. The initial planning and design phase took about 3 months, prototype implementation about 3 more months, and then (phase 2) another 6 months of testing, debugging and upgrading. During phase 2, the system was in rudimentary operation, unskilled users were entering data, and their feedback, questions and requests were the basis of the final implementation.

Specifications

Reflecting the nature of this research project, the following specifications were defined for the system:

Software and hardware requirements

1. Simple, inexpensive and non-specialized hardware and software platforms, since the system must be accessible from multiple locations and the budget was severely constrained by NIH grant funding.
2. Flexible and open code and structure, allowing portability, adaptive adjustments and expansion.
3. Due to network security concerns in Israel, conventional computer communication protocols such as telnet, ftp/sftp, or ssh were not permitted in either direction. This left only the World Wide Web (WWW) protocol as a viable solution.
4. Store and manipulate up to about 1 GB of data per record.

System interface requirements

1. Contain, organize and maintain several types of data, including imaging, genetics, clinical tests and ratings, and personal and demographic information.
2. Maintain several levels of confidentiality, security and privileges, due to the complex nature of our collaboration, the need to maintain blindness for several investigators, and the sensitivity of genetic information.
For example, some types of MRI data were acquired in Israel, anonymized at Mount Sinai, rated blindly at Yale University, and integrated with other data back at Mount Sinai.
3. Allow export of data for statistical analysis.
4. Simple and convenient data entry and system operation, accommodating users with minimal computing skills and basic equipment.

Data flow needs

1. Allow data entry, examination, and manipulation from several sites world-wide.
2. Minimize, and preferably eliminate altogether, the need for data migration. All intermediary data carriers, such as paper documents, computer files, CDs etc., should be eliminated in favor of online real-time information entry and storage.
3. Because this is a longitudinal study requiring multiple and variable repeat examinations, remind investigators of pending future visits and raise alarms when clinical status changes.
4. Maintain one central repository for all the research data incoming from various sites.
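The confidentiality requirement listed under the system interface requirements (several privilege levels, blindness for some investigators, access by individual or class privilege) can be sketched as a simple filter over data categories. The sketch below is in Python for illustration only and does not reproduce the system's actual PHP/MySQL implementation. The category names, the role names and the `ROLE_ACCESS` table are hypothetical and chosen solely to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical data categories, loosely following the data types
# named in the specifications.
IDENTIFIERS = "identifiers"
GENETICS = "genetics"
CLINICAL = "clinical"
IMAGING = "imaging"

# Hypothetical privilege classes; the paper does not list its actual roles.
ROLE_ACCESS = {
    "administrator": {IDENTIFIERS, GENETICS, CLINICAL, IMAGING},
    "site_clinician": {IDENTIFIERS, CLINICAL, IMAGING},
    "blind_rater": {IMAGING},  # sees anonymized images only
}


@dataclass(frozen=True)
class User:
    name: str
    role: str
    # Individual grants added on top of the class privilege.
    extra: frozenset = frozenset()


def allowed_categories(user):
    """Combine class privilege (role) with individual privilege (extra)."""
    return ROLE_ACCESS.get(user.role, set()) | user.extra


def visible_record(user, record):
    """Return only the parts of a record that the user may see."""
    permitted = allowed_categories(user)
    return {
        category: value
        for category, value in record.items()
        if category in permitted
    }
```

Under these assumptions, a blind rater who requests a subject record receives only its imaging entry, and an unknown role receives nothing. Filtering before any data leave the server is one way to keep a rater blind to genetic status.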