Blasted cell line names

Jing Wang; Lauren A. Byers; John S. Yordy; Wenbin Liu; Li Shen; Keith A. Baggerly; Uma Giri; Jeffrey N. Myers; K. Kian Ang; Michael D. Story; Luc Girard; John D. Minna; John V. Heymach; Kevin R. Coombes

Journal Article, Open Access

Cancer Informatics (2010) 9 251-255

DOI: 10.4137/CIN.S5613

Abstract

Background: While trying to integrate multiple data sets collected by different researchers, we noticed that the sample names were frequently entered inconsistently. Most of the variations appeared to involve punctuation, white space, or their absence, at the juncture between the alphabetic and numeric portions of the cell line name.

Results: Reasoning that the variant names could be described in terms of mutations or deletions of character strings, we implemented a simple version of the Needleman-Wunsch global sequence alignment algorithm and applied it to the cell line names. All correct matches were found by this procedure. Incorrect matches occurred only when a cell line was present in one data set but not in the other. The raw match scores tended to be substantially worse for the incorrect matches.

Conclusions: A simple application of the Needleman-Wunsch global sequence alignment algorithm provides a useful first pass at matching sample names from different data sets.

© The author(s), publisher and licensee Libertas Academica Ltd.
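The matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nw_score` and `best_match`, the scoring values (match +1, mismatch -1, gap -1), and the upper-casing of names before comparison are all assumptions chosen for illustration.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of strings a and b.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # aligning i characters of a with an empty prefix of b
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def best_match(name, candidates):
    """Return (candidate, score) for the highest-scoring candidate name.

    Upper-casing is a hypothetical normalization step; ties go to the
    earliest candidate in the list.
    """
    scored = [(c, nw_score(name.upper(), c.upper())) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(nw_score("H1299", "H-1299"))  # 5 matches and 1 gap -> 4
    print(best_match("NCI-H1299", ["H1299", "NCI H1299", "H1975"]))
```

Because the abstract reports that incorrect matches tended to have substantially worse raw scores, a caller might treat a low best score as a sign that the cell line is absent from the other data set and review it by hand; any such cutoff would need to be chosen empirically.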

Citation (APA)

Wang, J., Byers, L. A., Yordy, J. S., Liu, W., Shen, L., Baggerly, K. A., … Coombes, K. R. (2010). Blasted cell line names. Cancer Informatics, 9, 251–255. https://doi.org/10.4137/CIN.S5613
