This paper proposes a sequential pattern discovery method for Deoxyribonucleic Acid (DNA) sequence databases in order to identify cancer. The DNA of the P53 gene, which encodes the amino acids of the P53 protein, is mutated, and this changes the formation of P53. Sequential pattern discovery is the process of extracting knowledge from data about series of events that occur in sequence with a certain frequency, so that they form a pattern. PrefixSpan is the proposed method for finding patterns in the DNA sequence database. As a result, various DNA sequence patterns are selected. The patterns with high similarity are used as biomarkers to identify breast cancer. The average support value is 0.8, which means that the sequence patterns occur frequently. Another measure is confidence; all of the confidence values are 1. The last performance measure is the lift ratio, which averages more than 1, meaning that the sequence items composing a pattern have high dependency and relatedness. Furthermore, the selected patterns are applied as biomarkers with an accuracy of 100%.
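To make the three measures concrete, the following is a minimal sketch, not the authors' implementation. It assumes each database entry is a plain string of single-letter items, treats a pattern as an ordered subsequence with gaps allowed, and uses the usual rule-mining definitions of confidence and lift (the abstract does not state its exact formulas). The names `toy_db`, `support`, `confidence`, `lift`, and `prefixspan` are hypothetical, and the toy sequences are invented for illustration, not taken from the P53 data.

```python
from collections import Counter
from typing import Dict, List


def is_subsequence(pattern: str, sequence: str) -> bool:
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern: str, db: List[str]) -> float:
    """Fraction of sequences in `db` that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in db) / len(db)


def confidence(antecedent: str, consequent: str, db: List[str]) -> float:
    """Assumed definition: support(antecedent + consequent) / support(antecedent)."""
    base = support(antecedent, db)
    return support(antecedent + consequent, db) / base if base else 0.0


def lift(antecedent: str, consequent: str, db: List[str]) -> float:
    """Assumed definition: confidence / support(consequent); > 1 suggests dependency."""
    base = support(consequent, db)
    return confidence(antecedent, consequent, db) / base if base else 0.0


def prefixspan(db: List[str], min_support: float) -> Dict[str, float]:
    """Simplified PrefixSpan for sequences of single items.

    Grows each frequent prefix by one item at a time and recurses on the
    projected database (the suffix after the first occurrence of that item).
    """
    n = len(db)
    frequent: Dict[str, float] = {}

    def mine(prefix: str, projected: List[str]) -> None:
        # Count each item at most once per projected sequence.
        counts = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, count in sorted(counts.items()):
            if count / n < min_support:
                continue
            pattern = prefix + item
            frequent[pattern] = count / n
            next_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            mine(pattern, next_projected)

    mine("", list(db))
    return frequent


# Invented toy data, for illustration only.
toy_db = ["ATGC", "ATGG", "ATCC", "GTAC", "ATGA"]

if __name__ == "__main__":
    for pattern, sup in prefixspan(toy_db, min_support=0.6).items():
        print(pattern, round(sup, 2))
    print("confidence(AT -> G):", round(confidence("AT", "G", toy_db), 4))
    print("lift(AT -> G):", round(lift("AT", "G", toy_db), 4))
```

Under these assumed definitions, a confidence of 1 means that every sequence containing the antecedent also contains the full pattern, and a lift above 1 means the items occur together more often than independence would predict.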
CITATION STYLE
Muflikhah, L., & Yuliantoro, I. (2017). Identifying Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining. International Journal of Intelligence Science, 07(01), 9–23. https://doi.org/10.4236/ijis.2017.71002