As one of the largest families of angiosperms, the Orchidaceae display great biodiversity resulting from adaptation to diverse habitats. Despite the unique and interesting biological features of orchids, genomic information on them is limited, which impedes advanced molecular research. Here we report a strategy for integrating sequence outputs of the moth orchid, Phalaenopsis aphrodite, from two high-throughput sequencing platforms, Roche 454 and Illumina/Solexa, in order to maximize assembly efficiency. The cDNA libraries were prepared from a wide range of vegetative and reproductive tissues. We also designed an effective workflow for annotation and functional analysis. After assembly and trimming, 233,823 unique sequences were obtained. Among them, 42,590 contigs averaging 875 bp in length were annotated as protein-coding genes, of which 7,263 were found to be nearly full length. The sequence accuracy of the assembled contigs was validated to be as high as 99.9%. Genes with tissue-specific expression were categorized by expression profiling with RNA-Seq, and gene products targeted to specific subcellular locations were identified from their annotations. We conclude that, with proper assembly to combine the outputs of next-generation sequencing platforms, transcriptome information can be enriched for gene discovery, functional annotation and expression profiling in a non-model organism.