Background: For now, sequencing and analysis of ESTs is the only practical approach to large-scale gene discovery and annotation in conifers, because their very large genomes are unlikely to be sequenced in the near future. Our objective was to produce extensive collections of ESTs and cDNA clones to support the manufacture of cDNA microarrays and gene discovery in white spruce (Picea glauca [Moench] Voss).

Results: We produced 16 cDNA libraries from different tissues and a variety of treatments, and partially sequenced 50,000 cDNA clones. High-quality 3′ and 5′ reads were assembled into 16,578 consensus sequences, 45% of which represented full-length inserts. Consensus sequences derived from 5′ and 3′ reads of the same cDNA clone were linked to define 14,471 transcripts. A large proportion (84%) of the spruce sequences matched a pine sequence, but only 68% of the spruce transcripts had homologs in Arabidopsis or rice. Nearly all the sequences that matched the Populus trichocarpa genome (the only sequenced tree genome) also matched the rice or Arabidopsis genomes. We used several sequence similarity search approaches to assign putative functions, including BLAST searches against general and specialized databases (transcription factors, cell wall related proteins), Gene Ontology term assignment, and Hidden Markov Model searches against Pfam protein families and domains. In total, 70% of the spruce transcripts displayed matches to proteins of known or unknown function in the UniRef100 database (blastx e-value
Pavy, N., Paule, C., Parsons, L., Crow, J. A., Morency, M. J., Cooke, J., … MacKay, J. (2005). Generation, annotation, analysis and database integration of 16,500 white spruce EST clusters. BMC Genomics, 6. https://doi.org/10.1186/1471-2164-6-144