This article addresses the systematic and complete enumeration of all the substructures of any size present in a given molecule. The study is not restricted to features which could be defined a priori, such as rings or chains. Contrary to prior expectation, the exhaustive enumeration is tractable with current computational tools. Results are presented for several families of skeletons which are widespread in chemistry. It is shown that the numbers of constituent substructures of each size are related to the molecular topology, in particular the degree of branching. The number of distinct substructures depends additionally on the number of different atom and bond types present. The overall shapes of the distributions of substructure counts as a function of substructure size are found to be similar within particular classes of molecules. These distributions are compared and found to be characteristic of certain topologies. For several simple classes of molecule, analytic expressions are provided for the numbers of substructures as a function of fragment and molecule size. These results hold promise for identifying potentially useful scaffolds for use in combinatorial chemistry.
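
As a rough illustration of what counting substructures by size involves, the sketch below enumerates connected fragments of a small skeleton. It is not the article's algorithm. It assumes a hydrogen-suppressed skeleton given as an adjacency dictionary, treats all atoms and bonds as equivalent, and takes a "substructure" to be any connected subset of atoms. The names `count_connected_substructures`, `chain` and `star` are hypothetical helpers introduced only for this example.

```python
from collections import Counter


def count_connected_substructures(adjacency):
    """Count connected atom subsets of each size.

    adjacency: dict mapping each atom index to the set of its neighbours
    (an assumed, simplified representation of a molecular skeleton).
    Returns a dict {fragment size: number of connected fragments}.
    """
    counts = Counter()
    # Start from all single-atom fragments and grow one atom at a time.
    frontier = {frozenset([atom]) for atom in adjacency}
    while frontier:
        size = len(next(iter(frontier)))
        counts[size] = len(frontier)
        grown = set()
        for fragment in frontier:
            for atom in fragment:
                for neighbour in adjacency[atom]:
                    if neighbour not in fragment:
                        # Storing frozensets in a set removes duplicates
                        # reached by different growth orders.
                        grown.add(fragment | {neighbour})
        frontier = grown
    return dict(counts)


def chain(n):
    """Unbranched chain of n atoms (hypothetical helper)."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def star(m):
    """Central atom 0 bonded to m terminal atoms (hypothetical helper)."""
    graph = {0: set(range(1, m + 1))}
    for leaf in range(1, m + 1):
        graph[leaf] = {0}
    return graph


if __name__ == "__main__":
    print(count_connected_substructures(chain(5)))  # {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}
    print(count_connected_substructures(star(3)))   # {1: 4, 2: 3, 3: 3, 4: 1}
```

For the unbranched chain of n atoms the counts follow n - k + 1 for fragments of k atoms, while the branched four-atom skeleton has more three-atom fragments than a four-atom chain would (3 versus 2), a small instance of the dependence on branching. Because this brute-force growth keeps every fragment in memory, it is only suitable for small skeletons.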