Abstract
The use of co-occurrence data is common in various domains. Co-occurrence data often need to be normalized to correct for the size effect. To this end, van Eck and Waltman (2009) recommend a probabilistic measure known as the association strength. However, this formula is based on combinations with repetition and therefore implicitly assumes that observations from the same entity can co-occur, even though such self-co-occurrences do not exist in the intended usage of the measure. This paper introduces a more accurate measure based on combinations without repetition and compares it to the original formula using mathematical derivations, simulations, and patent data. The comparison shows that the original formula overestimates the relation between a pair and that some pairs are more overestimated than others. The new measure is available in the EconGeo package for R maintained by Balland (2016).
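The abstract does not spell out either formula, so the following Python sketch is only one possible reading of the contrast it describes, not the paper's or EconGeo's implementation. It assumes a symmetric co-occurrence matrix with a zero diagonal, entity totals `s` taken as row sums, and a normalization that divides observed by expected co-occurrences. The function names and the example matrix are hypothetical.

```python
import numpy as np


def expected_with_repetition(s):
    """Expected co-occurrences if both members of a pair are drawn
    independently, so an entity could be paired with itself (assumption)."""
    s = np.asarray(s, dtype=float)
    total = s.sum()
    m = total / 2.0                      # each co-occurrence is counted in two totals
    p = s / total
    expected = m * 2.0 * np.outer(p, p)  # unordered pair (i, j), i != j
    np.fill_diagonal(expected, np.nan)   # self-pairs are not reported
    return expected


def expected_without_repetition(s):
    """Expected co-occurrences if the second member is drawn only from
    observations of other entities, ruling out self-pairs (assumption)."""
    s = np.asarray(s, dtype=float)
    total = s.sum()
    m = total / 2.0
    p = s / total
    # cond[i, j] = probability of drawing j second, given i was drawn first
    cond = s[np.newaxis, :] / (total - s[:, np.newaxis])
    ordered = p[:, np.newaxis] * cond    # i first, then j
    expected = m * (ordered + ordered.T) # either order
    np.fill_diagonal(expected, np.nan)
    return expected


def normalize(cooc, expected):
    """Observed over expected co-occurrences; the diagonal stays NaN."""
    return np.asarray(cooc, dtype=float) / expected


# Hypothetical toy data: three entities, no self-co-occurrences.
cooc = np.array([[0, 4, 2],
                 [4, 0, 1],
                 [2, 1, 0]])
s = cooc.sum(axis=1)
with_rep = normalize(cooc, expected_with_repetition(s))
without_rep = normalize(cooc, expected_without_repetition(s))
print(with_rep / without_rep)  # per-pair overestimation factor in this sketch
```

In this sketch the expected counts without repetition sum to the total number of co-occurrences over all distinct pairs, whereas the with-repetition version falls short because part of the probability mass goes to self-pairs that cannot occur. The resulting overestimation factor differs by pair and grows with the totals of the two entities involved, which is consistent with the abstract's claim that some pairs are more overestimated than others.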
Cite
Steijn, M. P. A. (2021). Improvement on the association strength: Implementing a probabilistic measure based on combinations without repetition. Quantitative Science Studies, 2(2), 778–794. https://doi.org/10.1162/qss_a_00122