Crowdsourcing Real-Time Research Trend Data
Significance (2010)
Available from
Jan Reichelt's profile on Mendeley.
Abstract
This paper proposes a demo of Mendeley at WWW10. Mendeley is a research workflow and collaboration tool that crowdsources real-time research trend information and semantic annotations of research papers in a central data store. We describe how Mendeley's data can overcome some of the weaknesses of traditional citation-based impact factor metrics and tag-based semantic databases. In the 12 months since its public launch, Mendeley has captured information on 13 million research papers, with its database doubling every 12 weeks.
Copyright is held by the author/owner(s). FWCS2010, April 26, 2010, Raleigh, USA.

Crowdsourcing Real-Time Research Trend Data

Victor Henning
Mendeley, 144a Clerkenwell Road, London EC1R 5DF
+44 207 713 8486
victor.henning@mendeley.com

Jason J. Hoyt
Mendeley, 144a Clerkenwell Road, London EC1R 5DF
+44 207 713 8486
jason.hoyt@mendeley.com

Jan Reichelt
Mendeley, 144a Clerkenwell Road, London EC1R 5DF
+44 207 713 8486
jan.reichelt@mendeley.com

ABSTRACT
This paper proposes a demo of Mendeley at FWCS2010. Mendeley is a research workflow and collaboration tool, which crowdsources real-time research trend information and semantic annotations of research papers in a central data store. We describe how Mendeley's data can overcome some of the weaknesses of traditional citation-based impact factor metrics and tag-based semantic databases. In the 12 months since its public launch, Mendeley has captured information on 13 million research papers, with its database doubling every 12 weeks.

Categories and Subject Descriptors
H.3.7 [Information Storage and Retrieval]: Digital Libraries - collection, dissemination, standards, user issues.

General Terms
Management, Measurement, Design, Standardization.

Keywords
Usage-Based Impact Measurement, Crowdsourcing, Article-level Metrics, Journal Impact Factor, Real-Time, Research Trends, Scientific Databases

1. THE SHORTCOMINGS OF CITATION-BASED IMPACT METRICS
Citation-based reputation metrics such as the Journal Impact Factor (JIF), the h-index or the g-index play an ever-increasing role in modern science [1, 2, 3]. As seemingly objective measures of academic impact and performance, they are used to determine career progression, post-doc positions, tenure, and grant funding. Pressure on scholars to perform well according to these metrics has mounted. So has the criticism leveled against such metrics.
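
For readers unfamiliar with the h-index and g-index named above, the following is a minimal sketch of their standard definitions [2, 3]. The citation counts are hypothetical and chosen only for illustration, and the g-index here is assumed to be capped at the number of papers (some variants pad the list with fictitious zero-citation papers).

```python
from itertools import accumulate


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g**2 citations.

    Assumption: g is capped at the number of papers (no padding with
    fictitious zero-citation papers).
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    for rank, running_total in enumerate(accumulate(ranked), start=1):
        if running_total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts, one entry per paper.
established = [30, 22, 15, 9, 9, 6, 4, 2, 1, 0]
newcomer = [250, 180, 90]

print(h_index(established), g_index(established))  # 6 9
print(h_index(newcomer), g_index(newcomer))        # 3 3
```

In this sketch the hypothetical newcomer has far more citations per paper than the established author, yet both indices stay at 3 because neither can exceed the number of papers published.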
It has been argued that these metrics can lead to academics engaging in citation bartering, gratuitous authorship, and a general increase in aggressive, exploitative, and self-promotional behavior [4]. Citation-based metrics are also thought to tempt journal editors into gaming the system using techniques that inflate their JIF. These techniques include accepting only papers expected to receive a higher number of citations, encouraging self-citations, and publishing review articles in place of research articles.

From a methodological perspective, critics point out that citation counts are context-free, i.e. a citation is counted as positive even if a paper was cited in a negative context. Moreover, the Gini coefficient of the citation distribution is extremely high. A small fraction of all papers garner the majority of all citations, while the majority of all papers are never cited at all [5]. This also implies that a single highly cited article could inflate a JIF. Another major problem for JIFs is the arbitrary two-year window within which citations are measured, which favors fast-evolving disciplines. The h-index, similarly, is arbitrarily bound by the number of papers a researcher has published, so a young researcher with a few high-impact publications will still have a low h-index. Finally, there is evidence that only 20% of all papers cited have actually been read by the authors citing them [6].

2. USAGE-BASED IMPACT METRICS ON MENDELEY
Our demo at FWCS2010 will exhibit alternative, usage-based impact metrics which could potentially alleviate many of the problems associated with traditional citation-based metrics. Our usage-based metrics rely on a distributed measurement of article readership on Mendeley [7], a desktop- and web-based research management and collaboration tool.
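
As an illustration of the readership measurement described above, the sketch below counts how many distinct user libraries contain each paper. The data model (a mapping from user id to a list of paper ids) and all identifiers are hypothetical; this is not Mendeley's actual schema or API.

```python
from collections import Counter


def readership_counts(libraries):
    """Count, for each paper, how many distinct user libraries contain it.

    `libraries` is a hypothetical mapping of user id -> iterable of paper ids.
    """
    counts = Counter()
    for papers in libraries.values():
        # A paper counts once per library, even if it was imported twice.
        counts.update(set(papers))
    return counts


# Hypothetical libraries for three users.
libraries = {
    "user_a": ["paper_1", "paper_2", "paper_2"],
    "user_b": ["paper_2", "paper_3"],
    "user_c": ["paper_2"],
}

counts = readership_counts(libraries)
print(counts["paper_2"])  # number of libraries that hold paper_2
```

Deduplicating within each library is a design choice in this sketch: it keeps the count a measure of how many readers hold a paper rather than how many copies exist.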
Mendeley Desktop, a free and cross-platform desktop application, automatically extracts metadata, full-text and cited references from research papers to minimize manual data input when setting up a local research paper database. It then enables researchers to manage, tag, full-text search, read and annotate PDF documents, share research papers with colleagues, and create bibliographies in word processors and text editors. Users can sync their libraries and annotations to the companion website, Mendeley Web, and manage them online. In this way, Mendeley Web has accumulated data on more than 13 million research papers and 150 million citations, by more than 100,000 users, in the first 12 months after its public launch. With Mendeley's research paper database doubling in size roughly every 12 weeks, it is on track to surpass Thomson Reuters' Web of Science catalogue of 40 million full-text documents and 700 million citations at some point this year.

Our starting point for usage-based metrics is to track the pervasiveness of research papers in Mendeley user libraries, i.e. whether they are present on the computers of a wide-ranging, distributed sample of academics. Preliminary investigation suggests that, for example, the correlation between Thomson Reuters' ISI citation count of the "Top 5 Biology Papers of 2009" and their corresponding readership number on Mendeley is r=.76 [8]. More encompassing correlation statistics will be presented during our demo at FWCS2010. Readership metrics can be seen as a measure of the popularity or awareness that a paper (and by association, its author, publication journal, and topic) is enjoying.

A second, more fine-grained usage metric which Mendeley will begin tracking by FWCS2010 is the actual time the users spend reading each research paper in Mendeley's integrated PDF viewer, and the number of repeat readings per paper. This is a measure of the intensity with which the paper (its author, publication journal, topic, respectively) is being examined.

A major advantage of such usage-based metrics is that they are available immediately on a "per article" basis. Usage-based metrics let authors track how readership of their individual papers is evolving in real-time, and the article's impact can evolve independently of the journal and its impact factor.

To better understand the readership, Mendeley also collects anonymous demographic information alongside the usage metrics. This information is presented to users in different segments such as geographic region, academic discipline, or junior versus senior faculty.

Mendeley's usage-based metrics let researchers retrieve the "hottest" papers for each topic (as marked by user-generated tags assigned to papers), the "hottest" tags in each academic discipline (to spot emerging research trends), or up-and-coming authors with the highest percentage growth in readership in the past month. By looking at longitudinal trend data, scholars might be able to assess whether a paper, topic or theory is steadily gaining followers, is subject to a sudden "hype," or is already on the decline again.

3. REFERENCES
[1] Garfield, E. The history and meaning of the journal impact factor. JAMA 295, 1 (2006), 90-93.
[2] Hirsch, J. An index to quantify an individual's scientific research output. PNAS 102, 46 (2005), 16569-72.
[3] Egghe, L. Theory and practise of the g-index. Scientometrics 69, 1 (2006), 131-51.
[4] Lawrence, P. Lost in publication: how measurement harms science. ESEP 8 (2008), 9-11.
[5] Weale, A., Bailey, M., and Lear, P. The level of non-citation of articles within a journal as a measure of quality: a comparison to the impact factor. BMC Medical Research Methodology 4, 1 (2004), 14.
[6] Simkin, M. and Roychowdhury, V. Do you sincerely want to be cited? Or: read before you cite. Significance 3, 4 (2006), 179-181.
[7] http://www.mendeley.com.
[8] Henning, V. The Top 10 Journal Articles Published in 2009 by Readership on Mendeley. Mendeley Blog (2010), http://www.mendeley.com/blog/academic-features/the-top-10-journal-articles-published-in-2009-by-readership-on-mendeley/.
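
As a rough check on the growth claim in Section 2, the sketch below extrapolates from the figures given in the paper (13 million papers, doubling roughly every 12 weeks, a 40 million document target). It assumes the doubling time stays constant, which the paper reports only as an approximate observed rate, and it compares paper counts only, not citations.

```python
import math


def weeks_to_reach(current, target, doubling_weeks):
    """Weeks until `current` grows to `target`, assuming a constant doubling time."""
    return doubling_weeks * math.log2(target / current)


# Figures taken from the paper; the constant-rate extrapolation is an assumption.
weeks = weeks_to_reach(13e6, 40e6, 12)
annual_factor = 2 ** (52 / 12)

print(f"{weeks:.1f} weeks to pass 40 million papers")  # 19.5 weeks
print(f"{annual_factor:.1f}x growth per year")          # 20.2x
```

Under that assumption the database would pass 40 million papers in about 19.5 weeks, which is consistent with the paper's "at some point this year".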
Readership Statistics
26 Readers on Mendeley
by Discipline
19% Social Sciences
by Academic Status
23% Student (Master)
19% Librarian
15% Other Professional
by Country
38% United Kingdom
12% United States
12% Canada