Informed peer review and uninformed bibliometrics?

Jörg Neufeld and Markus von Ins

Research Evaluation, 20(1), March 2011, pages 31–46
DOI: 10.3152/095820211X12941371876382; http://www.ingentaconnect.com/content/beech/rev
0958-2029/11/01031-16 US$12.00 © Beech Tree Publishing 2011

Recent literature on the relation between bibliometric indicators and peer review discusses whether bibliometric indicators can predict the success of research grant applications. For example, Van den Besselaar and Leydesdorff (2009) reported a higher average number of publications/citations for the group of approved applicants than for the rejected applicants (section Social and Behavioral Sciences of the Netherlands Organization for Scientific Research [NWO], MaGW). However, this difference disappears or even reverses when the group of 275 successful applicants is compared only to the best 275 rejected applicants. Given these findings, we have continued our analyses of publication data of applicants for the Emmy Noether Programme (ENP) provided by the German Research Foundation. First, we compared the group of actual ENP applicants to a sample of potential applicants, which revealed a ‘lack of low performers’ among the actual ENP applicants. Furthermore, we conducted discriminant analyses to predict funding decisions on the basis of several bibliometric indicators.

Research results concerning the correlation between funding decisions in application-based research funding and the applicants’ bibliometric performance are heterogeneous. For example, Van den Besselaar and Leydesdorff (2009) reported a higher average number of publications/citations for the group of approved applicants (n = 275) than for the rejected applicants (n = 903). However, this difference disappears or even reverses when the group of 275 successful applicants is compared only to the best 275 rejected applicants.
By applying this approach to applicants to the European Molecular Biology Organization (EMBO) and to selected fields (psychology and economics) of the Netherlands Organization for Scientific Research’s section for social and behavioral sciences (MaGW), Bornmann et al (2010) obtained nearly the same results for the mean number of total citation counts. Regarding the mean h-index values and the mean number of publications, funded EMBO applicants show higher values than the best of the rejected group, although the differences are not significant. In our own studies (Hornbostel et al, 2009), we found virtually equal bibliometric performance (e.g. publications per year, citations per paper) for approved and rejected applicants of the Emmy Noether Programme (ENP), as provided by the German Research Foundation (DFG). Melin and Danell (2006) presented similar results for the applicants for the Individual Grant for the Advancement of Research Leaders provided by the Swedish Foundation for Strategic Research. By comparing h-indices of rejected and approved Böhringer Ingelheim Fonds applicants, Bornmann and Daniel (2007) identified higher average index values in the group of approved applicants, although the distributions of both groups overlapped.

Jörg Neufeld (corresponding author) and Markus von Ins are at the Institute for Research Information and Quality Assurance (IFQ), Godesberger Allee 90, D-53175 Bonn, Germany; Email: neufeld@forschungsinfo.de; Tel: +49-228-97273-22.
We are indebted to Anna Schelling for proofreading, Nathalie Huber and Susan Böhmer for their useful feedback, the referees and editors for their valuable remarks, and to the German Research Foundation (DFG) for enabling this study by providing data and resources.

How can these partly different findings be accounted for? One issue might be the composition of the applicant groups for the different funding schemes. The prevailing assumption is that eligibility criteria addressing past publication performance lead to the formation of a group of actual applicants with an above-average publication performance compared to
the group of potential applicants (‘self-selection’; see Böhmer and Von Ins, 2009). The more applicants exhibit sufficient past publication performance, the less this can serve as a decision criterion. Accordingly, fewer (bibliometric) differences would be seen between the groups of funded and non-funded applicants.

The above-mentioned studies were conducted as evaluations of funding organizations and, in particular, of their peer review systems. Thus, another possible explanation is that some of the examined review systems simply do a better job than others in identifying the best applicants.

A third explanation might be that, depending on the funding scheme, conventional bibliometric indicators are sometimes less apt to capture the respective scheme’s funding criteria.

Considering the above questions, we first ask which eligibility criteria apply to the ENP and which criteria are actually applied by the reviewers. Subsequently, we try to operationalize these criteria by deliberately selecting/developing indicators, and then check how these indicators correspond to the funding decisions. In a third step, a discriminant analysis combining these indicators is performed.

Methods and data

Background

The ENP was set up by the DFG in order to prepare young scientists of excellence for a professorship by giving them the opportunity to lead a research group at an early stage of their career (generally up to four years after obtaining a PhD). Each proposal was evaluated by at least two assessors (elected for four years), who could consult a third expert if specialized knowledge was needed.[1] Assessors gave recommendations to the DFG committee, which made the final decision, generally in accordance with the assessors’ recommendations.

Data

The following analyses are based on three types/sources of data:

1. In preparation for the bibliometric analyses for the evaluation of the ENP, the publication lists of 495 applicants (Table 1) from medicine, physics, biology and chemistry were compiled and checked for completeness and consistency by the applicants.[2] Success rates (within the sample) vary from 41% in medicine to 57% in biology. Only full articles were included, and the related citations (only from citing articles) were retrieved from Web of Science (WoS, Thomson Reuters ISI) in cooperation with the Institute for Science and Technology Studies, Bielefeld. For each publication a three-year citation window (publication year plus two subsequent years) was chosen. References were prepared in a similar way (reference window: year of publication and the two preceding years) for calculating ‘reference normalized’ impacts.[3]

2. A sample of professors (n = 709) was drawn from the register Kürschners Deutscher Gelehrten-Kalender (2009) to serve as a comparison group (potential applicants). The register contains data on nearly all active professors in Germany.[4] Based on the included CV data, we searched WoS for the publications of this group that appeared up to four years after obtaining their PhD, during the interval 1992–2004, which corresponds to the actual applicants’ career stages.

3. In the context of documentary analyses, 129 anonymous reviews of 50 applications/proposals were examined in order to obtain details about the reviewers’ rationale in decision-making. Matching this review information with the applicants’ bibliometric data was not possible owing to data privacy protection laws.
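The citation and reference windows described in item 1 can be illustrated with a minimal sketch. The function names and the representation of citing/cited items as plain lists of years are assumptions made for illustration only; they are not taken from the study, and the sketch does not reproduce the calculation of the ‘reference normalized’ impact itself.

```python
def citations_in_window(pub_year, citing_years, window=3):
    """Count citations received in the publication year and the
    (window - 1) subsequent years (hypothetical helper)."""
    last_year = pub_year + window - 1
    return sum(1 for year in citing_years if pub_year <= year <= last_year)


def references_in_window(pub_year, cited_years, window=3):
    """Count references to items published in the publication year and the
    (window - 1) preceding years (hypothetical helper)."""
    first_year = pub_year - (window - 1)
    return sum(1 for year in cited_years if first_year <= year <= pub_year)


# An article published in 2000: only citations from 2000, 2001 and 2002 count.
n_citations = citations_in_window(2000, [1999, 2000, 2001, 2002, 2003])

# Its reference list: only cited items from 1998, 1999 and 2000 count.
n_references = references_in_window(2000, [1997, 1998, 1999, 2000, 2001])
```

A fixed window of this kind keeps older and more recent publications comparable, since each publication has the same number of years in which to be cited.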
Table 1. Sample: ENP applicants

              Funding decision
Field         Rejected   Approved   Total
Physics             51         74     125
Medicine            99         68     167
Biology             49         65     114
Chemistry           42         47      89
Total              241        254     495

Criteria for ENP applications/applicants and their operationalization

The documentary analysis of 129 ENP application reviews indicates not only the extent to which reviewers orient their judgments toward explicitly named funding criteria, but also how far the (past) publication performance of applicants is actually taken into account.[5] The criteria we found in the reviews are displayed in Figure 1. The stated rationale on which the judgments were based typically proved to be related to proposal and applicant qualities. ‘Past performance’ in the form of publications was mentioned in only some of the inspected reviews and, when mentioned, it was viewed as only one aspect among many. This observation lowers the expectations regarding the reproducibility of reviewer judgments by means of bibliometric indicators.

Nonetheless, even if reviewers do not consider applicants’ publication lists, there is a chance that successful applicants will combine several positive attributes. Hence, on an aggregate level, the
publication performance might be linked to other relevant qualities such as experience, presentation skills and conceptual strength, which are typically accessible to the reviewers.

Earlier analyses regarding the ENP applicants (Hornbostel et al, 2009) did not show strong correlations between funding decisions and standard bibliometric indicators for past performance (citations per paper, number of publications, etc.). Therefore, in the current work we endeavored to find out whether deliberately selected/developed bibliometric indicators show a higher accordance with funding decisions than do standard indicators. In the following section we describe relevant concepts and criteria and their bibliometric operationalization. Table 2 gives an overview of our schema.

Research output and impact of individual researchers

The DFG’s eligibility criteria ask for ‘outstanding publications in high-ranking international specialist journals or comparable’. This suggests the use of the journal impact factor (JIF) as a measure for high-ranking journals, which we apply in the form of the fractional mean journal impact factor[6] of articles published by applicants in the period before their application.

Research or publication performance can be considered in quantitative as well as qualitative terms. As a quantitative measure, we chose fractional publications, which we assume to be a better proxy for applicants’ contribution to their publication lists than the full count of publications. It is known that an author’s name is not randomly positioned in the list of authors of publications in the fields of medicine and biology. In fact, the position reflects a specific role assigned to the individual author, either according to the author’s hierarchical position or as a direct contributor to the publication.
Typically, the first position indicates the person who has ‘done the work’, whereas the last position is reserved for the responsible institute director or group leader. In Germany, this scheme is well established and incorporated in several formula-based funding systems. The DFG itself promotes a scheme in evaluation-based funding for medical research which ascribes one third of a publication (that is, one third of the journal’s JIF) to each of the first and last authors (DFG, 2004). The remaining authors, located in the middle of the list, share the residual third. We use this scheme in the fields of medicine and biology for the fractional counting of publications.

When it comes to rating the quality of the publication output, citation analyses are standard. Even if there is no

[Figure 1. Documentary analyses of 129 application reviews: criteria named by reviewers of the Emmy Noether Programme. Figure title: Criteria applied by reviewers according to the analysis of 50 applications reviews (DFG Emmy Noether-Programme). Labels shown in the figure: Quality of application; ‘Excellence’ of applicant; Formal correctness and ‘eligibility’; Style; Content; Innovation; Risk; Relevance; Methods; Reputation of host institution; Independence, ‘self-contained research profile’; Oral presentation of the proposed project; Skills/experience (applications for third-party funding, project management, teaching); Past performance; Publication list; Publication experience; Research awards; Reputation of the research group respectively coauthors.]

Table 2. Criteria for ENP applications/applicants and their operationalization

Concept/Criterion                         Indicator
Individual research output                Fractional publication number (article)
Impact, relevance of research output      Reference normalized citation rate
‘Quality’ of publications and journals    Fractional mean JIF
Individual quality threshold/standard     Share of cited publications
Independence                              Share of publications with applicant as first author
                                          Share of publications with fewer than 4 coauthors
Research experience                       Time span between first publication and application
Reputation of coauthors                   Highest h-index value of coauthors
‘Young’ scientist?                        Applicants’ age
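The positional counting scheme (one third each to the first and last author, the residual third shared by the authors in between) and some of the share-based indicators listed in Table 2 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors’ implementation: the `Publication` record, the function names, the equal split for publications with one or two authors (a case the text does not specify), and the reading of ‘coauthors’ as all authors other than the applicant are hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    n_authors: int           # total number of authors, applicant included
    applicant_position: int  # 1-based position of the applicant in the byline
    citations: int           # citations counted within the chosen window


def fractional_credit(position, n_authors):
    """Share of one publication credited to the author at `position`."""
    if n_authors < 1 or not 1 <= position <= n_authors:
        raise ValueError("position must lie within the author list")
    if n_authors <= 2:
        # Assumption: with one or two authors the credit is split equally.
        return 1.0 / n_authors
    if position in (1, n_authors):
        return 1.0 / 3.0                 # first and last author: one third each
    return (1.0 / 3.0) / (n_authors - 2)  # middle authors share the last third


def indicators(publications):
    """Compute a few of the Table 2 indicators for one applicant."""
    n = len(publications)
    if n == 0:
        raise ValueError("at least one publication is required")
    return {
        "fractional_publications": sum(
            fractional_credit(p.applicant_position, p.n_authors)
            for p in publications
        ),
        "share_cited": sum(p.citations > 0 for p in publications) / n,
        "share_first_author": sum(
            p.applicant_position == 1 for p in publications
        ) / n,
        # Assumption: coauthors = all authors except the applicant.
        "share_fewer_than_4_coauthors": sum(
            p.n_authors - 1 < 4 for p in publications
        ) / n,
    }


# Hypothetical publication list of a single applicant.
example = [
    Publication(n_authors=5, applicant_position=1, citations=3),
    Publication(n_authors=3, applicant_position=2, citations=0),
    Publication(n_authors=6, applicant_position=6, citations=10),
    Publication(n_authors=2, applicant_position=1, citations=1),
]
result = indicators(example)
```

Under this scheme the credits for any publication with three or more authors sum to one, so the fractional publication number of an applicant can never exceed the full publication count.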