
Journal of Informetrics

Journal homepage: www.elsevier.com/locate/joi

Regular article

Research funding: past performance is a stronger predictor of future scientific output than reviewer scores

Balázs Győrffy^a,b,∗, Péter Herman^a,b, István Szabó^c

^a Semmelweis University Department of Bioinformatics and 2nd Dept. of Pediatrics, Tűzoltó utca 7-9., 1094, Budapest, Hungary

^b TTK Lendület Cancer Biomarker Research Group, Institute of Enzymology, Magyar Tudósok körútja 2, 1117, Budapest, Hungary

^c Szent István University, Páter Károly utca 1., 2100 Gödöllő, Hungary

Article info

Article history: Received 3 October 2019; Received in revised form 22 December 2019; Accepted 4 May 2020; Available online 5 June 2020

Keywords: Funding; Reviewer assessments; Basic research; Publications; Scientific output; Q1; H-index; Internationalization

Abstract

Scientific grants are awarded almost exclusively on the basis of an independent peer review of a proposal submitted by the principal investigator (PI). The writing and reviewing of these applications consumes a significant amount of researchers' time. Here, we perform a large-scale performance evaluation of review-based grant allocation via analysis of the grant proposals submitted to the Hungarian Scientific Research Fund.

In total, 42,905 scored review reports prepared for 13,303 proposals submitted between 2006 and 2015 were analyzed. The publication and citation characteristics of the PIs were obtained from the Hungarian Scientific Work Archive (www.mtmt.hu). Each publication was assigned to its respective SCImago Journal Rank category, and only publications in the first quartile (Q1) were considered. Citation, H-index and publication data were derived for each analyzed year for each researcher.

Of all proposals, 3455 were funded (26%). PIs with a funded proposal had significantly more Q1 articles and first/last authored Q1 articles (1.91 vs. 1.30, p < 1e-16 and 0.82 vs. 0.53, p < 1e-16, respectively). Of the successful applications, those involving international collaborations and extended budget had higher publication output. Applicant age, grant duration, and submission year were not correlated with publication performance. Reviewer scores displayed a minor association (corr. coeff. = 0.08-0.11) with the number of Q1 publications. International reviewers were significantly less efficient than national reviewers (p = 0.021).

A strong correlation with output was observed for the scientometric characteristics of the applying PI at the time of submission, including H-index (corr. coeff. = 0.45-0.54), independent citation (corr. coeff. = 0.46-0.62), and yearly average Q1 articles (corr. coeff. = 0.63-0.79, p < 1e-16). Similar correlations were observed for nonfunded applicants.

We performed a comprehensive evaluation of review-based resource allocation efficiency in basic research funding. Evidence suggests that the past scientometric performance of the principal investigator is the best predictor of future output.

∗ Corresponding author at: Semmelweis University Department of Bioinformatics and 2nd Dept. of Pediatrics, Tűzoltó utca 7-9., 1094, Budapest, Hungary. E-mail address: gyorffy.balazs@med.semmelweis-univ.hu (B. Győrffy).

https://doi.org/10.1016/j.joi.2020.101050


B. Győrffy, P. Herman and I. Szabó / Journal of Informetrics 14 (2020) 101050

1. Introduction

While research grant financing is a key foundation of scientific productivity, its overall effectiveness is a subject of debate. By investigating 20 years of NIH grants, Jacob and Lefgren have uncovered approximately 1.2 publications (and only 0.2 first-author publications) linked to an average NIH grant of 1.7 million USD (Jacob & Lefgren, 2011). A different US-based study related an increase of $1 million in federal research funding to a university to 10 more articles and 0.2 more patents (Payne & Siow, 2003). Other researchers have questioned the value of financial incentives; for example, in the universities of eight European countries, no straightforward connection between funding and research performance was present (Auranen & Nieminen, 2010). Generally, national research systems featuring a performance-based evaluation have higher output than nations without such a system (Sandström & Van den Besselaar, 2018). After establishing an evaluation system and introducing performance-based funding, Australia was able to boost its research output while simultaneously improving its research quality (van den Besselaar, Heyman, & Sandström, 2017). Recently, the Chinese government has even initiated a new performance-based financial program called the "double first-class" plan to catapult individual university departments into world class (Wang, 2019).

Importantly, in addition to available funding, several additional factors have been associated with publication output. Normalized for population size, English-speaking nations have the highest rate of scientific papers (Man, Weinkauf, Tsang, & Sin, 2004). Affiliations with elite institutions are also positively associated with publication yield (Arora & Gambardella, 1997). In addition to the first two years of a research career, males have a continuously higher number of publications per year, and essentially all hyperproductive scientists (those with 50 or more papers) are male (Symonds, Gemmell, Braisher, Gorringe, & Elgar, 2006). Superstars in various fields not only drive their own productivity but also boost their collaboration partners. The extinction of superstars leads, on average, to a lasting 5 to 8% decline in the quality-adjusted publication rates of their coauthors (Azoulay, Graff Zivin, & Wang, 2008).

Higher research productivity subsequently leads to even more highly cited papers. It has been demonstrated in a large international cohort that increasing the number of publications also increases the share of highly cited publications, especially for older cohorts of researchers (Lariviere & Costas, 2016). A similar study focusing on Swedish scientists observed constant or increasing marginal returns with higher numbers of publications in most research fields, including chemistry, life sciences and sociology (Sandstrom & van den Besselaar, 2016).

When focusing on government funding, the allocation of research budgets is done almost exclusively on the basis of grant applications submitted by the research entities. The evaluation of these proposals is one of the key challenges that any funding agency has to face. From the management side and from the evaluator side, the process consumes many resources, both human and financial. Proposals usually include a great deal of information that can hardly be "automatized", and thus, they have to be examined on an individual basis and must be evaluated through the intensive workforce usage of external experts. This results in evaluation processes that are quite lengthy and involve many actors. In the end, funding decisions tend to be subjective, as they are based on imperfect information due to the lack of comparable and objective data on applicants and proposals.

The National Research, Development, and Innovation Office (NRDIO) is the principal government-financed funding agency in Hungary. Scientists submit approximately 1500 applications each year for basic research grants (also designated as OTKA proposals). For each call, applications can be submitted once per year, and each proposal is subject to a nonblinded peer review as well as a ranking set by a scientific discipline-specific committee. In the evaluation process, the latest publication data are taken into account as indicators of recent scientific performance. The number of grants funded depends on the overall budget available for the call in the particular fiscal year. Applicants who are unsuccessful can resubmit the application the next year, but their ranking is not retained; a new ranking is established in each evaluation round.

In this study, our goal was to perform a large-scale performance evaluation of review-based grant allocation. We scrutinized the grant awarding practices, including review scoring at the NRDIO. We also examined the overall efficiency of the basic research grant program. For this, all applications and all reviewer scores between 2006 and 2015 were analyzed; a cutoff of 2015 was used to have at least three years of follow-up for each analyzed observation. To make the analysis of reviewer efficiency possible, the unit of observation was not a researcher but rather an evaluated proposal.

2. Methods

2.1. Data sources

The data for each proposal were extracted from the electronic proposal administration for basic research grants (EPR) of the National Research, Development, and Innovation Office, Hungary. Proposals were restricted to those submitted between 2006 and 2015. Proposals submitted from 2016 onward were not considered, as there is still insufficient follow-up for these. For each proposal, the type of proposal, the submission year, the application number, the birth year of the PI, the proposal length (years), the unique MTMT identifier of the PI, and the outcome of the evaluation were collected.

At the same time, the reviewer evaluation scores were also gathered for each proposal using the same database. These include a score for the researcher, a score for the research plan, and an overall score for the application. Each of these scores can be a fractional number and ranges between 0 and 10. Textual justifications and evaluations were not collected. For each proposal, the number of reviewers was also noted. The young investigator excellence program did not have a score for the researcher (only a score for the research plan and overall score).

In addition, reviewers were designated as either national or international based on their tax identification number. Those with a Hungarian tax ID number were labeled as national reviewers. Of note, only the derived nationality was used in the analysis, and the actual tax number of the reviewers remained blinded during the investigation.

2.2. Publication data

Publication and citation data for each researcher were downloaded from the Hungarian Scientific Work Archive (MTMT, https://www.mtmt.hu/). Data including publication list, citation list, and H-index were retrieved for each year between 2006 and 2018 for each researcher on May 22, 2019. When evaluating citations and publications, only peer-reviewed publications were included, and other categories, such as conference abstracts and patents, were omitted. In citations, we accepted independent citations only, i.e., when the cited and the citing articles do not have any overlap in the author list. When collecting publication data, entire calendar years were considered and not the date of the actual submission of the proposal or contract date of the grant. Finally, to enable the control for the completeness of the publication data, the date of the last declaration of the researcher regarding the completeness of publication and citation data was also noted.
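The independence criterion for citations can be illustrated with a minimal Python sketch. The function names and the simple name normalization are assumptions made for illustration only; the study relied on the citation classification available in MTMT, and real author disambiguation is considerably harder than string matching.

```python
def is_independent_citation(cited_authors, citing_authors):
    """Return True when the cited and citing papers share no author.

    Names are compared after lowercasing and collapsing whitespace,
    which is an illustrative simplification, not a real disambiguation.
    """
    def normalize(name):
        return " ".join(name.lower().split())

    cited = {normalize(a) for a in cited_authors}
    citing = {normalize(a) for a in citing_authors}
    return cited.isdisjoint(citing)


def count_independent_citations(cited_authors, citing_author_lists):
    """Count the citing papers that have no author overlap with the cited paper."""
    return sum(
        is_independent_citation(cited_authors, citing)
        for citing in citing_author_lists
    )
```

For example, a paper by two authors that is cited once by an unrelated group and once by a group containing one of the original authors would receive a single independent citation under this rule.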

2.3. Article ranking

We have not collected the impact factor values, as these can be markedly dissimilar when comparing different scientific disciplines. Instead, we assigned each journal to its respective quartile within its scientific field based on the rank of the journal in the SCImago database (http://www.scimagojr.com). Only first-quartile (Q1) publications were accepted as scientific excellence, and non-Q1 articles were not considered. For each proposal, the average and total number of Q1 publications during the proposed grant running time were computed. The usage of Q-ranks was the most reliable and easily accessible data for the publications. We must also note that the method presented here could be used with other publication metrics as well (for instance, the H-index).

Publications were further gauged in case the applicant was the first or last author. In this analysis, shared first/last authorships or position as a non-first/last corresponding author were not considered because it was not possible to manually check each publication of each researcher for these categories.
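The four output measures described in this section (total and yearly average Q1 publications, each also restricted to first/last authorship) can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Publication` record, the `q1_output` function, and the convention that the grant window covers `length_years` whole calendar years beginning at `start_year` are all hypothetical and are not taken from the original analysis scripts.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    year: int             # calendar year of publication
    quartile: str         # journal quartile label, e.g. "Q1"
    author_position: int  # 1-based position of the PI in the author list
    n_authors: int        # total number of authors


def q1_output(pubs, start_year, length_years):
    """Summarize Q1 output over whole calendar years of the grant window.

    Assumption: the window is start_year .. start_year + length_years - 1.
    """
    window = range(start_year, start_year + length_years)
    q1 = [p for p in pubs if p.quartile == "Q1" and p.year in window]
    # First or last author only; shared or corresponding roles are ignored.
    first_last = [p for p in q1 if p.author_position in (1, p.n_authors)]
    return {
        "q1_total": len(q1),
        "q1_yearly_avg": len(q1) / length_years,
        "q1_first_last_total": len(first_last),
        "q1_first_last_yearly_avg": len(first_last) / length_years,
    }
```

Working with whole calendar years mirrors the choice described in Section 2.2, where the submission and contract dates were deliberately not used.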

2.4. Statistical analyses

Database handling was executed in the R statistical environment using the packages "httr" and "rvest" for downloading and the packages "stringr" and "dplyr" for data manipulation.

t-test. Statistical significance was set at p < 0.05. Graphs are presented as the mean ± 99% confidence intervals. Statistical analysis and visualization were performed in WinStat for Excel (R. Fitch Software, Germany).
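As an illustration of how a "mean ± 99% confidence interval" value can be obtained, the following Python sketch uses a normal approximation. This is an assumption: the text does not state how the intervals were computed in WinStat, and the function name `mean_with_ci` is hypothetical. With the sample sizes reported here, the normal and t-based intervals would be nearly identical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(values, confidence=0.99):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    n = len(values)
    m = mean(values)
    # Two-sided critical value, about 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the mean from the sample standard deviation.
    half_width = z * stdev(values) / sqrt(n)
    return m, half_width
```

A result reported as 1.91 ± 0.13 would correspond to `m = 1.91` and `half_width = 0.13` in this notation.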

3. Results

3.1. Proposal characteristics

In total, 13,303 proposals submitted between 2006 and 2015 were analyzed. These proposals received 42,905 scored reviewer assessments. Most of the proposals were thematic research proposals (n = 8943); these are grants for those with a PhD degree without an age restriction. The next largest cohorts comprise the postdoctoral excellence program applications (n = 2480) and the young investigator excellence program (n = 472), which are both for early-stage researchers with a PhD. Generally, young investigator proposals and postdoctoral program grants also include the salary of the PI. The general budget of these proposals lies between 50,000 and 200,000 Euros.

More funding was available in the high-budget thematic research proposals (n = 393) and in the high-budget thematic research proposal for young investigators (n = 159). International collaboration proposals also had higher budgets, including the thematic research proposal with international collaboration (n = 380) and the Norwegian fund proposals (n = 65). Norwegian fund proposals specifically include collaborations with a Norwegian research institution. Finally, the remaining groups include publications support proposals (n = 279) and a category for all other applications (n = 132). The distribution of the submitted proposals is depicted in Fig. 1A.

The total number of submitted proposals was relatively stable, with a yearly average of 1330 ± 505 applications (Fig. 1B). Over three-quarters of all proposals had a length of three years; however, because we only considered entire calendar years, these are divided between three- and four-year-long grant submissions (Fig. 1C). Only 34 proposals were longer than five years. A small cohort of proposals finished within one year (n = 183).

Almost all proposals were evaluated by multiple experts, and only 2.7% of all reviews were performed by a single reviewer. A total of 45% of all proposals were evaluated by three reviewers (Fig. 1D). Moreover, 294 proposals were checked by more than seven reviewers; of these, seven grants were evaluated by 10 reviewers, three grants were assessed by 11 reviewers, and one grant was reviewed by 13 reviewers.

Fig. 1. Overview of the 13,303 proposals submitted between 2006 and 2015. Over 86% of proposals were either thematic research proposals or postdoctoral applications (A). The yearly mean of submitted applications was approximately 1,300 (B), and most proposals were intended for 3-4 years (C).

Since we use the data from the MTMT, which is not automatically updated as Google Scholar is, it is important to validate the up-to-date status of the database. Within MTMT, authors are requested to sign a declaration regarding the completeness of the database for both publication and citation data. This declaration was signed by over 90% of the authors since 2016, and only 0.67% performed the last update before 2012 (Fig. 1E). Of note, the applications were submitted by 6031 researchers, and an MTMT account was accessible for 4218 researchers. Of these, the declaration was signed by 4181 fellows. Those without signed declarations in MTMT were not included in the performance evaluation analyses.

3.2. Comparison of funded and rejected proposals

The success rate of the applications was 26%, whereas 73% of the proposals were rejected. The remaining 122 proposals were either retracted or ineligible, or the contract agreement was unsuccessful (Fig. 2A).

Those researchers who were funded had significantly more Q1 articles during grant time when compared to those rejected (p < 1e-16, 1.91 ± 0.13 vs. 1.31 ± 0.06, respectively, Fig. 2B). A similar difference was observed when only first/last authored papers were taken into consideration (p < 1e-16; 0.82 ± 0.05 vs. 0.53 ± 0.02 for funded and rejected, respectively, Fig. 2C).

When comparing the yearly citation before the grant and after the grant using the mean of two years, there was no significant difference between approved and disapproved applications (p = 0.79). The nominal increase was minimally higher in those approved (5.98 vs. 5.13, Fig. 2D). This is probably due to the delayed receipt of citations after publication.

We have also analyzed the dissimilarities related to the different proposal types. When comparing other proposal types to the thematic research proposal, those with international collaboration and those with higher budgets were able to produce more Q1 articles (p < 1e-16, 1.48 ± 0.07 vs. 2.25 ± 0.31 vs. 2.93 ± 0.7 for research proposals vs. international collaboration vs. higher budget, respectively). Productivity was slightly lower for young investigators and postdoctoral researchers (1.18 ± 0.21 and 1.11 ± 0.08, respectively). The yearly average number of Q1 publications stratified by proposal type is depicted in Fig. 2E.

3.3. Reviewer scores and publication output

Reviewers provided three scores for each application: an assessment for the applicant, a score for the research plan, and an overall score regarding the entire proposal. When comparing these scores (n = 10,761) among the funded proposals to the four major parameters, including the yearly average number of Q1 publications, the yearly average number of first/last authored Q1 publications, the sum of all Q1 publications during grant running time, and the sum of all first/last authored Q1 publications during grant running time, the correlation coefficients ranged between 0.08 and 0.11 (Fig. 3). The scores for the principal investigator had a slightly better correlation (0.1-0.11) than the scores for the application and for the entire proposal (0.08-0.09). Due to the abundant sample number, small correlations also achieved high significance.

As a control, four semi-random parameters were also compared to scientific output. These include the submission year, the registration number of the application, the birth year of the principal investigator, and the length of the proposal in years. With the exception of the sum of all publications and proposal length, all these parameters reached a correlation between -0.06 and 0.05. Longer grants had achieved more publications (corr. coeff. 0.14-0.15, Fig. 3).
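The remark that small correlations reach high significance with large samples can be made concrete with a short Python sketch. The sketch assumes a Pearson correlation coefficient and the standard t statistic for testing a zero correlation; the text does not state which correlation measure was used, so this is an illustration of the general effect and not a reproduction of the reported values. The function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0, given coefficient r and sample size n."""
    return r * sqrt((n - 2) / (1 - r * r))
```

Under these assumptions, a coefficient of 0.08 with n = 10,761 gives a t statistic above 8, which is highly significant even though such a coefficient explains well under 1% of the variance (0.08 squared is 0.0064).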

3.4. Scientometric parameters of the PIs at submission

When comparing the scientometric parameters of the principal investigator at the time of proposal submission, the yearly number of Q1 publications had the best correlation with the subsequent publication output parameters (corr. coeff. 0.62-0.79, Fig. 3). The H-index and the yearly independent citation also showed high associations (corr. coeff. between 0.45-0.55 and 0.46-0.62, respectively). Each of these parameters had extremely strong p values (Fig. 3). The correlation was similar when comparing coauthored and first/last authored publications regardless of whether the total number or the yearly average was considered. Overall, the uppermost correlation was observed between previous and future yearly number of Q1 publications (corr. coeff. = 0.79).

3.5. Analysis of rejected proposals

An equivalent analysis was performed for those proposals that were rejected by the agency. While the overall picture remained the same, the reviewer scores (n = 31,808) had somewhat better correlations, and the scientometric parameters had reduced correlations with scientific performance in this setting (corr. coeff. 0.11-0.17 and 0.37-0.71, respectively, Fig. 4).

Fig. 1 (continued). Almost all applications were evaluated by multiple reviewers (D). The publication list has been confirmed as updated and complete for the vast majority of applicants since 2016 (E).

Fig. 2. Comparison of approved and rejected proposals shows a markedly higher publication activity of those funded. Overall, 26% of all applications were funded (A). During the proposed run-time of the submitted application, those funded published more Q1 articles (B) and more first/last authored Q1 articles (C). At the same time, the citation increase was not higher at the end of the proposed grant time for those funded (D). Publication output is different for each proposal type, with higher performance for those involving international collaboration and larger budgets (E). B, C and E show the yearly average.

Fig. 3. Reviewer scores are minimally better than random parameters and significantly worse than PI scientometric performance when predicting future excellence. Publication output measured exclusively during grant running time. The strongest connection can be observed between the scientometric performance of the PI before grant submission and subsequent publication performance. Note: truly random parameters (such as the application number) show significant p values because of the high sample number; any correlation with a coefficient below 0.1 can be considered unimportant. PI: principal investigator; Q1: rank of the journal in the first quartile according to the SCImago Journal Rank database; first/last: only publications where the PI is either first or last author. The coefficients range between -1 and 1; correlation coefficients closer to either -1 or 1 indicate a stronger association.

Random parameters received a similar spread (corr. coeff. -0.05 to 0.10, Fig. 4). These results suggest that the reviewers were indeed able to filter out the poorest proposals.

3.6. Comparison of scientific disciplines

In the next analysis, all proposals were regrouped according to the scientific discipline. To retain high sample numbers, samples were assigned to three major cohorts: "material sciences" including physics, mathematics, engineering, informatics, and chemistry (n = 11,493); "life sciences" including biology, medicine, genetics, and systems biology (n = 12,300); and "humanities" including economics, linguistics, literature, psychology, and history (n = 9889). The correlation trends between reviewer evaluations/scientometric parameters of the PI at proposal submission and subsequent publication output were similar in the three cohorts (Fig. 5). However, reviewer scores were markedly worse in humanities (corr. coeff. 0.06-0.07 in humanities vs. 0.12-0.19 in life sciences/material sciences).

3.7. Fractional papers

The analyses described above were performed using full papers for each author for initial parameters as well as for output metrics. In an altered approach, we fractionalized each paper; in other words, we normalized the value of each paper for the number of the authors of this particular paper. Then, the same statistics were performed as described above for reviewer scores and scientometric parameters of the PI at submission. This analysis delivered almost identical results for both funded and nonfunded proposals. The results are displayed in Fig. 6.
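The difference between full and fractional counting can be shown in a few lines of Python. The sketch assumes the common convention that each paper contributes 1/n to an author, where n is the number of authors; this is the natural reading of "normalized for the number of the authors", although the exact weighting used in the study is not spelled out. The function names are hypothetical.

```python
def full_count(author_counts):
    """Full counting: every paper counts as 1, regardless of team size."""
    return len(author_counts)


def fractional_count(author_counts):
    """Fractional counting: a paper with n authors counts as 1/n.

    author_counts holds the number of authors of each paper of one researcher.
    """
    return sum(1.0 / n for n in author_counts)
```

For a researcher with one single-author paper, one two-author paper and one four-author paper, full counting gives 3 papers, whereas fractional counting gives 1 + 0.5 + 0.25 = 1.75.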

3.8. Reviewing the reviewers

To evaluate the reviewer features, two common assumptions were investigated: the higher reliability of international reviewers and the improved efficiency associated with a higher number of applications evaluated by a given reviewer.

Of all reviews with known nationality, 82.7% (n = 27,225) were prepared by national reviewers, and 17.3% (n = 5696) were prepared by international reviewers. Correlation coefficients were computed as described above and are displayed in Figures 3 and 4. When analyzing the correlation between reviewer scores and subsequent publication performance, the overall score and the proposal scores delivered by national reviewers were significantly better than those by international reviewers (corr. coeff. = 0.18 vs. 0.11, p = 0.021; and corr. coeff. = 0.18 vs. 0.09, p = 0.021, respectively, Fig. 7A). At the same time, the scores given for the researcher himself/herself were similar (p = 0.15).

Finally, reviewers were also split according to the number of applications assessed by the reviewer in the particular review round. The basic research grants are opened once per year, and the yearly number of reviews by the reviewer was used regardless of proposal type. All reviews were split into five cohorts: those who reviewed only one proposal (n = 15,783), those who reviewed two (n = 6822), those who reviewed three (n = 3732), those who reviewed four or five (n = 3107), and those who reviewed more than five (n = 3477) proposals in the actual year. Those who reviewed only one proposal had lower efficiency for overall and application scores (0.11 and 0.12) than those who reviewed two proposals (0.15 and 0.16, for application and overall scores, respectively). However, further increasing the number of proposals evaluated by the reviewer did not affect reviewer performance (Fig. 7B).

Fig. 4. Nonfunded researchers have associations similar to those funded, but reviewers' scores reach better correlations. The table lists the correlation of scientific output during the proposed grant running time to proposal parameters for those not funded. Any correlation with a coefficient below 0.1 can be considered unimportant. Reviewer scores, especially the assessment of the PI, provide improved assessment but still fall far below the scientometric parameters of the PI at submission. PI: principal investigator; Q1: rank of the journal in the first quartile according to the SCImago Journal Rank database; first/last: only publications where the PI is either first or last author. Note: the numbers of reviews for the funded and rejected proposals do not add up to the total number of reviews because for some of the proposals, the contract agreements were not signed, and these were excluded from this analysis.

Fig. 5. Correlations between reviewer scores/scientometric parameters of the PI at proposal submission and publication output are similar in the three major scientific disciplines. The table lists the correlation of scientific output during the proposed grant running time to proposal parameters including reviewer scores and scientometric parameters of the PI at grant submission. Any correlation with a coefficient below 0.1 can be considered unimportant. PI: principal investigator; Q1: rank of the journal in the first quartile according to the SCImago Journal Rank database; first/last: only publications where the PI is either first or last author.

4. Discussion

Fig. 6. Correlation between reviewer scores/scientometric parameters of the PI at proposal submission and publication output using fractionalized publication data. In this analysis, we normalized the value of each paper for the number of the authors of this particular paper. The table lists the correlation of scientific output during the proposed grant running time to proposal parameters including reviewer scores and scientometric parameters of the PI at grant submission. Any correlation with a coefficient below 0.1 can be considered unimportant. PI: principal investigator; Q1: rank of the journal in the first quartile according to the SCImago Journal Rank database; first/last: only publications where the PI is either first or last author.

We observed a remarkably strong effect of a 47% increase in publication output following the receipt of a basic research grant. Previously, Jacob and Lefgren investigated a similarly sized sample with 54,741 observations when assessing NIH research grant applications and observed a relatively small effect of only a 7% increase in publication yield following the receipt of a research grant. This can be explained by the abundant sources of non-NIH-based funding opportunities in the US; in fact, there was no difference in the total number of funding sources between grant winners and losers in their study (Jacob & Lefgren, 2011). This difference emphasizes the principal role of NRDIO in Hungary, as unsuccessful applicants have markedly less funding and must wait a year for a new opportunity to submit a grant as a principal investigator. Of course, studies in collaboration with coauthors, small funding programs and institution-based resources can also enable these projects to continue without direct NRDIO support.
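The 47% figure quoted at the start of this discussion appears to be consistent with the yearly Q1 averages given in the abstract for funded and rejected applicants (1.91 vs. 1.30). The derivation below is an assumption about how the value was obtained, since the text does not state the calculation explicitly:

```latex
\[
\frac{\bar{Q}_{\text{funded}} - \bar{Q}_{\text{rejected}}}{\bar{Q}_{\text{rejected}}}
= \frac{1.91 - 1.30}{1.30}
= \frac{0.61}{1.30}
\approx 0.47
\]
```

Using the value of 1.31 reported for rejected applicants in Section 3.2 instead of 1.30 would give approximately 0.46, so the exact percentage depends on rounding.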

As we see from the results, when predicting future scientific productivity, reviewer scores were only minimally better than random parameters, and the strongest correlation was observed with the scientometric parameters of the PIs at proposal submission. The limited value of grant review has been documented in other studies as well. At the NIH, reviewer-provided percentile scores had a very poor correlation with publication yield (Fang, Bowen, & Casadevall, 2016). In Australia, inflated reviewer-based grant evaluation resulted in an almost random distribution of funds (Graves, Barnett, & Clarke, 2011). In our previous analysis, we evaluated the Momentum excellence program of the Hungarian Academy of Sciences and showed that the evaluation scores received from the grant review experts were independent from subsequent scientific output (Gyorffy, Nagy, Herman, & Torok, 2018).

Multiple studies have shown that reviewers suffer from multiple biases and are far from being objective. For example, single-blind reviewing confers a significant advantage for famous researchers and scientists from high-prestige institutions (Tomkins, Zhang, & Heavlin, 2017). Reviews prepared by those with higher levels of self-assessed expertise have a tendency to be stricter (Gallo, Sullivan, & Glisson, 2016). When a research topic is interdisciplinary, its funding success rate is lower (Bromham, Dinnage, & Hua, 2016), probably due to the lack of adequate experts capable of providing an objective evaluation. The success rate of a proposal can be increased simply by increasing the number of applicants' own publications among the proposal references (Boyack, Smith, & Klavans, 2018). In addition, selecting reviewers nominated by the applicants themselves also results in a significant systemic bias (Marsh, Jayasinghe, & Bond, 2008).

These limitations have already prompted some to call for a lessening in grant reviewing. Fang and Casadevall even promoted the idea of replacing review panels with a modified lottery (Fang & Casadevall, 2016). Our results suggest that there is an alternative in which the proposal evaluation process could be more evidence-based and shortened through the more intensive usage of past publication data.

It is important to debate the predictive validity of grant decisions. Different metrics are available for this purpose, including bibliometrics, securing tenure positions, future funding success, patenting, and international collaborations. Of these, bibliometrics is by far the most widely utilized technique (Gallo & Glisson, 2018). In a US-based study, independent of output measure, 91% of studies provided evidence for at least some predictive validity of review decisions (Gallo & Glisson, 2018); our results deliver independent validation for these findings, as the reviewer scores had a small but significant correlation to future output. On the other hand, a European study comparing funded and non-funded proposals unveiled the lack of any predictive validity when grantees were compared to the best performing non-successful applicants (van den Besselaar & Sandström, 2015). Here, we also demonstrate that past performance is a better predictor of future output regardless of funding success.

Fig. 7. Reviewing the reviewers. After computing a correlation between reviewer scores and subsequent scientific output, the reviews were split according to reviewer nationality (A) and according to the number of applications assessed by the reviewer in the given calendar year (B). International reviewers were significantly less efficient in their overall scores (p = 0.021) and application scores (p = 0.021) than national reviewers. Increasing the number of applications reviewed over two did not affect the review efficiency.

Of note, the use of publication data as a pre-evaluation tool for grant proposals has already been partially introduced, as it is taken into account in the evaluation process when deriving a score for the applicant by the reviewer, and these scores showed the best correlation in our analysis. The age- and scientific discipline-standardized objective data of previous publications can be used in a way that would result in an objective ranking. Such a ranking would enable the filtering of the best and worst proposals, which could help to speed up the evaluation process and use expertise where it is needed, without wasting resources on proposals that are highly likely to be accepted because of their authors' recent publication activities as well as on proposals that are unlikely to be accepted due to extremely weak prior publication performance. Of course, it is plausible that despite a previously underperforming publication record, an applicant makes a brilliant proposal. To decide this, experts will always be needed. However, no evidence suggests that such cases will occur frequently.
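One possible form of such an age- and discipline-standardized ranking is sketched below in Python. Everything in this sketch is a hypothetical illustration and not a procedure described in the study: the record fields (`id`, `discipline`, `age`, `q1_per_year`), the 10-year age bands, and the use of a within-group percentile are all assumptions, and ties are not treated specially.

```python
from collections import defaultdict


def rank_within_groups(applicants, age_band_years=10):
    """Map applicant id -> percentile rank within its (discipline, age band) group.

    applicants: iterable of dicts with the hypothetical keys
    "id", "discipline", "age" and "q1_per_year".
    A percentile of 1.0 marks the highest prior Q1 output in the group.
    """
    groups = defaultdict(list)
    for a in applicants:
        # Applicants are only compared with peers of similar age and field.
        key = (a["discipline"], a["age"] // age_band_years)
        groups[key].append(a)

    percentiles = {}
    for members in groups.values():
        members.sort(key=lambda a: a["q1_per_year"])
        n = len(members)
        for position, a in enumerate(members, start=1):
            percentiles[a["id"]] = position / n
    return percentiles
```

In a scheme of this kind, only applicants near the middle of their group's distribution would need the full attention of expert reviewers, while the thresholds for early filtering would remain a policy decision.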

Another solution would be the improvement of peer review by increasing its objectivity. One option for this is the use of international experts instead of local reviewers. International experts might have an independent overview of the field. They also do not have national connections, and therefore, one could expect an objective and unbiased evaluation. Quite surprisingly, when comparing the efficiency of national and international experts, we have uncovered a markedly worse performance of international reviewers. It is possible that international reviewers use their own country as a reference for the evaluation, and this results in their inconsistent scoring of the evaluated proposals. Further research is needed, however, to identify the exact causes of this phenomenon.

Per se, it is not new that researchers who had a strong scientific publication output will have better publication output in the future. The so-called 'Matthew effect' refers to this phenomenon (Merton, 1968). It has also been demonstrated that the Matthew effect is reinforced by different research metrics like the Hα index (Bornmann, Ganser, Tekles, & Leydesdorff, 2017). The Matthew effect also holds for science funding, and early funding itself enables acquiring later funding (Bol, de Vaan, & van de Rijt, 2018). The bottom line is that reviewers have two jobs: not only to predict the future development of researchers' careers but also to evaluate whether the proposals are good and whether the PIs can provide what they promise in the proposals.

We have to note a limitation in our study: we focused on the principal investigators of the grant proposals only, and we did not take into consideration the co-investigators. However, there is no predefined number of researchers involved in a proposal, and each PI can decide how extensively teamwork is needed for the given project. On the other hand, identifying all participants in each study would only be possible by manually screening each application. Due to lack of data, we also had to omit the number of collaborators and the sums of grant budgets. Finally, we also did not evaluate previous grants; if we consider a prolonged effect of 5-10 years after a successful application, such an analysis would need data for grants going back to 1996. The qualities and quantities of these factors could have a similar effect on future performance.

In summary, the results of our analysis suggest that publication data could be used as an objective, independent and robust decision support tool. The publication data also make it possible not only to simply measure the application individually but also to establish an age- and scientific discipline-specific publication-based ranking between the applicants. Such an approach could be employed as an early filter, enabling the experts involved in the evaluation process to rapidly assess applicants' potential. Our results can help to set the basis for more reliable and accelerated future grant schemes.

Competing interests

None.

Author contributions

Balázs Győrffy: Conceived and designed the analysis, Contributed data or analysis tools, Performed the analysis, Wrote the paper.

Péter Herman: Collected the data, Contributed data or analysis tools, Wrote the paper.

István Szabó: Collected the data, Contributed data or analysis tools, Wrote the paper.

Acknowledgements

The research group was supported by the KH-129581 grant of the National Research, Development and Innovation Office, Hungary.

Appendix A. Supplementary data

Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.joi.2020.101050.

References

Arora, A., & Gambardella, A. (1997). Impact of NSF support for basic research in economics. University Library of Munich, Germany.

Auranen, O., & Nieminen, M. (2010). University research funding and publication performance – An international comparison. Research Policy, 39(6), 822–834.

Azoulay, P., Graff Zivin, J., & Wang, J. (2008). Superstar Extinction. National Bureau of Economic Research, Inc.

Bol, T., de Vaan, M., & van de Rijt, A. (2018). The Matthew effect in science funding. Proceedings of the National Academy of Sciences of the United States of America, 115(19), 4887–4890.

Bornmann, L., Ganser, C., Tekles, A., & Leydesdorff, L. (2017). Does the hα-index reinforce the Matthew effect in science? The introduction of agent-based simulations into scientometrics. Quantitative Science Studies, 0(0), 1–16.

Boyack, K. W., Smith, C., & Klavans, R. (2018). Toward predicting research proposal success. Scientometrics, 114(2), 449–461.

Bromham, L., Dinnage, R., & Hua, X. (2016). Interdisciplinary research has consistently lower funding success. Nature, 534, 684.

Fang, F. C., & Casadevall, A. (2016). Research Funding: the Case for a Modified Lottery. mBio, 7(2), e00422–16.

Fang, F. C., Bowen, A., & Casadevall, A. (2016). NIH peer review percentile scores are poorly predictive of grant productivity. eLife, 5.

Gallo, S. A., & Glisson, S. R. (2018). External Tests of Peer Review Validity Via Impact Measures. Frontiers in Research Metrics and Analytics, 3(22).

Gallo, S. A., Sullivan, J. H., & Glisson, S. R. (2016). The Influence of Peer Reviewer Expertise on the Evaluation of Research Funding Applications. PLoS One, 11(10), Article e0165147.

Graves, N., Barnett, A. G., & Clarke, P. (2011). Funding grant proposals for scientific research: retrospective analysis of scores by members of grant review panel. BMJ, 343, d4797.

Gyorffy, B., Nagy, A. M., Herman, P., & Torok, A. (2018). Factors influencing the scientific performance of Momentum grant holders: an evaluation of the first 117 research groups. Scientometrics, 117(1), 409–426.

Jacob, B. A., & Lefgren, L. (2011). The Impact of Research Grant Funding on Scientific Productivity. Journal of Public Economics, 95(9-10), 1168–1177.

Lariviere, V., & Costas, R. (2016). How Many Is Too Many? On the Relationship between Research Productivity and Impact. PLoS One, 11(9), Article e0162709.

Man, J. P., Weinkauf, J. G., Tsang, M., & Sin, D. D. (2004). Why do some countries publish more than others? An international comparison of research funding, English proficiency and publication output in highly ranked general medical journals. European Journal of Epidemiology, 19(8), 811–817.

Marsh, H. W., Jayasinghe, U. W., & Bond, N. W. (2008). Improving the peer-review process for grant applications: reliability, validity, bias, and generalizability. The American Psychologist, 63(3), 160–168.

Merton, R. K. (1968). The Matthew Effect in Science. The reward and communication systems of science are considered. Science, 159(3810), 56–63.

Payne, A., & Siow, A. (2003). Does Federal Research Funding Increase University Research Output? The BE Journal of Economic Analysis & Policy, 3(1), 1–24.

Sandstrom, U., & van den Besselaar, P. (2016). Quantity and/or Quality? The Importance of Publishing Many Papers. PLoS One, 11(11), Article e0166149.

Sandström, U., & Van den Besselaar, P. (2018). Funding, evaluation, and the performance of national research systems. Journal of Informetrics, 12(1), 365–384.

Symonds, M. R., Gemmell, N. J., Braisher, T. L., Gorringe, K. L., & Elgar, M. A. (2006). Gender differences in publication output: towards an unbiased metric of research performance. PLoS One, 1, e127.

Tomkins, A., Zhang, M., & Heavlin, W. D. (2017). Reviewer bias in single- versus double-blind peer review. Proceedings of the National Academy of Sciences of the United States of America, 114(48), 12708–12713.

van den Besselaar, P., & Sandström, U. (2015). Early career grants, performance, and careers: A study on predictive validity of grant decisions. Journal of Informetrics, 4, 826–838.

van den Besselaar, P., Heyman, U., & Sandström, U. (2017). Perverse effects of output-based research funding? Butler’s Australian case revisited. Journal of Informetrics, 11(3), 905–918.

Wang, D. D. (2019). Performance-based resource allocation for higher education institutions in China. Socio-Economic Planning Sciences, 65(C), 66–75.