how many citations before an article is "influential"?


Can't make spagetti
i need to conduct a "systematic" review of sorts of "influential" articles. i thought the best way to judge whether an article is "influential" is by the number of times it's been cited.

i guess this changes by discipline so in my case it is within the context of simulation studies analyzing the robustness properties of certain types of multivariate statistical methods.

my rather arbitrary cut-off is 100. opinions?
spunky, citation count is not the best method to determine that. It is quite possible for a bad article to be cited many times because it is being criticized in different articles. It is also possible that a potentially very influential article has not been cited many times (yet) because it was published recently... And it is possible that a paper is cited frequently by its own author in that author's other papers...

So this is not completely objective and needs some level of subjective judgement... However, you can rule out the confounders, which makes it much more reproducible and objective:

1. Find the self-citations and remove them from the number of citations.
2. Find the citations in letters to editors, or in systematic reviews that EXCLUDED that paper. When a systematic review excludes a paper, the excluded paper still gets cited, despite being disqualified... So citations from letters to editors and from reviews that excluded the paper each point to a negative score. Either remove those citations or, as a better plan, give each of them a NEGATIVE score too (so subtract 2 citations for each letter to the editor or excluding review that cites the article of interest)...
3. Then count the number of citations.
4. Divide the number of citations by the number of the months passed since the publication date of that paper (not even years, but months)...

The calculated value would be much more reliable as a measure of the paper's influence. Note that a self-citation does not always mean the study was not really influential and the authors were trying to inflate their h-index. Sometimes the leaders of a field have very influential papers that should be cited even in their own later studies. This is where I say it needs subjective judgement... but if the number of citations is very high, a few self-citations don't matter.
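The four steps above can be sketched roughly as follows. This is only an illustration: the function names, the input counts and the dates are made up, and identifying which citations are self-citations or "negative" ones still has to be done by hand.

```python
from datetime import date

def months_between(published: date, today: date) -> int:
    """Whole calendar months elapsed; at least 1 to avoid dividing by zero."""
    months = (today.year - published.year) * 12 + (today.month - published.month)
    return max(months, 1)

def adjusted_citation_rate(total: int, self_cites: int, negative_cites: int,
                           published: date, today: date) -> float:
    """Net citations per month since publication.

    Self-citations are removed (step 1). Each negative citation (letter to
    the editor, or a review that excluded the paper) costs 2: one to drop it
    from the count, one as the penalty (step 2).
    """
    net = total - self_cites - 2 * negative_cites      # steps 1 to 3
    return net / months_between(published, today)      # step 4

# Hypothetical paper: 120 citations, 10 self-citations, 5 negative citations,
# published 60 months before the evaluation date.
rate = adjusted_citation_rate(120, 10, 5, date(2010, 1, 1), date(2015, 1, 1))
print(round(rate, 3))
```

Whether every self-citation should really be removed is the subjective part mentioned above, so `self_cites` could just as well count only the ones judged to be padding.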

Then how to determine the threshold? 100 citations for example? I think

1. You should rely on the value I described above, that is, the net citations divided by the time passed since publication.
2. You should find all the relevant papers and count their citations.
3. You should settle on a good value based on what you are seeing in your field. For example, in my field the most highly cited article from 1970 may have been cited 300 times at most... but in another field (cancer), an article can be cited 2000 times in 10 years... So you should judge this by checking your own literature, and not only by the number of citations but by the number of citations per unit of time.
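One way to make "judge by what you see in your field" concrete is to rank a paper's citations-per-month value against the same value for the other relevant papers you collected. The percentile idea and the function name below are my own illustration, not an established rule, and the rates are invented numbers.

```python
def field_percentile(rate: float, field_rates: list[float]) -> float:
    """Fraction of papers in the field whose rate is at or below `rate`."""
    if not field_rates:
        raise ValueError("need at least one paper from the field")
    at_or_below = sum(1 for r in field_rates if r <= rate)
    return at_or_below / len(field_rates)

# Hypothetical citations-per-month values for four papers in one field.
field_rates = [0.1, 0.5, 1.0, 2.0]
share = field_percentile(1.0, field_rates)
print(share)
```

Where to draw the line (top 10%, top 25%, ...) is still a judgement call, but it is at least a call made relative to the field rather than a raw count.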


Phineas Packard
Page Rank = prestige
Citations = popularity

Page rank is harder, obviously, to generate but I think better at getting at influence

EDIT: There is a Google Scholar package for R that you could use to construct a cross-citation matrix :)
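To show what PageRank on a cross-citation matrix looks like, here is a minimal power-iteration sketch. It is plain Python rather than the R package mentioned above, the four-paper citation graph is a toy example, and the damping factor of 0.85 is just the conventional default.

```python
def pagerank(cites: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 100) -> dict[str, float]:
    """PageRank by power iteration. `cites` maps a paper to the papers it cites."""
    papers = sorted(set(cites) | {q for refs in cites.values() for q in refs})
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper starts each round with the "random jump" share.
        new = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            refs = cites.get(p, [])
            if refs:
                # A paper passes its rank equally to everything it cites.
                share = damping * rank[p] / len(refs)
                for q in refs:
                    new[q] += share
            else:
                # A paper that cites nothing spreads its rank over all papers.
                share = damping * rank[p] / n
                for q in papers:
                    new[q] += share
        rank = new
    return rank

# Toy graph: A and B both cite C, D cites A, C cites nothing.
toy = {"A": ["C"], "B": ["C"], "C": [], "D": ["A"]}
ranks = pagerank(toy)
print(max(ranks, key=ranks.get))
```

The difference from a raw count is that a citation from a highly ranked paper is worth more than one from a paper nobody cites.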


Can't make spagetti
Perhaps "citations per year since publication" would be a more reasonable metric?
Uhm. I didn't think about that, which is a good alternative metric.

nevertheless, the question still remains... how many before you'd say "oh wow, this article is indeed influential".


Can't make spagetti
I guess you missed my comment
I didn't, but I don't have the time to implement it. That would be a full project in itself and i'm already short on time. At this point what I could really use is just some completely arbitrary reference from someone that I can cite. It's like Bradley's robustness criterion for empirical Type 1 error rates in simulation studies. He provides a completely arbitrary and fantastically made-up estimate of how much the empirical Type 1 error rate is allowed to vary before a method is considered robust or not. and he gets cited... A LOT.

so I was kind of hoping for something like "as such and such said, an article with *** # of relevant citations (so not the author citing him/herself and whatnot) is considered relevant". but since systematic reviews are not really my area I was sort of just reaching out.


Cookie Scientist
nevertheless, the question still remains... how many before you'd say "oh wow, this article is indeed influential".
Well, I guess this is field-dependent, not to mention dependent on my subjective definition of "influential" of course, but for me and my field, my first rough guess would be like... 400?