Statistical quote of the day


Ambassador to the humans
I guess you could use this as a workaround if you don't have fortune installed (on any system). It requires internet access, isn't quite as fast as running fortune locally, and doesn't let you pass any options to fortune, but here it goes:
# require(RCurl) # RCurl needs to be installed but we'll use :: notation to use the function of interest
fortune <- function(){
Example use:
> fortune()

Applying computer technology is simply finding the right wrench to pound in the correct screw.
> fortune()

"Happiness is Planet Earth in your rear-view mirror."
-- Sam Hurt
Don't take this as me saying that you shouldn't use a *nix though.
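
A rough sketch of the general idea, and not the function from this post: it assumes the text being read uses the classic fortune data-file layout, where entries are separated by lines containing only "%". The names parse_fortunes and random_fortune are made up for illustration, and it uses base R's readLines rather than RCurl, so the source can be a URL, a local file path, or an open connection.

```r
# Sketch only: split text into fortunes, assuming entries are separated
# by lines that contain just "%" (the classic fortune data-file layout).
parse_fortunes <- function(lines) {
  is_sep <- trimws(lines) == "%"
  # Every "%" line starts a new group, so cumsum() gives a group id per line
  group <- cumsum(is_sep)
  entries <- split(lines[!is_sep], group[!is_sep])
  # Re-join the lines of each entry into a single string
  entries <- vapply(entries, paste, character(1), collapse = "\n")
  # Drop names and any blank entries
  unname(entries[nzchar(trimws(entries))])
}

# Read from `source` (URL, file path, or connection), print one random
# fortune, and return it invisibly.
random_fortune <- function(source) {
  lines <- readLines(source, warn = FALSE)
  entries <- parse_fortunes(lines)
  if (length(entries) == 0) stop("no fortunes found")
  chosen <- entries[sample.int(length(entries), 1)]
  cat("\n", chosen, "\n", sep = "")
  invisible(chosen)
}
```

To try it without a network connection, wrap a character vector in textConnection() and pass that as the source.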


Ambassador to the humans
Yeah but like you said those are R specific - if you want the entire group of fortunes from the unix fortune program you need to do something else.


Ambassador to the humans
To paraphrase provocatively, 'machine learning is statistics minus any checking of models and assumptions'.

-- Brian D. Ripley (about the difference between machine learning and statistics) useR! 2004, Vienna (May 2004)

Obtained from the fortunes package in R.


Super Moderator
On statistical quotes...

I will quote Walt Brainerd from The Fortran Company, Tucson, AZ (2003).

In an article he asserts:

“Let us start with a bold assertion: Fortran is still the best programming language for numerical/scientific computing. The reasons could be discussed and debated extensively, but they include:

1. There is a large investment in scientific software written in Fortran, including extensive libraries.

2. There is a large investment in the training and experience of scientists that do programming.

3. The language is more straightforward to learn and use than most "modern" languages.

4. Fortran produces efficient code.

5. Fortran is very portable: source code compiles on many platforms with little need for conditional compilation and results are consistent, particularly when executed on standard floating point hardware.

The reason I make this statement is because it means that the continued development and implementation of Fortran will be important in the twenty-first century for the same reasons (listed above) that it has been important in the twentieth century”


His points are all very good in my view.


Ambassador to the humans
Kelvin Lam: My institute has been heavily dependent on SAS for the past while,
and SAS is starting to charge us a very deep amount for license renewal. Since
we are a non-profit organization that is definitely not sustainable. The team
is brainstorming possibility of switching to R, at least gradually. I am
talking about the entire institute with considerable number of analysts using
SAS their entire career. There's a handful of us using R regularly. What kind
of problems and challenges have you faced?
Frank Harrell: One of your challenges will be that with the increased
productivity of the team you will have time for more intellectually challenging
problems. That frustrates some people.
-- Kelvin Lam and Frank Harrell
R-help (July 2009)


Ambassador to the humans
If you really want to assess uncertainty you need to take into account that the
models are false and that several models may capture different aspects of the
data and so be false in different ways.
-- Brian D. Ripley


Frank Harrell: One of your challenges will be that with the increased
productivity of the team you will have time for more intellectually challenging
problems. That frustrates some people.
-- Kelvin Lam and Frank Harrell
I gotta use this in my dissertation somehow. That's my challenge. :)


As part of my reading about bootstrapping I found this in the Acknowledgements section of Efron's original description of the technique:

I also wish to thank the many friends who suggested names more colorful than Bootstrap, including Swiss Army Knife, Meat Axe, Swan-Dive, Jack-Rabbit, and my personal favorite, The Shotgun, which, to paraphrase Tukey, "can blow the head off any problem if the statistician can stand the resulting mess."
(Efron B. Bootstrap methods: another look at the jackknife. The Annals of Statistics. 1979;7(1):1-26.)


Ambassador to the humans
"Everybody believes in the exponential law of errors [i.e., the Normal distribution]: the experimenters, because they think it can be proved by mathematics; and the mathematicians, because they believe it has been established by observation."

Whittaker, E. T. and Robinson, G. "Normal Frequency Distribution." Ch. 8 in The Calculus of Observations: A Treatise on Numerical Mathematics, 4th ed. New York: Dover, pp. 164-208, 1967. p. 179.


Point Mass at Zero
"To consult a statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of."

R.A.Fisher, 1938.

Got this from the webpage of Cambridge's Centre for Applied Medical Statistics (CAMS).


"computer packages have played the role of master rather than slave far too often"

Knapp, T. Treating ordinal scales as interval scales: An attempt to resolve the controversy. Nursing Research, 1990, 39, 121-123.


“It is easy to lie with statistics. It is hard to tell the truth without statistics.”

Andrejs Dunkels

Quoted just after the title page in:
Maindonald J., Braun W. J., Data Analysis and Graphics Using R, 2007

(and there is the one below...)


Global Moderator
The combination of some data and an aching desire for an answer does not ensure
that a reasonable answer can be extracted from a given body of data.

-- John W. Tukey The American Statistician 40(1), 72-76 (February 1986)


No cake for spunky
I like this one by Dason (don't know if he created it or got it somewhere else). :p

Student: "How do we know we have enough samples for the CLT to apply to our data?"

Professor: "Just make sure your n is close to infinity. An n of 30 seems to do the trick for most situations."

So now I know what infinity is, at least in "most situations." :)