Can someone explain to me the pros and cons of all these software packages? Which one would you use if you can start all over again? Thanks.
I use R now, and I would use R again if I had to start over.
Here's my sense, not from experience but from parroting what I have heard.
Kill Matlab off your list. As a statistics tool it would be the primary tool only of applied mathematicians involved primarily in modeling and secondarily in statistics.
Replace it with SPSS and link R and S for a new list. So now your list is:
R/Splus, SAS, SPSS
OK, Splus versus R. Splus is a proprietary system. R is based on it and free. R has more support at this point from the intellectual community. Splus still has an enterprise presence. There is no reason to use Splus ahead of R unless you are in or preparing for a specific job or someone else is paying the bills. And Splus is not always "better". For example, at the time of publishing, a book called MASS noted that an operation took 15 minutes under Splus while it only took 90 seconds under R. The advantage of an open source project that people are actually working on is that glaring inefficiencies don't survive very long.
R versus SAS. I've used both. In a nutshell, R gives you nothing that you do not know how to ask for; SAS gives you lots for asking very little, but it will seem like nothing unless you know what to look for.
R evolves faster than SAS. SAS has higher standards.
R is more programmatic than SAS. SAS is a little more cookbook.
SAS has stronger support for huge databases and enterprise level data management. It remains very very popular in big business. But it has an incredibly expensive license for the full version.
SPSS is largely a graphical data analysis environment where the scripts that can be generated aren't often used. It's a point-and-click data analysis environment. It has incredible worldwide infiltration in the social sciences.
Personally I use R.
I think it depends on what you want to do.
R is the strongest software; it allows you to do nearly anything, as long as you know how to.
SAS is also strong. I personally don't like it that much, but it's one of the strongest packages out there.
Speaking of SAS, if you want easy life with nice graphics, you should try JMP, superb software.
I am not a fan of SPSS, I prefer Minitab and Statistica over it.
If you work with Biostatistics or longitudinal data analysis, you should use Stata, also a very powerful software.
If you are good at Excel, use XLStat.
About Matlab, I think that just as R, software for statisticians, has mathematical functions, Matlab, software for mathematicians, has statistical models built in. You can do a lot with Matlab, but some statistical methods will be easier to use in other software.
I use JMP and Stata
I think R is definitely not for beginners or the faint of heart, as it involves learning a whole new computer language. I've worked a lot with SPSS v15. It's extremely easy to use, like Minitab. If you are totally green, check out InStat by GraphPad; it walks you through everything and pats you on the back afterwards.
I use Matlab. It's really powerful; MathWorks says it is the fastest there is (and I believe it). You can do anything, with many, many ready-to-use functions for almost any field. It's easy to use, with fancier graphs than other software I have used.
If you had to start all over again with a new software package, then the one you would choose would depend on your level/area of expertise and what software you have used in the past.
If it was me, because I'm only a rudimentary statistician, I would want to start over again with something simple but very capable like Minitab.
I am learning R at the moment - but I certainly don't depend on it yet. So for me I would avoid starting over exclusively with something like this.
SigmaStat is too user-friendly for me now, but it provided quite a nice stepping stone to Minitab.
In lieu of Minitab I'd probably use SPSS - but only if I wasn't the one paying for the license.
Otherwise I'd make do with Excel (i.e., program my own tests) while I transitioned over to R.
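To give an idea of what "programming my own tests" involves, here is a minimal sketch of a hand-rolled two-sample test. It is written in Python rather than Excel or R purely for illustration, it uses Welch's unequal-variance t statistic as one possible choice, and the function name `welch_t` and the sample numbers are made up for the example.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each mean: sample variance (n - 1 denominator) over n
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical measurements, only to show the call
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.4, 4.0, 4.8, 4.3]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The p-value still has to come from a t distribution table or a stats package, which is exactly the kind of step where a hand-built spreadsheet version tends to go wrong.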
Last edited by Silvanus; 11-06-2008 at 03:42 PM. Reason: ambiguity
R is certainly a very good statistical analysis package. It is very comprehensive, easy to learn and compatible with its proprietary counterpart S. Its one drawback is that it becomes very slow, or may even fail, when dealing with extremely large sets of data.
This, however, is unlikely to be a problem unless you've got millions of data points to process.
Matlab is not really a stats package, but a numerical analysis tool, although you can do statistical operations with it. If you needed to use Matlab, however, you'd be better off getting the free alternative, Octave. It is largely compatible with Matlab, but you don't have to pay $$$$ to get a copy.
Similarly, you'd be crazy to pay the license fee for SPSS when there is PSPP which is the free version of SPSS. Unlike the "student" versions, you don't have any limit on case counts and there's no expiry date. It's fast obsoleting SPSS, in the way that R has obsoleted S.
Don't be tempted to use Excel except for VERY simple work. It's just too easy to make mistakes and not notice them.
Personally I would avoid Minitab because it's very outdated these days.
I use R in my line of work. I did a lot of research on using it. While others have sung praises of it... It can be very difficult to use more advanced functions without explanation from someone else.
I personally despise using R, because doing anything interesting means spending 2-3 hours scouring mailing lists and the internet to get a basic clue of what the hell is going on. If I had a textbook that explained it (like my company would ever pay for that, dang a$$holes, and I personally don't have the money for that), maybe I wouldn't be so bitter about the language.
Also, it is almost all command line unless you get the Rcmdr package installed... Which is still quite limited (but better than nothing when your company won't buy you software.)
It depends on your skills in statistics and on the time and money you want to spend.
If you are not very rich, avoid Matlab, SAS and SPSS. SPlus and R are very similar, but R is free.
R is very complete, but you will have to spend some time understanding it. If you use Excel datasets, XLSTAT is a complete statistical tool that is not too expensive (450 US$) and covers the most important statistical methods; it can be a good solution.
Personally, I use R, SAS and XLSTAT for my research.
Using STATISTICA Version 9, you can combine R functions with custom programming and manage R results (data & graphs) in spreadsheets and workbooks.
The StatSoft website has some examples and white papers:
For me, it depends on what I am doing.
If I am using bootstrap techniques, then I would use SPlus.
If I want a variety of numerical integration techniques available or solving (large) systems of equations, or symbolic results then I would use Mathematica.
If I want a (quick) empirical confirmation of analytical derivations that I make, then I would use Minitab.
If I want to conduct a large Monte Carlo study where speed is of essence, then I would program in Fortran.
If I am teaching a basic course in inferential statistics, then I would use SPSS.
The list goes on....
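For anyone who has not met bootstrap techniques, the idea can be sketched in a few lines. This is a minimal illustration in Python, not what SPlus does internally; it uses the simple percentile method, and the function name `bootstrap_ci`, its defaults and the sample data are assumptions made up for the example.

```python
import random
from statistics import mean

def bootstrap_ci(data, stat=mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of one sample."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated exactly
    n = len(data)
    # Resample with replacement, recompute the statistic each time, then sort
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Read off the lower and upper percentiles of the resampled statistics
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical sample, only to show the call
sample = [2.3, 3.1, 2.8, 3.6, 2.9, 3.3, 2.5, 3.0, 3.8, 2.7]
low, high = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

Any statistic that takes a list and returns a number (for example `statistics.median`) can be passed as `stat`, which is the main attraction of the method.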
At the risk of sounding like a commercial (which is actually an impossibility) for R, I am going to make a not-so-complete list of the reasons I am a big fan of R.
- R is free
- R will perform virtually all common statistical methods without additional programming
- R has graphics far superior to those of SAS and Excel, rivaling most other packages
- R is extensible. If you want to write C code and interface it with R, there is a simple way to do it
- R code/data written by you can be shared with the rest of the statistics community as an R package
- R documentation, including books, manuals, tutorials, etc., is freely available
I have only ever used R and Splus and now use R most of the time in my work. For me the main advantages are
1. R is free
2. It can be programmed to do anything not just statistical functions
3. The graphics are excellent
4. There is a lot of goodwill and support from the network of other R users
5. Programs can be written and saved which gives you an exact record of how you produced your results and enables you to repeat the process with another set of data.
The speed and memory limitations in my experience are more a problem with the PC and 32 bit OS rather than with R.
Yes, it is a steep learning curve to understand the language, but there are plenty of examples and manuals available, and people who are willing to help.