- Thread starter shrek

Can someone explain to me the pros and cons of all these software packages? Which one would you use if you could start all over again? Thanks.

I use R now, and I would use R again if I had to start over.

Strike Matlab from your list. As a statistics tool, it would be the primary choice only for applied mathematicians whose work is primarily modeling and only secondarily statistics.

Replace it with SPSS and group R and S together for a new list. So now your list is:

R/Splus, SAS, SPSS

OK, Splus versus R. Splus is a proprietary system. R is based on it and is free. R has more support at this point from the intellectual community, while Splus still has an enterprise presence. There is no reason to use Splus ahead of R unless you are in, or preparing for, a specific job or someone else is paying the bills. And Splus is not always "better". For example, at the time of its publishing, a book called MASS noted that an operation took 15 minutes under Splus while it took only 90 seconds under R. The advantage of an open source project that people are actively working on is that glaring inefficiencies don't last very long.

R versus SAS. I've used both. In a nutshell, R gives you nothing that you do not know how to ask for; SAS gives you a lot for asking very little, but it will seem like nothing unless you know what to look for.

R evolves faster than SAS. SAS has higher standards.

R is more programmatic than SAS. SAS is a little more cookbook.

SAS has stronger support for huge databases and enterprise-level data management. It remains very popular in big business. But the full version has an incredibly expensive license.

SPSS is largely a graphical, point-and-click data analysis environment. It can generate scripts, but they aren't often used. It has incredible worldwide penetration in the social sciences.

Personally I use R.

R is the strongest software, it allows you to do nearly anything, as long as you know how to.

SAS is also strong. I personally don't like it that much, but it's one of the strongest packages out there.

Speaking of SAS, if you want an easy life with nice graphics, you should try JMP, a superb piece of software.

I am not a fan of SPSS, I prefer Minitab and Statistica over it.

If you work in biostatistics or longitudinal data analysis, you should use Stata, also a very powerful package.

If you are good at Excel, use XLStat.

About MatLab: I think that, just as R, software for statisticians, has mathematical functions, MatLab, software for mathematicians, has statistical models built in. You can do a lot with MatLab, but statistical methods will sometimes be easier to use in other software.

I use JMP and Stata

If you had to start all over again with a new software package, then the one you would choose would depend on your level/area of expertise and what software you have used in the past.

If it was me, because I'm only a rudimentary statistician, I would want to start over again with something simple but very capable like Minitab.

I am learning R at the moment - but I certainly don't depend on it yet. So for me I would avoid starting over exclusively with something like this.

SigmaStat is too user-friendly for me now, but it provided quite a nice stepping stone to Minitab.

In lieu of Minitab I'd probably use SPSS - but only if I wasn't the one paying for the license.

Otherwise I'd make do with Excel (i.e., program my own tests) while I transitioned over to R.

This, however, is unlikely to be a problem unless you've got millions of data points to process.

Matlab is not really a stats package but a numerical analysis tool, although you can do statistical operations with it. If you did need to use Matlab, however, you'd be better off getting the free alternative, GNU Octave. It is largely compatible with Matlab, but you don't have to pay $$$$ to get a copy.

Similarly, you'd be crazy to pay the license fee for SPSS when there is PSPP, a free alternative to SPSS. Unlike the "student" versions, it has no limit on case counts and no expiry date. It's fast making SPSS obsolete, in the way that R has made S obsolete.

Don't be tempted to use Excel except for VERY simple work. It's just too easy to make mistakes and not notice them.

Personally I would avoid Minitab because it's very outdated these days.

I personally despise using R, because doing anything interesting means spending 2-3 hours scouring mailing lists and the internet to get a basic clue of what the hell is going on. If I had a textbook that explained it (like my company would ever pay for that, dang a$$holes, and I personally don't have the money for that), maybe I wouldn't be so bitter about the language.

Also, it is almost all command line unless you install the Rcmdr package, which is still quite limited (but better than nothing when your company won't buy you software).

If you are not very rich avoid Matlab, SAS and SPSS. SPlus and R are very similar but R is free.

R is very complete, but you will have to spend some time understanding it. If you use Excel datasets, XLSTAT is a complete statistical tool that is not too expensive (450 US$) and covers the most important statistical methods. It can be a good solution.

Personally, I use R, SAS and XLSTAT for my research.

The StatSoft website has some examples and white papers:

http://www.statsoft.com/solutions/r-language-platform/

For me, it depends on what I am doing.

If I am using bootstrap techniques, then I would use SPlus.

If I want a variety of numerical integration techniques available or solving (large) systems of equations, or symbolic results then I would use Mathematica.

If I want a (quick) empirical confirmation of analytical derivations that I make, then I would use Minitab.

If I want to conduct a large Monte Carlo study where speed is of essence, then I would program in Fortran.

If I am teaching a basic course in inferential statistics, then I would use SPSS.

...

The list goes on....
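For anyone who hasn't met the bootstrap mentioned above, here is a minimal sketch of the idea: a percentile bootstrap confidence interval for a mean. It is written in Python with only the standard library, not in SPlus, and the function name `bootstrap_ci`, the sample values, the number of resamples and the seed are all assumptions made up for illustration.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `data`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Cut off alpha/2 of the resampled estimates in each tail.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]  # made-up data
low, high = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The same resampling loop works for a median or any other statistic by passing a different `stat` function. Dedicated packages offer more refined interval types than this plain percentile version.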

- R is free

- R will perform virtually all common statistical methods without additional programming

- R has graphics far superior to that of SAS and Excel, and rivals most other packages

- R is extensible. If you want to write C code and interface it with R, there is a simple way to do it

- R code/data written by you can be shared with the rest of the statistics community as an R package

- R documentation, including books, manuals, tutorials, etc., is freely available

1. R is free

2. It can be programmed to do anything not just statistical functions

3. The graphics are excellent

4. There is a lot of goodwill and support from the network of other R users

5. Programs can be written and saved which gives you an exact record of how you produced your results and enables you to repeat the process with another set of data.
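Point 5 is easy to show with a tiny example. This is only a sketch, and it is in Python rather than R, but the idea is the same in any scripted package: the analysis lives in a saved function, so it documents exactly what was done and can be rerun unchanged on a new dataset. The function name `summarize` and both datasets are made up for illustration.

```python
import statistics


def summarize(values):
    """Return the same summary statistics for any list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
    }


first_study = [12.0, 15.5, 11.2, 14.8, 13.1]        # made-up data
second_study = [9.4, 10.1, 12.3, 8.7, 11.0, 10.6]   # made-up data

# The identical procedure is applied to both datasets.
for name, data in [("first", first_study), ("second", second_study)]:
    print(name, summarize(data))
```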

The speed and memory limitations in my experience are more a problem with the PC and 32 bit OS rather than with R.

Yes, the language has a steep learning curve, but there are plenty of examples available, manuals and people who are willing to help.

I have only ever used R and SPlus and now use

The speed and memory limitations in my experience are more a problem with the PC and 32 bit OS rather than with R.

- Most of us will not run into this problem. I do occasionally, but then I reboot into Linux and I'm usually fine.

Secondly, many university labs refuse to use R because of the liability issue. You use R at your own risk, which is often unacceptable if you design bridges, aircraft or maybe medication that people will use. In short: if your bridge fails because of an inherent error in R, you are liable and not the R developers or package developers.

- This really is only a problem for a select group of fields, not for mine. I love the way contributing to R is built up like contributing to science: you submit a package you created and then it gets peer reviewed by others before it gets accepted. I therefore feel that there is a very low chance of any 'inherent error' in R. The chances of running into such a thing in a program like SPSS are much higher (no peer reviews, no huge and knowledgeable user community), thus they NEED to back up their expensive programs, whereas I feel R does not (or does this in a different way).

These are the only two meaningful problems I know about.

The steep learning curve is not really a problem. If you really can't work with code, use R-commander. It's what I recommend to grad students who haven't taken R courses. I often see them learn the commands really well with it and then dump R-commander after some time anyway.

BioStatMatt

I use SAS for my job and R for my personal research, and I have had the chance to work with multiple stats packages.

My views on stats packages are as follows:

===============================================

For newbies: use GUI stats packages such as Minitab or SPSS (most of them now have both a GUI and a programming side).

Anyone using a programming-based stats package must learn the data manipulation part. Without that it will not be fun.

In SAS -> one should be good at the BASE package

In R -> one should know what class and mode are, and list/data frame/matrix, etc.

In my opinion, the best software is the one that helps you fill all your needs easily.

Certain software such as MATHEMATICA, GAUSS and MATLAB is good, but it is not devoted to statistics, so it is somewhat weak there. Its real strengths show in areas such as simulation or complex mathematics.

I also avoid certain software such as STATISTICA or EXCEL. I know that they are widely used, but they still have many flaws in their procedures. I especially had some problems with STATISTICA in time series analysis and with some multivariate tools (I used version 7, so I'm not sure whether these problems have been fixed). In fact, the use of Excel for statistics has been criticized in certain talks (Cryer, 2001).

I'd like to make some comments on the most popular software. I personally use STATA at work, since it is one of the most complete packages I've worked with. It also accepts open contributions, so it grows fast. Besides, I think the modeling tools available in STATA are beyond those of any other multi-function package. I like SAS because programming in it is fun. Still, I'm not a huge fan of its interface. Along with STATA, it is one of the most powerful programs. I find SPSS really confusing! It is almost impossible to specify survey designs, and I haven't managed to produce graphics with the "amazing graph generator". Don't get me wrong, it can be an outstanding tool (I love the Multidimensional Scaling module). I just think the interface needs to improve a lot.

Finally, it is true that R is the Ultimate Weapon for a statistician, but using it requires full command of stats and also knowledge of programming. I think R should only be recommended to professionals with deep knowledge of the area. I mean, it is pretty easy to obtain wrong results with it if you don't use it properly.

So, I'd like to add some recommendations which haven't been mentioned here yet, especially for those starting out in the world of randomness. These packages are, in my opinion, the best options for starting with stats:

This is the easiest software I've used and has most basic tools available. It's especially good for industrial stats and quality control. I would recommend it for beginners because of its detailed help files, where you can easily learn the basics of statistical tools through examples.

This software was developed in Argentina and is quite popular in Latin America. One of its great advantages is the price, really cheap compared to the competition. You can also find many tools in this package, which uses a very clean menu interface. Previous versions lacked high-quality graphics, but I heard this is being improved.

Openstat is a free statistical package oriented to the social sciences. I consider it an amazing tool for teaching statistics. It has an easy-to-use menu interface, along with a large number of tools that display results in a simple way.

This is a very interesting menu-based package that I would also suggest to beginners. It has more statistical tools available than MINITAB and INFOSTAT, so it is a really useful alternative. I really like its friendly interface, which even has the advantage of suggesting alternatives for your analysis.

Well, I hope this helps someone choose statistical software based not only on its capabilities but also on one's own capabilities with stats.
