STATA or SAS? Which one should I learn?

#1
Hello
I am currently trying to go a little beyond SPSS by learning either SAS or Stata. My field is psychology and biostatistics. Which of these is easier, and which is more relevant to me?

My friends are all learning R, which I truly hate. Is this a common fad these days?

Any good books you would recommend?
 

Dason

Ambassador to the humans
#2
Could you expand on why you hate R? It might help give a suggestion. I love R, I don't really like SAS, and I've never used Stata.
 
#3
Sorry, I do not hate R. It may just be how our computer lab works, which forces us to run R from a server. But if you could tell me why you love R and where I can start to learn it (books, websites), I might as well learn R. I just like having everything on my own PC, rather than using a server and having to download packages.
 

Dason

Ambassador to the humans
#4
Every now and then I make a foray into R, but find it very difficult to get any real work done. I don't understand why so many people like R.
Because it makes it easy to get real work done. I'm serious - it has a steep learning curve, but once you're familiar with it, it's a very powerful tool.
 

noetsi

Fortran must die
#5
R is useful for statistical purists like Dason who only do research that no one else understands (well, 99.99997 percent of the human population anyhow). :p For most others, and for nearly all bosses in the real world, which means most employees as well, SAS, Stata, or SPSS is best. You also don't have to learn to write code.... R is extremely useful for work you will almost certainly never do outside academia (academic including pure research organizations outside academics - effectively the same thing).

I use SAS a lot (SPSS some, R rarely). I have not used Stata. I suspect you will find little meaningful difference between them for most tasks. The answer to your question likely depends on where you will work after school. In the private sector (or government outside universities) I think you will find SAS and SPSS more common and thus more useful to learn. In an academic setting R will be. Stata is (as best I can tell) not used as much as SAS or SPSS in non-academic organizations, although that is based only on personal experience, not formal studies, and thus of course could be wrong.
 

Dason

Ambassador to the humans
#6
R is useful for statistical purists like Dason who only do research that no one else understands (well, 99.99997 percent of the human population anyhow)
R is preferred by those that do research because it makes it easy to program in new methods. What this also means is that R is always on the cutting edge with respect to data analysis techniques. Maybe you don't care about that and that's fine but for a lot of people this is important and useful. I know a lot of people that work with financial data that use R almost exclusively. This is a field where having access to the latest techniques can make a company A LOT of money if they're smart enough to take advantage of it. There are plenty more examples but we've had this discussion before....

You also don't have to learn to write code....
And you don't have to learn to write code in SAS or Stata? I guess with SAS you can use Enterprise Guide but that's not "SAS" in my mind. But writing code is a huge advantage in my mind. It allows for an easily reproducible analysis. It means that if I want to run the same analysis but on a different data set I don't have to do a bunch of extra work.
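To make the reproducibility point concrete, here is a minimal sketch. It is written in Python rather than R, SAS, or Stata, purely as a neutral illustration; the function name `analyze`, the column name `score`, and both data sets are made up for the example and do not come from anyone's real analysis. The idea is only that the analysis lives in one place and the data set is an argument.

```python
import csv
import io
import statistics


def analyze(csv_text, column):
    """Return n, mean and sample standard deviation of one numeric column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
    }


# Two hypothetical data sets that share the same layout.
study_a = "id,score\n1,10\n2,12\n3,14\n"
study_b = "id,score\n1,20\n2,22\n3,27\n4,31\n"

# The same analysis is rerun on each data set without any extra work.
for name, data in [("study_a", study_a), ("study_b", study_b)]:
    print(name, analyze(data, "score"))
```

With a point-and-click workflow the second data set would mean repeating every menu step by hand; here it is one more entry in the list.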
R is extremely useful for work you will almost certainly never do outside academia (academic including pure research organizations outside academics - effectively the same thing).
R does everything that SAS/SPSS/Stata do. I don't see what point you're trying to make. Sure it has a learning curve but don't try to say that you can't do the simple stuff in R.

I use SAS a lot (SPSS some, R rarely). I have not used Stata. I suspect you will find little meaningful difference between them for most tasks. The answer to your question likely depends on where you will work after school. In the private sector (or government outside universities) I think you will find SAS and SPSS more common and thus more useful to learn. In an academic setting R will be. Stata is (as best I can tell) not used as much as SAS or SPSS in non-academic organizations, although that is based only on personal experience, not formal studies, and thus of course could be wrong.
Learning R does have another big advantage... it's free. So if you end up working at a place that doesn't have access to SAS/SPSS/Stata you don't have to pay a lot of money to get a license so you can use it to analyze data.
 

hlsmith

Omega Contributor
#7
It all depends on where you end up. From my experience, psych may use SPSS or SAS, depending on where you are. Biostatistics is very heavy in SAS (this is my area). Many government and national datasets are provided as SAS datasets, and many large hospital systems, whether or not they are affiliated with academic institutions, run SAS off servers. However, many of the other programs are great. Just keep learning the principles; then you can apply them to whichever program. Also, play around with them all, so when it comes to job interviewing time you can claim experience with all of them.

R a fad? Well it is picking up more and more steam, but I don't see it as the mainstay in the private sector anytime soon. Who knows??
 

noetsi

Fortran must die
#8
What this also means is that R is always on the cutting edge with respect to data analysis techniques.
That was actually my point. :) R is very useful for work that is very rarely used outside academia or academic-like organizations, of which there are not very many. What you use depends (as hlsmith noted above) on where you will work.

Even most of SAS is rarely going to be used in normal business or public organizations. Virtually everything I have to do (and I suspect I do much more statistics here than is common) can be done in EG. On rare occasions you have to modify the generated code with regular SAS code. Then you just insert it into the EG.
 

Dason

Ambassador to the humans
#9
That was actually my point. :) R is very useful for work that is very rarely used outside academia or academic-like organizations.
Which is fine but R containing cutting edge stuff doesn't take away from the fact that it does the basic stuff too.
 

WeeG

TS Contributor
#11
R is very popular, firstly because it's free (in contrast to SAS, which is very expensive). In addition, as was said earlier in the thread, R is the first package to have brand new techniques: if someone wrote an article about a method, most likely someone also wrote a package that implements it in R. So R is the most powerful package. However, R is very hard to use. You don't have to be a statistician to use R, but some computer science skills won't hurt.

SAS is very powerful and it's popular for a reason. SAS also allows you to run some models that other packages still don't (like generalized mixed models). For a long time I worked with Stata, purely because my company couldn't afford SAS. Now we have bought SAS, and I have to admit that I understand why it is said to be powerful. Well, it is! Stata is easier to use, and it's getting more popular by the day. The code is so simple, and you also have a GUI that helps you a lot. Stata is also extendable, kind of like R is, but with fewer extension packages.

I think that the choice of software should be based on how much you can spend, what your needs are (which models and techniques you plan to use), how good you are at writing code, and so on.

It's also worth thinking about the other options on the market and not ruling anything out. You can read about JMP (which belongs to SAS), Statistica, Minitab, and Systat (SPSS you already know, so I think I have covered most of them). Some of the names I just mentioned are VERY easy to use.

Another option that came to my mind: if you have strong Excel skills, you can try XLSTAT.
 
#13
R is great. Of course, I work in a lab conducting experiments. I wrote a script which loads the data we measured, analyzes the correlations of the 50+ variables in our experiments, and outputs beautiful graphs and LaTeX documents we use to make recommendations to the higher-ups. It took some work to get it going, but aside from a few tweaks here and there, all I have to do is click a button and the entire process is automated. My manager loves it.
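For anyone wondering what the core of such a script looks like, here is a rough sketch of the "correlate every variable with every other variable" step. It is in Python rather than R and is not the actual lab script; the variable names and numbers are invented, and the graph and LaTeX output steps are left out.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (undefined for constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_table(data):
    """Map each unordered pair of variable names to its Pearson correlation."""
    return {
        (a, b): pearson(data[a], data[b])
        for a, b in itertools.combinations(sorted(data), 2)
    }


# Hypothetical measurements; a real script would load these from files.
measurements = {
    "temperature": [1.0, 2.0, 3.0, 4.0],
    "pressure": [2.0, 4.0, 6.0, 8.0],
    "yield": [8.0, 6.0, 4.0, 2.0],
}

for pair, r in correlation_table(measurements).items():
    print(pair, round(r, 3))
```

Going from 3 variables to 50+ only means a bigger dictionary, which is the whole appeal of scripting this instead of clicking through it.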
 

noetsi

Fortran must die
#14
Of course your manager works in a lab (or above a lab). Which is the key point. What you use depends on where you work and what you do and how good you are at coding. As with most questions there are no absolute answers to this.
 
#15
I can't comment on STATA much, although I have heard good things about it and the small amount of code I've seen looks sensible.

The thread seems to have become R vs. SAS and since I use both, I can comment on that:

SAS is expensive, R is free.

SAS is extensively tested, the base part of R and most of the packages are very well tested but some packages ...

SAS has excellent tech support, R has none (although there is R-Help and StackExchange)

SAS has voluminous documentation, R has minimal documentation. Some prefer one, some the other.

SAS is behind R in terms of what it offers, but when it does offer something, it goes all out.

SAS has made improvements in graphics and its default graphs in the statistics PROCs are nice, but R is still more flexible for free-form graphs.

In R you can see (and modify) the code; in SAS you can't

BOTH can be used well inside and outside of academia.
 

maartenbuis

TS Contributor
#16
Let me give some input as a Stata user. My take on the difference between R versus Stata and SPSS (and I guess SAS) is that the focus of R is on being a programming language that can be used for statistical analysis, while Stata & Co. are primarily data analysis programs. This is more a difference in focus than in kind; the steep learning curve and flexibility of R are both a direct result of that. Both are legitimate and useful approaches, but useful to different audiences.

My interpretation of Stata is that it takes an intermediate position between R and SPSS: it is still a program that focuses on data analysis, but it explicitly enables and invites users to write new programs. Another advantage of Stata is that the documentation is excellent. Compared to SPSS, Stata has a much more consistent syntax, which makes it easier to learn. Interestingly, this also applies to the user-written commands, as Stata provides various syntax parsers and other tools to help user programmers maintain that consistency in their own programs.
 
#17
I recommend SAS and R. I won't bother listing the strengths and limitations, as many have already stated them, but both are very good to know whether you are academia- or industry-bound.

I personally use SAS for about 95% of what I do (data management, analyses, reports). I tend to use R mainly for graphics and the occasional simulation. I like the fact that you can download packages for just about anything, and some of those packages have really neat extensions that interface with other software (like WinBUGS).

By the way, I also come from a background in Psychology (undergrad) and Biostatistics (graduate)!