# Thread: How to do ANOVA test analysis in Excel 2007?

1. ## How to do ANOVA test analysis in Excel 2007?

Hi everybody,

I'm new in this forum and in statistics analysis too, hoping that somebody could help me and I'll really be grateful.

I don't know how to do an ANOVA test using excel 2007. I'm confusing which columns and/or lines to be chosen to make the ANOVA analysis.

I have a data set which looks like the attached file below. It contains values of genes expression in 3 conditions (control, mutant and double mutant mices) and each condition is repeated 5 times (5 arrays). The aim is see if there is a significant difference in gene expression between mutant and double mutant compared with the control.
For this purpose, I should to do some statistics tests and look at the p-value. But I don't have any statistics or mathematics background and when I try to do it in excel, I'm confused about which are the right columns and lines to be specified to carry out the ANOVA or student test properly.

Should I permute the columns and lines ?

Please, somebody could explain me how to make ANOVA properly in Excel 2007? which columns and lines to select and why?

Thank you very much in advance.

2. The Anova: 2 factor with replication function in Excel can be used to examine the variance in your data. Check the help file for how to set up your data and refer to the attached spreadsheet. This type of analysis is usually done before a more detailed comparison of factors and gives info on the significance of factors and interactions between factors. You have two factors : conditions & genes. I noted that the output for the "Double" is exactly the same as the control - is this really the case ?

From the analysis - here's my take:

Sample,F > Fcrit, therefore significant difference between Genes overall
Columns,F > Fcrit, therefore highly significant effect of mutation overall
Interaction, Low interaction (high P) between gene and mutation, therefore response to mutation is not significantly influenced by Gene.

3. Originally Posted by Andre Smit
The Anova: 2 factor with replication function in Excel can be used to examine the variance in your data. Check the help file for how to set up your data and refer to the attached spreadsheet. This type of analysis is usually done before a more detailed comparison of factors and gives info on the significance of factors and interactions between factors. You have two factors : conditions & genes. I noted that the output for the "Double" is exactly the same as the control - is this really the case ?

From the analysis - here's my take:

Sample,F > Fcrit, therefore significant difference between Genes overall
Columns,F > Fcrit, therefore highly significant effect of mutation overall
Interaction, Low interaction (high P) between gene and mutation, therefore response to mutation is not significantly influenced by Gene.
Thank you very much for your reply, Andre. I appreciate it. I looked at the excel help but it is not well detailed for me!

You are right, the output in "double mutant" is the same as in the control. But in fact, this is just arbitrary, I only wanted to understand how Anova functions and how it is done.

Actually, I have thousands of genes as rows and one hundred treatments as columns.
In total I have 40.000 values for genes (in rows) and 90 treatments (in columns).
My question now is: should I re-organize data as you did for all my genes or I can do anova on raw data as they are (where replicates are in rows and genes are in columns)? Because I see in your answer that you put replicates hierarchically and not as in the original file?

4. Yep, for the 2-factor ANOVA in Excel you need to arrange the data in that specific format. Here's the help I was referring to - expand the Anova tab:

Excel Help

So you'll require some copying and pasting to rearrange your data I agree it is a "weird" format. In Minitab, for example, you'd arrange your data in the column format e.g.

Response Condition Gene
0.46 Control 1
0.3 Control 1
0.8 Control 1
1.51 Control 1
0.9 Control 1
-0.1 Mutant 1
0.49 Mutant 1
0.24 Mutant 1
0.06 Mutant 1
0.46 Mutant 1
0.46 Double 1
0.3 Double 1
0.8 Double 1
1.51 Double 1
0.9 Double 1

5. Thanks again.

When one has a small set of data, it's OK to rearrange them but when we have thousands of columns and rows, there should be another mean or methode to do statistics analysis without copying and pasting ! you don't thing so ? otherwise, it is really nightmare !

On other hand, in according to excel help file, I should to apply "one way anova" but not tow factors, as I have genes values measures under only one type of conditions (treatments) ?! what do yoy thing about that?

6. ## ANOVA within Excel

Hi,

If you are looking for informations on ANOVA with Excel, you should check the microsoft website concerning the Data Analysis Toolpak, short descriptions of anova and tests are available.
Personnally, I use XLSTAT to run tests and ANOVA with Excel.

7. Originally Posted by jemz
Hi,

If you are looking for informations on ANOVA with Excel, you should check the microsoft website concerning the Data Analysis Toolpak, short descriptions of anova and tests are available.
Personnally, I use XLSTAT to run tests and ANOVA with Excel.
Hi,

What is the advantage of XLSTAT over Excel to do ANOVA, in your mind of course ? do you find xlstat adding a best capability to excel?

8. XLSTAT is more complete and can handle easily and quickly more complex models than Excel do.

Furthermore, there is many papers saying Excel data analysis tools are not very reliable. See the special number of the journal Computational Statistics and Data Analysis (Volume 52, Issue 10, Special Section on Microsoft Excel 2007, Edited by B.D. McCullough).

XLSTAT has its own stat functions and do not use Excel functions.

9. Jemz, when I wrote the notes I uploaded at that site, I did a check with spss to see any discrepancies first. There were some in the non-parametric which i think i mention too. other than that, there were slight diferrences due to small roundings at the 4th-5th digit.

11. Hey, i am new in statistics but i have the same issue as Statser. However, i would like to find a conveniet way to organize my data to be XLstat or anova friendly. My microarray data contain huge number of genes arranged by the same way as Statser's file. I am wondering if there is any reliable way to organize these magnitude of data. Thanks all !

 Tweet

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts