# Looping over all variables

#### NN_STAT

##### New Member
Hello,

I am using STATA for my analysis, but I am not a good programmer in STATA yet.

I am dealing with a multiple hypothesis testing problem.

I have a dataset of let's say n observations, and (p+1) variables, 1 dependent and p independent (n<<p). The dependent variable is nominal.

I would like to run in a loop (obviously not manually), a series of tests. If an independent variable is continuous, a t-test and if it's nominal, a chi-square test. From every test I need to keep the p-value (I need a variable containing p-values), so I can use the FDR method, using the STATA package smileplot which I already installed.

How do I do that? I have no idea where to start. If anyone has done something like this and can help me with code, it would be much appreciated. The alternative is to work in R, which I find less friendly.

thanks

#### bukharin

##### RoboStataRaptor
A t-test would be appropriate for a binary independent variable and continuous outcome, not the other way around. For a continuous predictor and binary categorical outcome I'd suggest logistic regression.

To help get you started, after logistic regression the p-value for the overall regression is returned as e(p). After -tabulate, chi2- it's returned as r(p). You can loop over variables using:
Code:
foreach var of varlist a b c { // a, b & c are your continuous independent variables
    logistic outcomevar `var'
    display "`var': p = " e(p) // do something with e(p)
}

foreach var of varlist d e f { // d, e & f are your categorical independent variables
    tabulate outcomevar `var', chi2
    display "`var': p = " r(p) // do something with r(p)
}
The "do something" is a bit tricky and it depends on what you're after. If you just want the p-values you could store them in a matrix. Otherwise you may need to create a temporary dataset, which is slightly irritating in this situation (but quite do-able).
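For the matrix option, here is a minimal sketch rather than a definitive recipe. It assumes the same placeholder names as above (`outcomevar`, and `a`, `b`, `c` as the continuous predictors); the matrix name `P` and the counter `i` are just illustrative choices.

```stata
* Sketch: one row of P per predictor, holding the model p-value
local vars a b c
local k : word count `vars'
matrix P = J(`k', 1, .)          // k x 1 matrix, initially all missing
matrix rownames P = `vars'
matrix colnames P = p

local i = 0
foreach var of local vars {
    local ++i
    quietly logistic outcomevar `var'
    matrix P[`i', 1] = e(p)      // p-value of the overall model test
}

matrix list P
```

The row names keep track of which p-value belongs to which variable. If you later need the p-values as a variable, `svmat P` copies the matrix column into the dataset in memory.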

#### NN_STAT

##### New Member
thank you for the quick reply!

I have more than 3 variables, perhaps 500 or more. Is there a way to tell the loop to run from var1 to var500 without listing them all?

The "do something" is very tricky, I have no idea how to handle it. I do need to store the p-values somehow, I don't know if a matrix is better or dataset. I need it in order to use the false discovery rate (package smileplot) so I can estimate how many type I errors I have (if I run so many tests I will have some for sure).

#### bukharin

##### RoboStataRaptor
Well here's a way of doing it using a temporary dataset. I've added "quietly" in front of each calculation to save time and screen real estate...
Code:
tempfile pvalues
foreach var of varlist a b c { // a, b & c are your continuous independent variables
    quietly logistic outcomevar `var'
    preserve
    clear
    set obs 1
    gen var="`var'"
    gen p=e(p)
    capture append using `pvalues'
    save `pvalues', replace
    restore
}

foreach var of varlist d e f { // d, e & f are your categorical independent variables
    quietly tabulate outcomevar `var', chi2
    preserve
    clear
    set obs 1
    gen var="`var'"
    gen p=r(p)
    capture append using `pvalues'
    save `pvalues', replace
    restore
}

use `pvalues', clear
As to extending it to run from var1-var500, it's well documented in -help foreach-; you may also want to look at -help varlist- and, just for good measure, -help numlist-
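As a rough illustration of the var1-var500 case: a varlist range or wildcard can replace the explicit list. The sketch below assumes the predictors really are named `var1` to `var500`, sit next to each other in the dataset, and are all continuous; `outcomevar` is a placeholder as before. It also swaps the preserve/append pattern for `postfile`, which is a different way of building the results dataset and not the method used in the code above.

```stata
* Sketch only: var1-var500 and outcomevar are placeholder names
tempname memhold
tempfile pvalues
postfile `memhold' str32 varname double p using `pvalues'

* var1-var500 selects every variable stored between var1 and var500
* in dataset order; var* would instead match on the name prefix
foreach var of varlist var1-var500 {
    capture quietly logistic outcomevar `var'
    if _rc == 0 {
        post `memhold' ("`var'") (e(p))
    }
    else {
        post `memhold' ("`var'") (.)   // model failed: record a missing p-value
    }
}

postclose `memhold'
use `pvalues', clear                   // replaces the data in memory
```

The `capture` keeps one failed model (for example, a predictor with no variation) from stopping the whole loop, and posting a missing value keeps one row per variable. The resulting dataset has one p-value variable, `p`, which is the shape the original question asks for.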