Thread: Looping over all variables

  1. #1
    Looping over all variables




    Hello,

    I am using Stata for my analysis, but I am not yet a good Stata programmer.

    I am dealing with a multiple hypothesis testing problem.

    I have a dataset of, say, n observations and (p+1) variables: 1 dependent and p independent (n << p). The dependent variable is nominal.

    I would like to run a series of tests in a loop (obviously not manually): a t-test if an independent variable is continuous, and a chi-square test if it is nominal. From every test I need to keep the p-value (I need a variable containing the p-values), so that I can apply the FDR method using the Stata package smileplot, which I have already installed.

    How do I do that? I have no idea where to start. If anyone has done something like this and can help me with code, it would be more than appreciated. The alternative is to work in R, which is less friendly.

    Thanks

  2. #2
    RoboStataRaptor
    bukharin

    Re: Looping over all variables

    A t-test would be appropriate for a binary independent variable and continuous outcome, not the other way around. For a continuous predictor and binary categorical outcome I'd suggest logistic regression.
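
    To make the distinction concrete, here is a minimal sketch of both situations. It assumes the auto example dataset that ships with Stata; the variables mpg and foreign are stand-ins, not the poster's data.

    ```stata
    sysuse auto, clear            // example data shipped with Stata (assumption)
    ttest mpg, by(foreign)        // continuous variable compared across a binary grouping variable
    display "t-test p = " r(p)    // two-sided p-value left behind by ttest
    logistic foreign mpg          // binary outcome, continuous predictor
    display "logistic p = " e(p)  // p-value for the overall model
    ```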

    To help get you started, after logistic regression the p-value for the overall regression is returned as e(p). After -tabulate, chi2- it's returned as r(p). You can loop over variables using:
    Code: 
    foreach var of varlist a b c { // a, b & c are your continuous independent variables
        logistic outcomevar `var'
        display "`var': p = " e(p) // "do something" with e(p); here it is only displayed
    }
    
    foreach var of varlist d e f { // d, e & f are your categorical independent variables
        tabulate outcomevar `var', chi2
        display "`var': p = " r(p) // "do something" with r(p); here it is only displayed
    }
    The "do something" is a bit tricky and it depends on what you're after. If you just want the p-values you could store them in a matrix. Otherwise you may need to create a temporary dataset, which is slightly irritating in this situation (but quite do-able).
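
    As a rough illustration of the matrix option, the sketch below collects one p-value per predictor into a column matrix. It assumes the auto example dataset that ships with Stata; the outcome foreign, the predictor list and the matrix name P are hypothetical stand-ins for your own names.

    ```stata
    sysuse auto, clear                 // example data shipped with Stata (assumption)
    local xvars mpg weight length      // hypothetical continuous predictors
    local k : word count `xvars'
    matrix P = J(`k', 1, .)            // one row per predictor, initially missing
    matrix rownames P = `xvars'
    matrix colnames P = p
    local i = 0
    foreach var of varlist `xvars' {
        local ++i
        quietly logistic foreign `var'
        matrix P[`i', 1] = e(p)        // p-value for the overall regression
    }
    matrix list P
    ```

    A matrix is convenient for inspection, but commands that expect the p-values in a variable need them in a dataset instead.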

  3. #3
    Re: Looping over all variables

    Thank you for the quick reply!

    I have more than 3 variables, perhaps something like 500 or more. Is there a way to tell the loop to run from var1-var500 without listing them?

    The "do something" is very tricky, and I have no idea how to handle it. I do need to store the p-values somehow, but I don't know whether a matrix or a dataset is better. I need them in order to use the false discovery rate (package smileplot), so I can estimate how many type I errors I have (if I run so many tests I will have some for sure).
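
    For orientation, the false discovery rate step can be sketched by hand once the p-values sit in a variable. The code below is a manual Benjamini-Hochberg adjustment, not the smileplot package's own routine, and the five p-values and the names var, p, rank and q are made up for illustration.

    ```stata
    clear
    input str8 var double p
    "x1" 0.001
    "x2" 0.008
    "x3" 0.039
    "x4" 0.041
    "x5" 0.60
    end
    sort p
    gen long rank = _n                      // rank of each p-value, smallest first
    gen double q = p * _N / rank            // raw Benjamini-Hochberg adjustment
    gsort -p                                // walk from the largest p-value down
    replace q = min(q, q[_n-1]) if _n > 1   // running minimum keeps q monotone in p
    replace q = min(q, 1)                   // adjusted values cannot exceed 1
    sort p
    list var p q
    ```

    Tests with q below the chosen FDR level (say 0.05) would be declared discoveries; with these made-up values that is x1 and x2.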

  4. #4
    RoboStataRaptor
    bukharin

    Re: Looping over all variables


    Well here's a way of doing it using a temporary dataset. I've added "quietly" in front of each calculation to save time and screen real estate...
    Code: 
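    tempfile pvalues // temporary file that accumulates one row per test
    foreach var of varlist a b c { // a, b & c are your continuous independent variables
        quietly logistic outcomevar `var'
        preserve // keep the original data so it can be restored below
        clear
        set obs 1
        gen var="`var'" // name of the variable just tested
        gen p=e(p) // p-value for the overall regression
        capture append using `pvalues' // fails harmlessly on the first pass, before the file exists
        save `pvalues', replace
        restore
    }
    
    foreach var of varlist d e f { // d, e & f are your categorical independent variables
        quietly tabulate outcomevar `var', chi2
        preserve
        clear
        set obs 1
        gen var="`var'"
        gen p=r(p) // p-value for Pearson's chi-squared test
        capture append using `pvalues'
        save `pvalues', replace
        restore
    }
    
    use `pvalues', clear // one observation per test: variables var and p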
    As to extending it to run from var1-var500, it's well documented in -help foreach-; you may also want to look at -help varlist- and, just for good measure, -help numlist-
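
    As a small sketch of the varlist shorthand, the loop below runs over a range of variables without listing each one. The data are simulated purely for illustration (the names outcomevar and var1 to var5 are assumptions); with real data the same range syntax would read var1-var500.

    ```stata
    clear
    set obs 50
    set seed 12345
    gen byte outcomevar = runiform() < 0.5      // made-up binary outcome
    forvalues j = 1/5 {
        gen double var`j' = rnormal()           // made-up continuous predictors
    }
    foreach var of varlist var1-var5 {          // range: every variable stored from var1 through var5
        quietly logistic outcomevar `var'
        display "`var': p = " %6.4f e(p)
    }
    ```

    Note that a range such as var1-var5 follows the order in which the variables are stored in the dataset, not the numbers in their names; a wildcard such as var* is an alternative when all the names share a prefix.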

  5. The Following User Says Thank You to bukharin For This Useful Post:

    NN_STAT (11-24-2011)
