Thread: Regression and imputing zeros

  1. #1
    Regression and imputing zeros




    Hi all!

    I have what is likely a simple regression problem.

    I am imputing variable X from survey Y onto survey Z, using a simple linear regression model and the predict command in Stata.

    So, I regress variable X in survey Y on a host of dummy variables. Then I use the predict command to impute variable X onto survey Z, applying the estimated coefficients to the same dummy variables in Z. So far, so good.

    However, the original variable in survey Y has 60% of observations equal to zero, but when I impute, none of the imputed values are zero. This matters because I need a similar proportion of zeros in my imputed variable.
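
A minimal sketch of why the zeros disappear, written in Python rather than Stata and using made-up data. It assumes the simplest case of a single grouping variable entered as a full set of dummies, where the OLS fitted value is just the group mean. The names `donor`, `recipient_groups`, `fit_group_means` and `zero_share` are hypothetical.

```python
from collections import defaultdict

# Hypothetical donor survey Y: (group, x) pairs; 6 of the 10 x values are zero.
donor = [("a", 0.0), ("a", 0.0), ("a", 0.0), ("a", 10.0), ("a", 20.0),
         ("b", 0.0), ("b", 0.0), ("b", 0.0), ("b", 40.0), ("b", 60.0)]


def fit_group_means(rows):
    """OLS on a full set of group dummies reduces to per-group means."""
    values = defaultdict(list)
    for group, x in rows:
        values[group].append(x)
    return {g: sum(v) / len(v) for g, v in values.items()}


def zero_share(xs):
    """Fraction of values that are exactly zero."""
    return sum(1 for x in xs if x == 0) / len(xs)


means = fit_group_means(donor)

# Hypothetical recipient survey Z: only the group is observed.
recipient_groups = ["a", "b", "a", "b", "b"]
predicted = [means[g] for g in recipient_groups]

donor_zero_share = zero_share([x for _, x in donor])
predicted_zero_share = zero_share(predicted)
```

Each prediction is a conditional mean, so any group that contains at least one non-zero value gets a non-zero prediction, and the zeros in the donor data are averaged away.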

    Does anyone have any idea how I can constrain my imputation to contain a similar proportion of zeros without arbitrarily adding zeros here or there? I have a feeling that hot-deck imputation might be my answer, but I would appreciate some further advice.
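
For reference, one common form of hot-deck imputation is a random hot deck within cells: each recipient copies the observed value of a randomly chosen donor that shares the same dummy-variable pattern. The sketch below is in Python with made-up data, not Stata, and the function name `hot_deck_impute`, the cell labels and the seed are all assumptions for illustration.

```python
import random
from collections import defaultdict


def hot_deck_impute(donor_rows, recipient_cells, seed=0):
    """Random hot deck within cells.

    donor_rows: (cell, x) pairs from the donor survey.
    recipient_cells: cell label for each recipient record.
    A cell with no donors raises IndexError; real use would need a fallback.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for cell, x in donor_rows:
        pools[cell].append(x)
    return [rng.choice(pools[cell]) for cell in recipient_cells]


# Hypothetical donor survey: 60% of x values are zero in each cell.
donor = [("a", 0.0), ("a", 0.0), ("a", 0.0), ("a", 10.0), ("a", 20.0),
         ("b", 0.0), ("b", 0.0), ("b", 0.0), ("b", 40.0), ("b", 60.0)]

# Hypothetical recipient survey: 10,000 records split across the two cells.
recipient_cells = ["a", "b"] * 5000
imputed = hot_deck_impute(donor, recipient_cells)

imputed_zero_share = sum(1 for v in imputed if v == 0) / len(imputed)
```

Because every imputed value is an actually observed donor value, the share of zeros in the imputed variable tracks the donor share within each cell, up to sampling noise, instead of being averaged away.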

    I'm off on holiday for a few weeks so looking forward to seeing some responses when I get back!

    Dom

  2. #2
    Re: Regression and imputing zeros

    Any thoughts? Anyone??

  3. #3
    noetsi

    Re: Regression and imputing zeros


    I don't know Stata, so I don't understand what you are doing with the predict command. You would be better off explaining what you are doing in terms of the regression generically, rather than in terms of a software command most people here are probably not familiar with. What does the Stata predict command do substantively?

    Do you mean you are predicting X in two different samples and comparing the results? Or that you are estimating parameters in one sample and testing whether they work in another sample? This is not clear to me...

    I don't really understand what you are doing when you say this (the only time I have seen the term imputation used in regression is to deal with missing values and I don't think you are doing this).


    "However, the original variable in survey Y has 60% of observations equal to zero. But, when I impute none of the observations are zero. This is important as I need a similar proportion of my imputed variable equal to zero."

    Why are you imputing, whatever that means, this way? I have not seen this done unless you are, as I mentioned above, trying to test parameter validity in another sample, and this does not appear to be what you are doing. Why would you expect the observations in one sample to be similar to those in another? With multiple imputation, for example, they commonly are not the same.


    "Does anyone have any idea how I can constrain my imputation to contain a similar proportion of zeros without arbitrarily adding zeros here or there?"

    Again, I am not sure what you mean by imputation; please explain what this is and why you are doing it. Why do you want one sample to have the same proportion of values as another (why do you think this makes sense for two random samples)?

    In general, you need to explain in much more detail what you are trying to do and why, and bear in mind that few people here use Stata, so they won't be familiar with its functions.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
