For a research project in economics, I want to study the determinants of the decision to enter the private rental housing market in the United States, with a special focus on taxation.

In the analysis, I am using unbalanced (survey) panel data from the PSID (Panel Study of Income Dynamics) from 2001 until 2013 (biennial waves). To obtain a nationally representative sample, sample weights are applied (with strata at the state level).

The standard specification (simplified) for household i, US state j at time t is as follows:

[Attached image: Capture.PNG (the regression equation)]

  • Entry is a dummy variable that varies across households (and thus indirectly also across states) and over time.
  • MSR, PSR and CGR are the tax rates, which vary across states and over time.
  • D are the time-invariant household characteristics (Age, Sex, Education), which vary across households.
  • E are the time-variant household characteristics (Income, Wealth), which vary across households and over time.
  • Epsilon is the error term, which varies across households (and thus indirectly also across states) and over time.
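
Written out from the variable definitions above, the specification would look roughly as follows. The coefficient symbols (beta, gamma, delta) and the exact subscripts are assumptions, since the original equation is only available as an image; D and E are treated here as vectors of household characteristics:

```latex
\mathit{Entry}_{ijt} = \beta_0
  + \beta_1 \mathit{MSR}_{jt}
  + \beta_2 \mathit{PSR}_{jt}
  + \beta_3 \mathit{CGR}_{jt}
  + \gamma' D_{i}
  + \delta' E_{it}
  + \varepsilon_{ijt}
```

Here i indexes households, j US states and t survey years, matching the description of which terms vary over which dimension.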

The analysis is conducted in Stata 13.1. First of all, I would like to perform a standard OLS regression, a Fixed Effects regression, and a Random Effects regression.
  1. There are both households and states in the analysis. How do I deal with these two levels of grouping?
  2. I do not know whether a fixed effects or a random effects regression is appropriate, because the dependent variable is a dummy variable. I think this might lead to inconsistent coefficients, but I am not sure. Can someone confirm this?
  3. What kind of standard errors do I need to use? Stata does not seem to allow robust standard errors in combination with survey regression.
  4. Afterwards, I would also like to add year fixed effects and state fixed effects to control for unobservable time effects and unobservable time-invariant effects at the state level, respectively. The time effects can be tested with the command "testparm". However, how do I test the state fixed effects?
  5. Do I need a different estimator, such as probit or logit? If so, why?


In general, can someone give me some guidance on how to start with this analysis? Thanks in advance!