Panel data model for policy intervention

#1
Research question: I am writing a master's thesis in which I analyze the impact of a policy on the flow of investments from country A to country B.

Data: I have highly(!) unbalanced panel data on investment flows for 9 years (the policy was implemented in year 5), i.e. microdata on the investment flows of every company resident in country A and investing in country B.

Research design: My dependent variable is individual investment flows in US dollars. My main independent variable is Policy, coded as a dichotomous dummy variable (0 = before policy implementation, 1 = after policy implementation). In addition, I have several macroeconomic control variables, such as GDP growth in the receiving country B, market size in country B, etc.

Statistical model: That's where my questions begin. Over the last few weeks, I read several articles and statistics books on panel data analysis and decided that the most suitable model for my design and the data available is a one-way time fixed effects model (with LSDV as the estimator for the time dimension). Unfortunately, a two-way fixed effects model is not possible, because every year some firms enter and others exit the investment market (that's what I meant by "highly unbalanced").
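One thing worth checking before committing to time fixed effects via LSDV is whether the Policy dummy can still be identified next to a full set of year dummies. The sketch below builds a toy design matrix with NumPy; the 4 observations per year and the choice to code year 5 itself as "after" are assumptions made only for illustration, not properties of the actual dataset.

```python
import numpy as np

# Assumed toy layout: 9 yearly periods, 4 observations per year,
# policy coded 1 from year 5 onward (the coding of year 5 is an assumption).
years = np.repeat(np.arange(1, 10), 4)
policy = (years >= 5).astype(float)

# LSDV for the time dimension: one dummy column per year, no separate intercept.
year_dummies = (years[:, None] == np.arange(1, 10)[None, :]).astype(float)

# Design matrix with all year dummies plus the policy dummy.
X = np.column_stack([year_dummies, policy])

# The policy column equals the sum of the year-5..9 dummies,
# so the matrix has one more column than its rank.
rank = np.linalg.matrix_rank(X)
print(X.shape[1], rank)
```

In this toy setup the design matrix has 10 columns but rank 9, so the policy coefficient cannot be separated from the year effects. The same applies to any control that varies only by year, such as the macroeconomic variables.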

My questions:

  1. Is this the right approach? Can this data be used for the model described?
  2. I also want to analyze how this policy shift affected investments in various economic sectors (e.g. finance, agriculture, energy, etc.). The dataset records which sector each investment goes to. Can I create a categorical variable for sector to see how sector affiliation affects the flow? Can the sector category be interpreted as the individual effects, allowing for the complete FE model?
  3. I also want to see how different components of the policy influenced the investments. For example, there is a tax deduction for agriculture, a low interest rate for energy, and both for finance.
  4. Maybe I want too much and have to consider more than one model?

I would be glad to receive your advice and any other hints.
 
#3
Do you use yearly data, or quarterly, monthly, etc? Saying "9 years" does not mean much... How many companies?
I have yearly data. For each year it is 30 to 50 data points. All together I have 357 observations.
As for companies, I cannot really include them in the analysis, as the set of companies is not consistent from year to year. For example, in year 1 I have companies A, B, C; in year 2 A, B, D, E; in year 3 B, F, G, H, I, J. They vary a lot over time. There are years in which 20 investment transactions are made by one firm and another 15 by separate firms.
And differences between firms are not something I am interested in. I am interested in firms as an aggregate (or grouped by sector). That is why I use macroeconomic control variables that vary over years but not across firms.
 

staassis

Active Member
#4
I have yearly data. For each year it is 30 to 50 data points. All together I have 357 observations.
And differences between firms are not something I am interested in.
That does not matter. If there are differences among firms, your model should capture them. Otherwise, statistical inference may be wrong. Once you have estimated a model which passes goodness-of-fit checks, you can look only into those aspects of the model that interest you. I would suggest starting with

A) random effects for company intercepts + Sector Dummy Variables * Policy Shift + Time + Time^2 + Macrovariables + ...
B) fixed effects for company intercepts + Sector Dummy Variables * Policy Shift + Time + Time^2 + Macrovariables ...
(F)
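One way to read (F) as an equation is sketched below. All symbols are assumptions introduced here for illustration: y_{it} is the investment flow of firm i in year t, S_{is} is a dummy for sector s (one reference sector omitted), P_t is the policy dummy, and x_t collects the macroeconomic controls.

```latex
y_{it} = \alpha_i
       + \sum_{s} \gamma_s S_{is}
       + \delta P_t
       + \sum_{s} \theta_s \, (S_{is} \times P_t)
       + \beta_1 t + \beta_2 t^2
       + \mathbf{x}_t^{\top} \boldsymbol{\lambda}
       + \varepsilon_{it}
```

Under this reading, (A) treats the firm intercepts \alpha_i as random draws assumed uncorrelated with the regressors, while (B) treats them as firm-specific constants. The coefficients \theta_s are the terms that let the policy effect differ by sector.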

With only 9 time intervals, more complex dynamics cannot be studied... You can compare the two modeling frameworks using the Hausman test, AIC and BIC. Once the decision on A vs B has been made, note the following. The full model, as described in (F), is likely to have many non-significant terms, so polish it using backward stepwise selection.
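As a rough illustration of the backward stepwise step, here is a minimal sketch using plain least squares in NumPy. It uses AIC as the dropping criterion, which is an assumption (the post does not say which criterion to use), and it runs on simulated data with hypothetical variable names rather than on the real panel. It also ignores the firm intercepts, so it shows only the selection loop, not the full model (F).

```python
import numpy as np

def aic(y, X):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * k

def backward_select(y, X, names, keep=("const",)):
    """Drop one column at a time for as long as doing so lowers AIC."""
    cols = list(range(X.shape[1]))
    best = aic(y, X[:, cols])
    while True:
        # AIC of every model that removes exactly one droppable column.
        candidates = [(aic(y, X[:, [c for c in cols if c != d]]), d)
                      for d in cols if names[d] not in keep]
        if not candidates:
            break
        score, drop = min(candidates)
        if score >= best:
            break
        best, cols = score, [c for c in cols if c != drop]
    return [names[c] for c in cols]

# Simulated data: only `policy` truly affects y; the other two are pure noise.
rng = np.random.default_rng(0)
n = 357
policy = (rng.integers(1, 10, n) >= 5).astype(float)
noise1, noise2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * policy + rng.normal(size=n)
X = np.column_stack([np.ones(n), policy, noise1, noise2])
names = ["const", "policy", "noise1", "noise2"]

selected = backward_select(y, X, names)
print(selected)
```

The intercept is protected from removal through `keep`; in a real analysis the policy terms of interest could be protected the same way.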
 
#5
Thank you!
I just counted the firms and there are 126 unique ones. Is that not too many? And only 71 of them invested more than once over the 9-year period. Would the full fixed effects model still make sense?
 

staassis

Active Member
#6
If some firms have only 1-2 years of observations, then the fixed effects framework cannot be estimated. In that case you would put "Sector Dummy Variables" instead of "fixed effects for company intercepts" in equation (F.B). That is better than removing the 1-year firms.

Either way, I would suspect that (F.A) or (F.A) + [Sector Dummy Variables] would make a better model. But still, try different frameworks and compare.
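To see why firms observed only once are a problem for firm fixed effects, it helps to look at the within (demeaning) transformation directly. The sketch below uses a made-up four-row panel; the firm labels and numbers are hypothetical.

```python
import numpy as np

# Hypothetical toy panel: firm "A" is observed in 3 years, firm "B" only once.
firm = np.array(["A", "A", "A", "B"])
x = np.array([1.0, 2.0, 4.0, 3.0])
y = np.array([2.0, 3.0, 7.0, 10.0])

def demean_by_group(values, groups):
    """Within transformation: subtract each group's own mean."""
    out = values.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= out[mask].mean()
    return out

x_w = demean_by_group(x, firm)
y_w = demean_by_group(y, firm)

# Firm B's demeaned row is exactly zero, so it cannot move the slope.
slope_all = (x_w @ y_w) / (x_w @ x_w)
slope_multi = (x_w[:3] @ y_w[:3]) / (x_w[:3] @ x_w[:3])
print(x_w[3], y_w[3], slope_all, slope_multi)
```

The single-observation firm drops out of the within estimator on its own: its dummy absorbs it completely. With sector dummies in place of firm intercepts, those observations keep contributing information.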
 