# Advice on regression methodology and analysis

Dear all,

For my Master’s thesis, I have some difficulties with the statistical part of my research.

I want to analyze the relationship between two variables: Investment in Innovation (R&D) --> Concentration Ratio
1. Research & Development (R&D) Investments in an industry
2. Concentration ratio of the industry (Herfindahl index, or the concentration ratio of the 4 to 50 largest firms)

My supervisor is very busy, but told me I should use the following regression:
H(i,t) - H(i,t-1) = a + b * (R&D/VA) + c * (GFCF/VA)

DV = Change in the concentration measure (Herfindahl index, or the concentration ratio of the 4 to 50 largest firms)

- i = Industry (in NAICS industry code)
- t = Time (1997-2012, at 5-year intervals: 1997, 2002, 2007, 2012)
- H = Herfindahl index in each year (a measure of concentration)
- R&D = Research and development investment
- VA= Value Added
- GFCF= Gross Fixed Capital Formation
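
To make the setup concrete, here is a rough Python sketch of how I understand the regression. Everything in it is my own assumption: the records are invented placeholder numbers (not real data), the names `panel`, `build_rows` and `ols` are just made up for illustration, and I assume R&D/VA and GFCF/VA are taken from the end year t, which the formula does not actually specify. It builds the change in H between consecutive census years per industry and fits a, b and c by ordinary least squares.

```python
from collections import defaultdict

# Placeholder records: (NAICS code, year, H, R&D, GFCF, VA). All values are invented.
panel = [
    ("3111", 1997, 0.10, 2.0, 20.0, 100.0),
    ("3111", 2002, 0.12, 3.0, 22.0, 110.0),
    ("3111", 2007, 0.15, 5.0, 21.0, 120.0),
    ("3111", 2012, 0.16, 4.0, 30.0, 125.0),
    ("3254", 1997, 0.20, 15.0, 10.0, 100.0),
    ("3254", 2002, 0.26, 18.0, 14.0, 105.0),
    ("3254", 2007, 0.30, 16.0, 12.0, 110.0),
    ("3254", 2012, 0.37, 22.0, 15.0, 115.0),
]


def build_rows(records):
    """Return (y, X) with y = H(i,t) - H(i,t-1) and X = [1, R&D/VA, GFCF/VA].

    Assumption: the ratios are measured in the end year t of each interval.
    """
    by_industry = defaultdict(list)
    for rec in records:
        by_industry[rec[0]].append(rec)

    y, X = [], []
    for recs in by_industry.values():
        recs.sort(key=lambda r: r[1])  # order each industry by year
        for prev, cur in zip(recs, recs[1:]):
            _, _, h, rd, gfcf, va = cur
            y.append(h - prev[2])  # change in H over the 5-year interval
            X.append([1.0, rd / va, gfcf / va])
    return y, X


def ols(y, X):
    """Solve the normal equations (X'X) beta = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(k):
            if row != col:
                f = A[row][col] / A[col][col]
                A[row] = [u - f * v for u, v in zip(A[row], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


y, X = build_rows(panel)
a, b, c = ols(y, X)
print("a =", a, "b =", b, "c =", c)
```

With four census years per industry this gives three observations per industry, so the number of usable rows is three times the number of industries. The sketch only produces point estimates; standard errors and significance tests would still have to come from a statistics package.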

My questions:

1. Can you give me any general advice? And what would my hypotheses be?
2. How do I include a dummy variable in this model? Any suggestions on which variables to use? I am considering using: R&D-intensive industry or not (1 = yes, 0 = no).
3. How would I go about this in SPSS or similar software? Basically, I am asking for advice on how to run the analysis.
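
To illustrate what I mean in question 2, this is roughly how I picture coding the dummy. It is only a sketch: the R&D/VA ratios are invented, and the 0.10 cutoff (`THRESHOLD`) is an arbitrary assumption of mine, not an official classification. The dummy would then enter the regression as an extra term d * D next to the existing ones.

```python
# Hypothetical R&D/VA ratios per NAICS industry; all values are invented.
rd_intensity = {"3111": 0.03, "3254": 0.18, "3341": 0.12, "4451": 0.01}

# Assumed cutoff for calling an industry "R&D intensive" (arbitrary choice).
THRESHOLD = 0.10


def rd_intensive(ratio, threshold=THRESHOLD):
    """Return 1 if the industry counts as R&D intensive, else 0."""
    return 1 if ratio >= threshold else 0


# One dummy value per industry
dummy = {naics: rd_intensive(ratio) for naics, ratio in rd_intensity.items()}

# Regressor rows [constant, R&D/VA, dummy], ready to be extended with GFCF/VA
rows = [[1.0, ratio, dummy[naics]] for naics, ratio in rd_intensity.items()]

print(dummy)
```

One thing I am unsure about: since this dummy is derived from R&D/VA itself, it will be correlated with the R&D/VA regressor, so maybe a dummy based on something else would be better.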

I hope this is clear enough; please ask if you need further details. Thank you!