Opinion on my regression + correlation tables

Hey guys!

I'm currently doing a regression analysis for my master's thesis. I am looking at the venture capital market in China.
Here I am looking for a connection between total funding and the differences in habits between Chinese and foreign investors, as well as their role in the startup financing market.
You can access both tables here:

Both tables here

Does the model even make sense?
Can I still use those results with a p-value between 0.05 and 0.1?

I am just an engineer and not very gifted when it comes to statistics, so I am counting on you! (I just hope it's not complete garbage ;)


Active Member
You are getting bogus, misleadingly high correlations because your variables are not weakly stationary: their means, variances and covariances change with time. This can lead to the phenomenon known as spurious regression. You should first transform the variables into stationary versions. Potential tools:

1] (log-)differencing,
2] subtracting deterministic trends, if any,
3] subtracting deterministic seasonalities, if any.
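
For what it's worth, here is a rough sketch of what those three transformations could look like in plain Python. The function names and the toy numbers are made up for illustration (they are not from the tables in the question), and the trend is assumed to be linear, which may not fit real funding data.

```python
import math

def log_diff(series):
    """Log-differences log(x_t) - log(x_{t-1}); needs strictly positive values."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]

def detrend_linear(series):
    """Residuals from an OLS fit of the series on a constant and a time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

def deseasonalize(series, period):
    """Subtract the mean of each season (e.g. period=4 for quarterly data)."""
    means = []
    for s in range(period):
        vals = series[s::period]
        means.append(sum(vals) / len(vals))
    return [y - means[i % period] for i, y in enumerate(series)]

# Hypothetical toy data, purely for illustration
funding = [100.0, 110.0, 121.0, 133.1]                 # grows 10% per period
print(log_diff(funding))                               # each value is roughly log(1.1)
print(detrend_linear([1.0, 3.0, 5.0, 7.0]))            # exact line, so residuals are zero
print(deseasonalize([1.0, 5.0, 1.0, 5.0], period=2))   # pure seasonal pattern, so zeros
```

Log-differences are approximately growth rates, which is why they are often preferred for monetary series such as funding volumes.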

It may be the case that not all of the variables have to be transformed. To check whether a specific factor is weakly stationary, run the augmented Dickey-Fuller test, Phillips–Perron test or KPSS test.
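
To give a feel for what such a test does, below is a minimal sketch of the plain (non-augmented) Dickey-Fuller regression with a constant, written in standard-library Python. It is a simplification: a real ADF test also includes lagged differences and uses finite-sample critical values, so in practice you would probably rely on a library routine (statsmodels, for example, provides `adfuller` and `kpss` in `statsmodels.tsa.stattools`). The simulated series and the function name `dickey_fuller_t` are assumptions for illustration only.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on rho in  dy_t = a + rho * y_{t-1} + e_t  (no lagged differences)."""
    x = series[:-1]
    dy = [b - a for a, b in zip(series, series[1:])]
    n = len(x)
    x_mean = sum(x) / n
    dy_mean = sum(dy) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (di - dy_mean) for xi, di in zip(x, dy))
    rho = sxy / sxx
    a = dy_mean - rho * x_mean
    rss = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of rho
    return rho / se

# Asymptotic 5% Dickey-Fuller critical value for the constant, no-trend case
CRIT_5PCT = -2.86

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(500)]   # stationary by construction
walk = []                                       # unit root by construction
level = 0.0
for e in noise:
    level += e
    walk.append(level)

for name, s in [("white noise", noise), ("random walk", walk)]:
    t = dickey_fuller_t(s)
    verdict = "reject unit root" if t < CRIT_5PCT else "cannot reject unit root"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

Note that the null hypothesis here is a unit root, whereas KPSS has stationarity as its null, so the two kinds of test are read in opposite directions.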


Fortran must die
Better yet, run all three and see if they agree. Weak power is a major issue with these tests, particularly for near unit roots. Differencing is the classic way to address non-stationarity (but it is not correct if the trend is deterministic rather than stochastic).
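
A small simulation may make the deterministic-trend caveat concrete. The sketch below builds a series that is a straight line plus white noise (the slope, seed and sample size are arbitrary assumptions), then compares differencing with subtracting a fitted line. Differencing such a series should leave errors with a lag-1 autocorrelation near -0.5 (over-differencing), while detrending should leave it near zero.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den

def detrend_linear(series):
    """Residuals from an OLS fit on a constant and a time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

rng = random.Random(1)
n = 2000
# Deterministic linear trend plus white noise (trend-stationary by construction)
series = [0.05 * t + rng.gauss(0, 1) for t in range(n)]

differenced = [b - a for a, b in zip(series, series[1:])]
detrended = detrend_linear(series)

r_diff = lag1_autocorr(differenced)
r_detr = lag1_autocorr(detrended)
print(f"lag-1 autocorrelation after differencing: {r_diff:.2f}")
print(f"lag-1 autocorrelation after detrending:   {r_detr:.2f}")
```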

Correlations with time series are very doubtful... and time series regression with predictor variables is not for the faint of heart. I have been trying to work with this for years and still don't have a good understanding of it.
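
If you want to see why those correlations are doubtful, here is a quick simulation sketch. It draws pairs of completely independent random walks and compares the average absolute correlation in levels with the one in first differences. The number of simulations, series length and seed are arbitrary assumptions, and real data will of course not be pure random walks.

```python
import math
import random

def corr(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def random_walk(rng, n):
    """Cumulative sum of independent standard normal shocks."""
    level, out = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        out.append(level)
    return out

def diff(x):
    return [b - a for a, b in zip(x, x[1:])]

rng = random.Random(42)
n_sims, n_obs = 200, 100
abs_corr_levels, abs_corr_diffs = [], []
for _ in range(n_sims):
    a = random_walk(rng, n_obs)   # the two walks are independent by construction
    b = random_walk(rng, n_obs)
    abs_corr_levels.append(abs(corr(a, b)))
    abs_corr_diffs.append(abs(corr(diff(a), diff(b))))

mean_levels = sum(abs_corr_levels) / n_sims
mean_diffs = sum(abs_corr_diffs) / n_sims
print(f"mean |corr| in levels:      {mean_levels:.2f}")
print(f"mean |corr| in differences: {mean_diffs:.2f}")
```

The levels figure should come out much larger than the differences figure even though there is no relationship at all between the two series in each pair.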