Hi all,
I have "panel data" (kind of) by country for "publication share," i.e. the percentage of total global publications contributed by a specific country in a specific time period.
The data are organized into 15 consecutive, overlapping 5-year periods (1988-1992, 1989-1993, 1990-1994, ..., 2002-2006, 2003-2007) by country. So, for example, I have data points showing that the USA contributed 54% of all publications from 1988-1992, Slovenia contributed 1% of all publications from 1990-1994, and the UK contributed 23% of all publications from 1989-1993.
The question I would like to examine is: has the share of total publications for Country X increased over the 20-year period? I would like to examine this for every country (e.g. has the USA's share of total publications increased from 1988 to 2007?).
How can I approach this question statistically? Is it acceptable to simply perform a univariate linear regression between publication share and year for each country, then examine the t-statistic for the coefficient? Or is serial correlation a big problem with this approach (or does it not apply at all)?
Please let me know if there's anything I can clarify. Thank you very much
Clarification: This is not a forecast model. I simply want to see if there is a statistical trend.
Somebody please help! I would very much appreciate any assistance here
Are the publications you're referring to actual counts of publications, or do you only have percentages?
The data are currently organized as percentages over five-year periods.
However, it is possible to retrieve the actual counts as well.
Next question: your question addresses the change over 20 years. However, you have fifteen 5-year intervals of data. Which 20 years are you referring to?
1988 to 2007
The five-year intervals are overlapping, as I described in my original post.
Is there a way to obtain or calculate the publication counts and percentages for individual years?
Yes, it is possible.
Is it necessary?
I tried to be as clear as possible in the original post but it appears I did not clarify enough....
Ok. Here are my thoughts.
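If you can get the number of publications and the percentage for each individual year, that would be easier to work with. Keeping the 5-year intervals gives you overlapping observations: consecutive windows share four of their five years, so neighboring data points are correlated by construction, which makes modeling more difficult.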
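To see why the overlap matters, here is a rough sketch in Python (standard library only). It uses simulated, independent yearly noise, not real publication data, and the helper names `rolling_sums` and `lag1_autocorr` are made up for this illustration. For independent noise, a 5-year overlapping sum has a theoretical lag-1 autocorrelation of 4/5 = 0.8, even though the yearly values themselves are uncorrelated.

```python
import random

def rolling_sums(values, window=5):
    """Sum each run of `window` consecutive values (overlapping windows)."""
    return [sum(values[i:i + window]) for i in range(len(values) - window + 1)]

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i - 1] - mean) for i in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

# Simulated yearly values with no trend and no serial correlation.
# A long series is used only so the estimates are stable.
random.seed(0)
annual = [random.gauss(0.0, 1.0) for _ in range(5000)]
windows = rolling_sums(annual, window=5)

print("lag-1 autocorrelation, yearly values:  ", round(lag1_autocorr(annual), 3))
print("lag-1 autocorrelation, 5-year windows: ", round(lag1_autocorr(windows), 3))
```

The second number should come out close to 0.8, which is the serial correlation the windowing adds all by itself. That is the main reason a plain t-test on the windowed data would tend to overstate how significant a trend is.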
My understanding is that you're only interested in determining whether a country's percentage of the total articles changes over time. Working with percentages carries a lot of assumptions (ones that might not stand up to criticism). For example, a country's share can fall even while its own output grows, if other countries grow faster. I'd think about modeling the actual number of publications rather than the percentages.
Simple linear regression sounds like it would work well for your case. Whether or not you choose to model counts instead of percentages should guide you in how you model it.
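As a rough sketch of what that regression could look like for one country, assuming you have one value per year: the function below fits share against year by ordinary least squares and returns the slope, its t-statistic, and the Durbin-Watson statistic of the residuals as a quick check for serial correlation. The function name `ols_trend` and the yearly shares are invented for illustration; they are not real data.

```python
import math

def ols_trend(years, shares):
    """Fit shares = a + b * year by least squares.

    Returns (slope, t_statistic, durbin_watson).
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(shares) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, shares))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [y - (intercept + slope * x) for x, y in zip(years, shares)]
    sse = sum(e ** 2 for e in residuals)

    # Standard error of the slope, using n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope

    # Durbin-Watson: near 2 suggests little lag-1 correlation in the
    # residuals; values well below 2 suggest positive serial correlation.
    dw = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / sse
    return slope, t_stat, dw

# Hypothetical yearly shares (in percent) for a single country, 1988-2007.
years = list(range(1988, 2008))
shares = [54.0, 53.1, 53.4, 52.2, 52.5, 51.3, 51.0, 50.8, 49.9, 50.1,
          49.0, 48.7, 48.9, 47.6, 47.2, 47.5, 46.3, 46.0, 45.8, 45.1]

slope, t_stat, dw = ols_trend(years, shares)
print("slope per year:", round(slope, 3))
print("t-statistic:   ", round(t_stat, 2))
print("Durbin-Watson: ", round(dw, 2))
```

You would compare the t-statistic against a t distribution with n - 2 degrees of freedom, and run this once per country. If the Durbin-Watson value is far from 2, the usual t-test is not trustworthy as-is, and something that accounts for autocorrelated errors would be the safer choice.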
Before modeling, you should plot (i.e. graph) your data and look at it. That will also help guide you in how you should model.
Hope that helps.