I'm looking at a data set of movies with the number of Twitter mentions over time, and I'm trying to find out whether there's a linear correlation with box office results. My data looks like this:
(see attachment)
My question is: if I want to know how mention volume correlates with box office over this entire 3-week period, would it be better to average each movie's mentions over the period and correlate that average with the box office number? Or would it be better to compute the correlation for each week and then average the 3 correlations?
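For anyone who wants to try the two options side by side, here is a minimal sketch in Python with NumPy. The numbers are invented for illustration (the attachment's data is not reproduced here), and it assumes one box office figure per movie and three weekly mention counts; the names `mentions`, `box_office`, and `pearson` are hypothetical.

```python
import numpy as np

# Hypothetical data: rows are movies, columns are weekly Twitter mention
# counts (weeks 1-3). These numbers are invented for illustration only.
mentions = np.array([
    [1200,  900,  400],
    [ 300,  350,  500],
    [5000, 2500, 1000],
    [ 800,  700,  650],
    [2000, 2200, 1500],
], dtype=float)

# Hypothetical box office result per movie (assumed: one number per movie).
box_office = np.array([30.0, 8.0, 95.0, 20.0, 60.0])


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Method 1: average the mentions per movie first, then correlate once.
r_of_mean = pearson(mentions.mean(axis=1), box_office)

# Method 2: correlate each week separately, then average the coefficients.
weekly_r = [pearson(mentions[:, w], box_office) for w in range(mentions.shape[1])]
mean_of_r = float(np.mean(weekly_r))

print("correlation of averaged mentions:", round(r_of_mean, 3))
print("weekly correlations:", [round(r, 3) for r in weekly_r])
print("average of weekly correlations:", round(mean_of_r, 3))
```

The two numbers will generally differ. Method 1 answers "do movies with more mentions overall earn more?", while Method 2 summarizes how strong the week-by-week relationships are, and looking at the individual weekly values shows whether one week is driving the result.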
Interesting question. I would say try both methods; each tells its own story. But ultimately, a non-linear model would be best if you want all three weekly periods explained in a single model. My 2 cents.
RTFM
sosaysi (01-27-2015)
Thanks Lukan27. It does seem a non-linear model would be better, but I'm a bit of a beginner and not sure which direction to take in designing one. Do you have any suggestions that might work for this kind of time-based data, or know of resources where I could learn more?
Also curious to hear if anyone has any other thoughts about which average would best work in a linear model.
Thanks!
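As one possible starting point, and only a sketch: a simple way to put all three weeks into a single model is to regress log box office on the log of each week's mentions. A straight line in log space corresponds to a power-law (non-linear) relationship in the original units. This is an assumption about what "non-linear" could mean here, not necessarily what was suggested above, and the data below is invented.

```python
import numpy as np

# Hypothetical data (invented): weekly mentions for 8 movies, weeks 1-3.
mentions = np.array([
    [1200,  900,  400],
    [ 300,  350,  500],
    [5000, 2500, 1000],
    [ 800,  700,  650],
    [2000, 2200, 1500],
    [ 150,  100,   80],
    [3500, 1800,  900],
    [ 600,  900, 1100],
], dtype=float)
box_office = np.array([30.0, 8.0, 95.0, 20.0, 60.0, 3.0, 70.0, 25.0])

# Log-transform both sides: a linear fit in log space is a power law
# in the original units.
X = np.column_stack([np.ones(len(mentions)), np.log(mentions)])
y = np.log(box_office)

# Ordinary least squares: one intercept plus one coefficient per week.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# R^2 on the log scale, as a rough measure of fit.
fitted = X @ coef
ss_res = float(np.sum((y - fitted) ** 2))
ss_tot = float(np.sum((y - y.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

print("intercept and weekly coefficients:", np.round(coef, 3))
print("R^2 (log scale):", round(r_squared, 3))
```

With only a handful of movies and four fitted parameters, a model like this can overfit easily, so the coefficients are best treated as exploratory. Weekly mention counts are also usually correlated with each other, which makes the individual weekly coefficients hard to interpret on their own.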