# Predict the number of messages to be posted in a forum

#### PaulAuchon

##### New Member
Hello everyone,

I need some advice on how to predict the number of messages that will be posted in a forum, based on the number of messages posted on that same forum each month over the previous 10 years. You can find below a graph representing the number of messages previously posted:

What I would like is to identify a "trend" that would allow me to tell how many messages will be posted in the following months (obviously, I would do that on newer data). I was thinking about doing some sort of regression but my skills in statistics are not that great. What do you think?

Thank you!

#### hlsmith

##### Omega Contributor
You will likely be looking at time series analysis, which I have no experience using.

Do you have any potential predictors that may help explain the above fluctuations?

#### PaulAuchon

##### New Member
OK, thank you. I had never heard of time series analysis, so I will look into it. I don't have any predictors. Maybe the month of the year has an influence on the number of messages posted, but I don't know for now.
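
One way to check whether the month of the year matters is a classical seasonal decomposition. Here is a minimal sketch in base R. The counts are made-up numbers (not the forum data), with a December peak and a summer dip deliberately built in, so every name and value below is only an assumption for illustration:

```r
# Hypothetical monthly counts: made-up numbers, NOT the forum data.
set.seed(1)
n <- 120                                   # 10 years of monthly values
trend <- seq(100, 500, length.out = n)     # slow upward trend
month_effect <- rep(c(0, 0, 0, 0, 0, 0, -40, -40, 0, 0, 0, 60), times = n / 12)
counts <- ts(trend + month_effect + rnorm(n, sd = 5),
             start = c(2005, 1), frequency = 12)

# Classical additive decomposition: trend + seasonal + remainder.
parts <- decompose(counts)

# One estimated seasonal effect per calendar month (Jan..Dec).
seasonal_effect <- setNames(round(parts$figure, 1), month.abb)
print(seasonal_effect)
```

If the estimated effects are all close to zero compared with the size of the remainder, the month probably does not matter much. On a series whose level changes a lot over time, a multiplicative decomposition (`type = "multiplicative"`) may be a better fit.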

#### PaulAuchon

##### New Member
For those who may be interested, I think I figured out how to do it. I used ARIMA. I don't know if that is correct (please tell me if it is not!), but this is what I did:
1. I entered my data in R as a time series
2. I used the auto.arima() function (in the forecast library) to find the optimal parameters for the ARIMA model
3. I generated the model
4. I applied the forecast function to it

Here is the code:
Code:
> vec = c(14,14,6,6,18,6,3,11,19,40,25,24,35,18,37,51,55,74,39,50,75,127,116,239,125,174,249,295,473,435,408,834,870,1357,1684,4424,5559,5844,8167,16253,21481,21107,21977,29219,30942,35164,39167,37134,39841,42546,42088,45719,43197,49463,53292,71794,69769,81344,72821,76963,79017,78711,79111,78277,82376,81930,82682,109876,116350,143995,158316,185915,169616,163694,156993,174117,179635,203711,183714,226372,231351,264537,229456,211828,234205,188645,202730,202623,211995,228025,208926,246247,207021,204611,204082,190179,180224,160862,157919,170342,171995,160736,145481,168716,159044,159673,158128,154751,139266,139699,144129,155927,160554,173098,172859,195933,170872,192772,163560,154206,133142,130577,137756,129598,140794,133556,138111,151234,127174,162714,144866,127285,124480,132021,139130,112157,121962,106806,109051,121430,110299,114485,105224,103578,88850,93361,88971,78878,82327,85312,67054,78447,74901,74718,64201,64709,55960,53437,50360,49388,50932,46196,48206,55148,53573,49467,41345,46502,38052,31148,30166,31662,36370,39369,36745,35372,33431,36656)
> myts <- ts(vec, start=c(2000, 06), end=c(2015, 03), frequency=12)
> myts
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
2000                                        14     14      6      6     18      6      3
2001     11     19     40     25     24     35     18     37     51     55     74     39
2002     50     75    127    116    239    125    174    249    295    473    435    408
2003    834    870   1357   1684   4424   5559   5844   8167  16253  21481  21107  21977
2004  29219  30942  35164  39167  37134  39841  42546  42088  45719  43197  49463  53292
2005  71794  69769  81344  72821  76963  79017  78711  79111  78277  82376  81930  82682
2006 109876 116350 143995 158316 185915 169616 163694 156993 174117 179635 203711 183714
2007 226372 231351 264537 229456 211828 234205 188645 202730 202623 211995 228025 208926
2008 246247 207021 204611 204082 190179 180224 160862 157919 170342 171995 160736 145481
2009 168716 159044 159673 158128 154751 139266 139699 144129 155927 160554 173098 172859
2010 195933 170872 192772 163560 154206 133142 130577 137756 129598 140794 133556 138111
2011 151234 127174 162714 144866 127285 124480 132021 139130 112157 121962 106806 109051
2012 121430 110299 114485 105224 103578  88850  93361  88971  78878  82327  85312  67054
2013  78447  74901  74718  64201  64709  55960  53437  50360  49388  50932  46196  48206
2014  55148  53573  49467  41345  46502  38052  31148  30166  31662  36370  39369  36745
2015  35372  33431  36656
> library(forecast)
> auto.arima(myts)
Series: myts
ARIMA(4,2,0)(0,0,2)[12]

Coefficients:
          ar1      ar2      ar3      ar4    sma1    sma2
      -1.0746  -0.7222  -0.4778  -0.2572  0.2820  0.1347
s.e.   0.0731   0.1029   0.1039   0.0727  0.0764  0.0680

sigma^2 estimated as 138366184:  log likelihood=-1897.57
AIC=3809.13   AICc=3809.8   BIC=3831.33
> fit <- Arima(myts, order=c(4,2,0), seasonal=c(0,0,2))
> plot(forecast(fit))
And here is the result:

Once again, I don't know if that is correct, but the result looks pretty good.
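
One way to check the model beyond eyeballing the plot is to hold out the last 12 months, fit on the rest, and compare the forecast with what actually happened. Below is a rough sketch using base R's `arima()` and `predict()` instead of the forecast package, with the model order reported by `auto.arima()` above. The 12-month holdout, the `method = "ML"` setting and the naive "repeat the last value" benchmark are my own assumptions, not something the thread settled on:

```r
# Monthly message counts from the post above, June 2000 to March 2015.
vec <- c(
  14, 14, 6, 6, 18, 6, 3,
  11, 19, 40, 25, 24, 35, 18, 37, 51, 55, 74, 39,
  50, 75, 127, 116, 239, 125, 174, 249, 295, 473, 435, 408,
  834, 870, 1357, 1684, 4424, 5559, 5844, 8167, 16253, 21481, 21107, 21977,
  29219, 30942, 35164, 39167, 37134, 39841, 42546, 42088, 45719, 43197, 49463, 53292,
  71794, 69769, 81344, 72821, 76963, 79017, 78711, 79111, 78277, 82376, 81930, 82682,
  109876, 116350, 143995, 158316, 185915, 169616, 163694, 156993, 174117, 179635, 203711, 183714,
  226372, 231351, 264537, 229456, 211828, 234205, 188645, 202730, 202623, 211995, 228025, 208926,
  246247, 207021, 204611, 204082, 190179, 180224, 160862, 157919, 170342, 171995, 160736, 145481,
  168716, 159044, 159673, 158128, 154751, 139266, 139699, 144129, 155927, 160554, 173098, 172859,
  195933, 170872, 192772, 163560, 154206, 133142, 130577, 137756, 129598, 140794, 133556, 138111,
  151234, 127174, 162714, 144866, 127285, 124480, 132021, 139130, 112157, 121962, 106806, 109051,
  121430, 110299, 114485, 105224, 103578, 88850, 93361, 88971, 78878, 82327, 85312, 67054,
  78447, 74901, 74718, 64201, 64709, 55960, 53437, 50360, 49388, 50932, 46196, 48206,
  55148, 53573, 49467, 41345, 46502, 38052, 31148, 30166, 31662, 36370, 39369, 36745,
  35372, 33431, 36656
)
myts <- ts(vec, start = c(2000, 6), frequency = 12)

# Split: fit on everything up to March 2014, keep the last 12 months aside.
train   <- window(myts, end = c(2014, 3))
holdout <- window(myts, start = c(2014, 4))

# Same order as reported by auto.arima(): ARIMA(4,2,0)(0,0,2)[12].
fit <- arima(train, order = c(4, 2, 0),
             seasonal = list(order = c(0, 0, 2), period = 12),
             method = "ML")
fc <- predict(fit, n.ahead = length(holdout))

# Mean absolute error of the model versus a naive "repeat the last value" forecast.
mae_model <- mean(abs(as.numeric(fc$pred) - as.numeric(holdout)))
mae_naive <- mean(abs(as.numeric(tail(train, 1)) - as.numeric(holdout)))
print(c(model = mae_model, naive = mae_naive))
```

If the model's error is not clearly smaller than the naive one, the nice-looking plot is probably not telling much, and the forecast intervals should be taken with caution.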