# Best method to determine future success or to determine best linearity?

#### nhb

##### New Member
Long-time viewer but first-time poster, so please excuse me if I'm in the wrong place.
Anyway, I am working on a pretty interesting project. Through data mining, I am able to gather a ton of investment portfolios. Each portfolio comes with the usual statistics, including total revenue, total loss, and resulting profit, and I can even get a daily breakdown of these for the past 3-4 years (1200 days exactly).
I did a bit of Excel manipulation, using the Pearson R² function and a Sharpe ratio (average profit / standard deviation), and I think I am on the right track because I was able to get a few results that were pretty **** consistent. First I break up the 1200 days into 20-day segments and get the net profit/loss for each segment. Then I accumulate the 60 segment values to get a running total. Plotting this gives some sort of line, since you're adding up the individual intervals, and I then apply the two functions (Pearson R² and Sharpe ratio) to this line. My goal is to find portfolios that are linear in fashion, because in my eyes, if a portfolio has been linear for the past 1200 days, it should have a high chance of continued linearity.
So when I sort my list of portfolios by highest Sharpe ratio, I can choose one of the top portfolios, and as you can see from the pic of the graph attached, it is fairly consistent and linear, as opposed to another portfolio that, when graphed, appears sinusoidal or erratic, with large jumps and dips.
My question is, can anyone give me more information on how valid my theory of using linearity to predict continued success is? Is there a different equation that I should be using to determine which portfolio to invest in going forward? Is there something better at measuring the linearity of my portfolio graphs than the R² function or the Sharpe ratio?
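
For readers who want to reproduce the steps described above outside Excel, here is a minimal sketch in Python with NumPy. It is not the poster's actual spreadsheet: the function name `linearity_metrics` and the simulated data are hypothetical, and it assumes the Sharpe-style ratio is taken over the 20-day segment profits while R² is taken between the time index and the cumulative curve.

```python
import numpy as np

def linearity_metrics(daily_pnl, segment=20):
    """Return (r_squared, sharpe) for one portfolio's daily profit/loss series."""
    daily_pnl = np.asarray(daily_pnl, dtype=float)
    n_segments = len(daily_pnl) // segment
    # Net profit/loss per segment (any incomplete tail is dropped).
    seg_pnl = daily_pnl[: n_segments * segment].reshape(n_segments, segment).sum(axis=1)
    # Running total of the segment values: the cumulative "line".
    equity = np.cumsum(seg_pnl)
    t = np.arange(n_segments)
    # Squared Pearson correlation between time and the cumulative curve.
    r = np.corrcoef(t, equity)[0, 1]
    r_squared = float(r ** 2)
    # Sharpe-style ratio as defined in the post: mean / standard deviation.
    # Assumption: computed on segment profits, with the sample std (ddof=1).
    sharpe = float(seg_pnl.mean() / seg_pnl.std(ddof=1))
    return r_squared, sharpe

# Simulated (made-up) portfolios: same average daily profit, different noise.
rng = np.random.default_rng(0)
steady = rng.normal(loc=1.0, scale=0.5, size=1200)
erratic = rng.normal(loc=1.0, scale=20.0, size=1200)

r2_steady, sharpe_steady = linearity_metrics(steady)
r2_erratic, sharpe_erratic = linearity_metrics(erratic)
print(f"steady:  R^2={r2_steady:.4f}  Sharpe={sharpe_steady:.2f}")
print(f"erratic: R^2={r2_erratic:.4f}  Sharpe={sharpe_erratic:.2f}")
```

Both numbers describe only the past 1200 days; on their own they do not show that the pattern will continue.
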

I was also thinking of using a portfolio that ranks high in (average sum)/(loss), or choosing a portfolio that has the fewest losses in it. Typically, though, I have observed that those have a lot of flat areas when graphed, and I'm not sure that would be best for future success.
Thanks!!

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Welcome to the forum!

This is not my area at all, but I would wonder if you should make your selection on which model performs best on a small holdout set or use the holdout to evaluate the best fitting method?

Also how often can there be exogenous shocks to the system and how do you address these - using an indicator variable?

#### nhb

##### New Member
Hi, thanks! You bring up some good points. First and foremost, do you think this type of question should be in a different forum category? I know some forums are really specific, so I don't want to step on anyone's toes by placing this question in the wrong spot :/

But anyway, you bring up a good idea with the small holdout set. Can you explain more? I think at one point I had a similar idea, but eventually you go around in circles, because you end up fighting the question of "how many X days of past data is good enough for a future horizon of Y days?", and that question in and of itself will have endless answers, since there likely isn't an easy, consistent one. But perhaps your idea is a bit different, so would you mind explaining?

In terms of exogenous shocks or anomalies: I don't plan on stopping them, because I don't think there is a theoretical way to do that, so I choose to just limit their effect. I do that by mining a bunch of different portfolio investments, making cumulative line graphs, and characterizing each line. I think that is the best way because, in my eyes, a line that is consistent or linear will have a higher chance of being successful, and linear regression is the tool designed to measure exactly that.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Well, if you are fitting a linear model, you could hold out a few recent observations, then fit different models and make your selection based on the trade-off between simplicity and mean squared error on the holdout data.
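
A rough sketch of that holdout idea in Python with NumPy, for illustration only. The function name `holdout_mse`, the choice of polynomial trends as the candidate models, the holdout size of 10 points, and the simulated cumulative curve are all assumptions, not something specified in this thread.

```python
import numpy as np

def holdout_mse(y, degree, n_holdout=10):
    """Fit a polynomial trend on all but the last n_holdout points,
    then return the mean squared error on those held-out points."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    t_train, y_train = t[:-n_holdout], y[:-n_holdout]
    t_test, y_test = t[-n_holdout:], y[-n_holdout:]
    coeffs = np.polyfit(t_train, y_train, degree)
    pred = np.polyval(coeffs, t_test)
    return float(np.mean((y_test - pred) ** 2))

# Simulated (made-up) cumulative profit curve with 60 points.
rng = np.random.default_rng(1)
equity = np.cumsum(rng.normal(loc=20.0, scale=5.0, size=60))

# Compare a straight line against more flexible trends on the holdout points.
scores = {degree: holdout_mse(equity, degree) for degree in (1, 2, 3)}
best_degree = min(scores, key=scores.get)
for degree, mse in scores.items():
    print(f"degree {degree}: holdout MSE = {mse:.1f}")
print("lowest holdout MSE at degree", best_degree)
```

If a more complex model only beats the straight line by a small margin on the holdout points, the simpler model is usually the safer pick.
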

This forum category is fine. We aren't as finicky as some other forums.

#### nhb

##### New Member
Ah I see what you're saying. So take a sample, run it against random other data, and see if there is any common ground, right?

Have you heard of Granger and Newbold (1974)? It's an article about autocorrelation and the Durbin-Watson statistic. Essentially, I THINK it says that using past values of a series (previous points of a linear regression) to predict its future can be flawed and skewed, but I am not knowledgeable enough in this area to know better. I will read up more on it.
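
As a hedged illustration of the Durbin-Watson statistic mentioned above, the sketch below computes it on the residuals of a straight-line fit to a simulated cumulative curve, and on the de-meaned segment profits for comparison. The helper names `durbin_watson` and `trend_residuals` and the simulated data are hypothetical. Values near 2 suggest little first-order autocorrelation in the residuals, while values well below 2 suggest positive autocorrelation.

```python
import numpy as np

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

def trend_residuals(curve):
    """Residuals of an ordinary least-squares straight line fitted against time."""
    curve = np.asarray(curve, dtype=float)
    t = np.arange(len(curve))
    slope, intercept = np.polyfit(t, curve, 1)
    return curve - (slope * t + intercept)

# Simulated (made-up) data: 60 independent segment profits and their running total.
rng = np.random.default_rng(2)
seg_pnl = rng.normal(loc=20.0, scale=30.0, size=60)
equity = np.cumsum(seg_pnl)

dw_equity = durbin_watson(trend_residuals(equity))
dw_increments = durbin_watson(seg_pnl - seg_pnl.mean())
print(f"Durbin-Watson, line fit to cumulative curve: {dw_equity:.2f}")
print(f"Durbin-Watson, de-meaned segment profits:    {dw_increments:.2f}")
```

Because a cumulative sum carries every earlier value forward, its residuals around a trend line tend to be strongly autocorrelated even when the underlying segment profits are independent, so a high R² on the cumulative curve should be read with caution.
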