Regression/Time Series

Hello all,

I am looking for research help.

There is a data set and the goal is to build a model.

There are events that happen. 229 of these events have been recorded and more of these events will happen in the future.

Each event has an outcome. The outcome is recorded for the previous 229 events, and the goal is to predict the outcomes of future events with as much certainty as possible. It is not critical to make a prediction every time, but it is important to be certain when one is made.

Each event's outcome is predicted over a time interval. This variable is scaled to 0-100% and indicates where in the timeline each prediction is made.

There are 229 stocks from the stock market that have been recorded. The goal is to predict what the opening price will be the next morning.

Throughout yesterday's trading day, the price of each stock was recorded at the times when volume changed significantly. This was done until the stock market closed. This morning's price is also recorded, as Open_Price.

Variable View:
StockID, Numeric, 1-229
Open_Price, Numeric
Point_Time, Date
Point_Price, Numeric
Point_Time_Segment, Numeric, (0-1]

Point_Time_Segment is a variable created from Point_Time.
The first data point is recorded as 1 divided by the number of observations for that stock, the second as 2 divided by the number of observations for that stock, and so on, so the closing data point has Point_Time_Segment = 1.
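
The rule just described (observation i out of n gets the value i/n) can be sketched in a few lines of Python. The function name `point_time_segments` is made up for illustration; it is not part of the data set or of any package.

```python
def point_time_segments(n_observations):
    """Return i / n for i = 1..n, so the closing observation gets 1.0.

    Assumes the observations of one stock are already in time order.
    """
    if n_observations <= 0:
        raise ValueError("need at least one observation")
    return [i / n_observations for i in range(1, n_observations + 1)]


# A stock with 4 recorded points during the day
print(point_time_segments(4))  # [0.25, 0.5, 0.75, 1.0]
```

The result always lies in (0, 1], which matches the range listed in the variable view.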

The goal is to predict whether tomorrow morning's opening price will be higher or lower than the closing price, and to do this with the most certainty possible.

In a sense, there are 229 of the following graphs. The red point in each picture is the next day's opening price. This is the value to predict. But again, the goal is simply to be certain whether the price will be higher or lower than the closing price.

So let's set a timeline and say the open price occurs at t+1.

The close of the previous day is t, and the price before that is t-1.

Create a depvar (t+1 price - t price); its sign gives you your above/below variable. Then get busy with your hypotheses.

Run a regression of depvar (which, coded as above/below, is binary, you should note) on, say, whether the previous trade was a buy (up move) or a sell (down move), so:

depvar=a+b(some measure of t to t-1 deviation)+ error

That is a pretty simple model that can easily be expanded, e.g. a dummy that equals 1 if the last two trades were up (or some such), or a dummy for whether the price went down by 5% the previous day (1 or 0): does that lead to a continued drop or a bounce, etc. Or use price changes if you prefer continuous variables.
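
As a rough illustration of the model above, here is a minimal Python sketch that fits depvar = a + b*x + error by ordinary least squares, with x taken as the price change from t-1 to t. The helper names `last_move` and `fit_line` are made up for this example. The inputs are the last two recorded prices and the opening price of the three stocks posted later in this thread. Plain least squares on a 0/1 outcome is only one possible choice (a logistic regression is a common alternative), and three rows are far too few to conclude anything.

```python
def last_move(prices):
    """Price change from t-1 to t, i.e. between the last two recorded prices."""
    return prices[-1] - prices[-2]


def fit_line(x, y):
    """Least-squares estimates (a, b) for y = a + b*x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is not identified")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Last two recorded prices (t-1, t) and next-morning open,
# taken from the three stocks posted later in this thread.
stocks = {
    1: {"prices": [16.24, 16.24], "open": 21.52},
    2: {"prices": [15.20, 15.28], "open": 16.64},
    3: {"prices": [15.48, 15.48], "open": 14.08},
}

x = [last_move(s["prices"]) for s in stocks.values()]
# Binary outcome: 1 if the open is above the close, else 0
y = [1 if s["open"] > s["prices"][-1] else 0 for s in stocks.values()]

a, b = fit_line(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With all 229 stocks, x and y would each have 229 entries, and further regressors (dummies for the last two moves, a 5% drop, and so on) would call for a multiple regression routine rather than this one-variable helper.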

vinoverde, I hear what you are saying, and I can create this variable (depvar), but please look at the following and let me know.

StockID Open_Price_Today Point_Time_Segment Point_Price
1 21.52 0.143478261 16.12
1 21.52 0.304347826 16.12
1 21.52 0.47826087 16.12
1 21.52 0.52173913 16.12
1 21.52 0.565217391 16.08
1 21.52 0.826086957 16.24
1 21.52 0.913043478 16.24
1 21.52 0.956521739 16.24
1 21.52 1 16.24
2 16.64 0.12 15.28
2 16.64 0.32 15.4
2 16.64 0.4 15.4
2 16.64 0.52 15.2
2 16.64 0.6 15.2
2 16.64 0.72 15.2
2 16.64 0.8 15.2
2 16.64 0.88 15.2
2 16.64 0.96 15.2
2 16.64 1 15.28
3 14.08 0.379310345 15.52
3 14.08 0.413793103 15.52
3 14.08 0.827586207 15.48
3 14.08 0.862068966 15.48
3 14.08 0.896551724 15.48
3 14.08 0.931034483 15.48
3 14.08 0.965517241 15.48
3 14.08 1 15.48

Let me explain. The last row is StockID 3: this morning's price was 14.08, and it closed yesterday at 15.48. The third column represents time. Market hours run from 9:30 a.m. to 4:00 p.m., so 9:30 a.m. corresponds to 0 and 4:00 p.m. to 1; the day is expressed as a ratio. That is how I know it closed at 15.48: the time is equal to 1. Shortly before the close, the last recorded time was 0.965517241 of the day, also at 15.48.

I am not sure how to do this in SPSS, Minitab, or Excel. These are the statistics packages I have access to.
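
The clock-to-ratio mapping described here (9:30 a.m. is 0, 4:00 p.m. is 1) could look roughly like this in Python. The function name `day_fraction` and the two constants are assumptions made for the example.

```python
from datetime import time

MARKET_OPEN = time(9, 30)   # maps to 0
MARKET_CLOSE = time(16, 0)  # maps to 1


def _minutes(t):
    """Minutes since midnight for a datetime.time value."""
    return t.hour * 60 + t.minute + t.second / 60


def day_fraction(t):
    """Fraction of the trading day elapsed at clock time t."""
    span = _minutes(MARKET_CLOSE) - _minutes(MARKET_OPEN)
    frac = (_minutes(t) - _minutes(MARKET_OPEN)) / span
    if not 0 <= frac <= 1:
        raise ValueError("time is outside market hours")
    return frac


print(day_fraction(time(12, 45)))  # 0.5
```

The trading day is 390 minutes long, and 12:45 p.m. is 195 minutes after the open, hence 0.5.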

vinoverde, is this what you are saying? Create a new variable called depvar, equal to today's morning price minus yesterday's closing price. This variable ranges from negative something to positive something. Then I make another variable, binary_depvar: it is 0 if depvar is less than or equal to zero and 1 if it is positive. So now that the dependent variable is set, the data is like the following. vinoverde, how do I run it with SPSS?
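
For reference, the two variables described in this post can be built as in the following Python sketch. It uses the last two rows of each of the three stocks posted earlier and takes the row with Point_Time_Segment = 1 as the close. The function name `build_depvars` is made up for illustration; the same logic could be reproduced with computed columns in SPSS, Minitab, or Excel.

```python
# (StockID, Open_Price, Point_Time_Segment, Point_Price)
# Last two rows of each stock from the table posted earlier.
rows = [
    (1, 21.52, 0.956521739, 16.24),
    (1, 21.52, 1.0, 16.24),
    (2, 16.64, 0.96, 15.20),
    (2, 16.64, 1.0, 15.28),
    (3, 14.08, 0.965517241, 15.48),
    (3, 14.08, 1.0, 15.48),
]


def build_depvars(rows):
    """Map StockID -> (depvar, binary_depvar), using the closing row only."""
    result = {}
    for stock_id, open_price, segment, point_price in rows:
        if segment == 1.0:  # closing observation
            depvar = open_price - point_price
            binary_depvar = 1 if depvar > 0 else 0
            result[stock_id] = (depvar, binary_depvar)
    return result


for stock_id, (depvar, binary_depvar) in build_depvars(rows).items():
    print(stock_id, round(depvar, 2), binary_depvar)
```

On these rows, stocks 1 and 2 opened above their close (binary_depvar = 1) and stock 3 opened below it (binary_depvar = 0).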
Here is the latest update.

The range from 0 to 1 represents yesterday's prices up to the close at time = 1. The point at time = 1.5 is this morning's price, which needs to be predicted. I am still working on building a model; I know that right now the plot looks like a tangled mess :). Help is welcome.
The opening market price of a share does not depend on time alone. It also depends on the number of shares traded across various companies and on the state of the economy. Only with those can a model be developed.
Thanks for the reply.

What if I don't know the volume? The items I'm dealing with do not readily provide volume; maybe there is a way to estimate approximate volume? Right now I have prices over time and would like to predict the "next day opening price". I put this in quotation marks because the item has a final value and that is it.