Linear regression vs logistic regression

#1
I have a time series dataset.

X (the independent variable) is time, denoted 1, 2, 3, 4, 5, 6, ..., 1000. Y (the dependent variable) is a percentage, e.g. 99%, 98.7%, 96%, 91%, etc. The data are continuous.

I have 1000 such data points. The first 700 are used as the training set and the remaining 300 are used for testing.

I tried simple linear regression, but some of the predictions come out above 100%. It gets even worse when I calculate the confidence interval and prediction interval.

So I tried to use logistic regression, since the values are bounded (from 0% to 100%). But logistic regression only takes a binary response. I am confused about how to convert my existing time series data appropriately so that I can try logistic regression on it.
 

Dragan

Super Moderator
#2
The easiest answer is to "censor" your data: convert percentages (the dependent variable) from 70% to 100% into scores of 1, and percentages below 70% into scores of 0. You could then use binary logistic regression.
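
For illustration, the conversion described above might look roughly like the sketch below. The data are synthetic stand-ins (the original series was not posted), the variable names are hypothetical, and the plain gradient-descent fit is just one simple way to estimate a binary logistic regression; any standard logistic regression routine would do the same job.

```python
import numpy as np

# Hypothetical stand-in for the thread's data: 1000 time steps with a
# noisy, slowly declining percentage (the real series is not available).
rng = np.random.default_rng(0)
t = np.arange(1, 1001, dtype=float)
pct = np.clip(100.0 - 0.04 * t + rng.normal(0.0, 5.0, size=t.size), 0.0, 100.0)

# Convert to a binary response at the 70% cutoff: 1 if >= 70%, else 0.
y = (pct >= 70.0).astype(float)

# Same split as in the question: first 700 for training, last 300 for testing.
t_train, t_test = t[:700], t[700:]
y_train, y_test = y[:700], y[700:]

# Standardise time with training-set statistics so gradient descent behaves.
mu, sd = t_train.mean(), t_train.std()
X_train = np.column_stack([np.ones(t_train.size), (t_train - mu) / sd])
X_test = np.column_stack([np.ones(t_test.size), (t_test - mu) / sd])


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Fit the logistic regression by gradient descent on the mean log-loss.
beta = np.zeros(2)
for _ in range(5000):
    p = sigmoid(X_train @ beta)
    beta -= 0.5 * X_train.T @ (p - y_train) / y_train.size

# Predicted probability that the percentage is at or above 70%.
p_test = sigmoid(X_test @ beta)
print("coefficients (intercept, scaled time):", beta)
print("first five test probabilities:", p_test[:5])
```

Note that the model now predicts the probability of being at or above 70%, not the percentage itself, so the detail within each group is lost.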
 

noetsi

No cake for spunky
#3
If you have time series data, it's questionable whether either linear or logistic regression is ideal. Something like ARDL (an autoregressive distributed lag model) might be better.
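
To give a rough idea of what a model of that kind does, here is a bare-bones sketch: an ordinary least squares regression of Y on its own previous value plus a time trend, which is about the simplest member of the autoregressive family that ARDL belongs to. It is not a full ARDL specification, the data are synthetic, and all names are hypothetical; in practice a dedicated time series library would be the more usual route.

```python
import numpy as np

# Hypothetical series: autocorrelated noise around a slow decline
# (a stand-in, since the original data were not posted).
rng = np.random.default_rng(1)
n = 1000
t = np.arange(1, n + 1, dtype=float)
e = np.zeros(n)
for i in range(1, n):
    e[i] = 0.8 * e[i - 1] + rng.normal(0.0, 1.0)
y = 95.0 - 0.03 * t + e

train = 700  # first 700 points for fitting, last 300 held out

# Design matrix: intercept, time trend, and the previous value of y.
X_train = np.column_stack([np.ones(train - 1), t[1:train], y[:train - 1]])
coef, *_ = np.linalg.lstsq(X_train, y[1:train], rcond=None)

# One-step-ahead predictions on the held-out points, each using the
# actually observed previous value of y.
X_test = np.column_stack([np.ones(n - train), t[train:], y[train - 1:n - 1]])
pred = X_test @ coef
rmse = np.sqrt(np.mean((pred - y[train:]) ** 2))

print("intercept, trend, lag coefficient:", coef)
print("one-step-ahead RMSE on the test set:", rmse)
```

The lagged term is what lets the model use the dependence between neighbouring observations, which plain regression on time ignores. On its own it does not keep predictions inside 0% to 100%.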