
Thread: Regression: DV is Revenue and IDV is cost. (Very High MAPE)

  1. #1
    vikash_124124

    I need help with a modelling assignment.
    My dependent variable is Revenue, and the independent variables are Cost plus some others such as Temperature.

    My concern is how to improve the MAPE, given that Revenue does not follow any particular pattern with respect to Cost.
    For example, there are many instances where the cost is very low but the revenue is quite high. In other cases, the revenue looks reasonable for the given cost.

    The MAPE is coming out at around 60% to 70%, which is not letting me sleep well.

    Please help.
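
For reference, MAPE averages |actual - predicted| / |actual|, so rows with a small actual revenue can dominate it. A minimal Python sketch, assuming no actual value is zero; the numbers are made up for illustration and are not the poster's data:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical revenues and predictions (illustrative only).
actual = [100.0, 200.0, 10.0]
predicted = [110.0, 180.0, 20.0]

# Per-row percentage errors are 10%, 10% and 100%.
result = mape(actual, predicted)
print(round(result, 2))
```

The third row is off by only 10 units, yet it contributes most of the total because its actual value is small.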

  2. #2
    rogojel (TS Contributor)
    Location: I work in Europe, live in Hungary

    hi,
    I think this is not unusual; many factors besides cost drive revenue. You might get a lower MAPE if you introduce some more factors, such as industry type (if you haven't already), but a high variation is to be expected in any case, imo.

    regards

  3. The Following User Says Thank You to rogojel For This Useful Post:

    vikash_124124 (03-10-2016)

  4. #3
    vikash_124124

    My main concern for now is to reduce the MAPE using only Cost as an IDV, before I start adding other variables, because it is the most important factor and should explain the variation in Revenue. Right?

    But how should one go about improving the MAPE? Is a DV transformation coupled with an IDV transformation an option?

    Also, if I transform the DV and/or the IDVs, will I get the right results when I use the regression equation to predict Revenue from the Cost value alone?

  5. #4
    rogojel (TS Contributor)

    Hi,
    the obvious way would be to plot the residuals and check whether they show any pattern. With any luck, you might find some nonlinearity or other pattern that you can address, e.g. a funnel shape that a log transform can fix. But there are limits to what you can do.

    regards
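
The residual check described above can be sketched without plotting. This is only an illustration under stated assumptions: the data are synthetic (revenue with an error proportional to cost), and `fit_line` and `spread_ratio` are hypothetical helper names, not functions from any library. It compares the average residual size in the upper and lower halves of the fitted values, before and after taking logs.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def spread_ratio(x, y):
    """Mean |residual| in the upper half of fitted values divided by the lower half."""
    a, b = fit_line(x, y)
    pairs = sorted((a + b * xi, abs(yi - (a + b * xi))) for xi, yi in zip(x, y))
    half = len(pairs) // 2
    low = sum(r for _, r in pairs[:half]) / half
    high = sum(r for _, r in pairs[half:]) / (len(pairs) - half)
    return high / low


# Synthetic data: revenue is 3 * cost, pushed 30% down or up alternately,
# so the absolute error grows with cost (a funnel shape).
cost = [float(c) for c in range(1, 21)]
revenue = [3.0 * c * (1.3 if i % 2 else 0.7) for i, c in enumerate(cost)]

raw_ratio = spread_ratio(cost, revenue)
log_ratio = spread_ratio([math.log(c) for c in cost],
                         [math.log(r) for r in revenue])
print(raw_ratio, log_ratio)
```

A ratio well above 1 on the raw scale and close to 1 on the log scale is the numeric counterpart of a funnel that the log transform removes. On real data a residual-versus-fitted plot is still the more informative check.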

  6. The Following User Says Thank You to rogojel For This Useful Post:

    vikash_124124 (03-11-2016)

  7. #5
    vikash_124124

    Hi,
    Thanks for your reply. We tried plotting the fit chart, i.e. the original Revenue values along with the predicted values. The fit chart looks random, although it shows some seasonality. We tried STL decomposition and reseasonalizing the data, but the result was miserable.
    I will try the residual plot and see if it gives a new dimension to my thought process.

    Another question I would like to ask: if I transform the DV, is it true that I will probably lose the ability to show the effect of the individual independent variables on the DV?

    For example, if I use the following transformation: Log(Revenue) ~ Log(sqrt(Cost))
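
A side note on that formula, which follows from the log rules alone and not from anything in the data: the square root inside the log only rescales the predictor. Writing a and b for the intercept and slope (names introduced here for illustration):

```latex
\log\sqrt{\text{Cost}} = \tfrac{1}{2}\,\log(\text{Cost})
\quad\Longrightarrow\quad
\log(\text{Revenue}) = a + b\,\log\sqrt{\text{Cost}}
                     = a + \tfrac{b}{2}\,\log(\text{Cost})
```

So Log(Revenue) ~ Log(sqrt(Cost)) gives the same fitted values as Log(Revenue) ~ Log(Cost); only the slope differs, by a factor of two.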

  8. #6
    rogojel (TS Contributor)

    Hi,
    yes, that is true, because the effect of an IV on the transformed variable does not translate directly to an effect on the untransformed DV: Mean(Log(DV)) != Log(Mean(DV)), so an increase of B1 units in the mean of Log(Revenue) does not translate to an increase by a factor of exp(B1) in the mean Revenue. It does, however, hold for the median, so you have to interpret the effect in terms of the median rather than the mean if you transform the DV.

    regards
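
A small numeric illustration of the point above, using three made-up revenue values (not from the thread). Exponentiating the mean of the logs gives the geometric mean, which here equals the median and is well below the arithmetic mean:

```python
import math
import statistics

# Hypothetical revenues, chosen to be symmetric on the log scale.
revenue = [1.0, 10.0, 100.0]

mean_of_logs = statistics.mean([math.log(r) for r in revenue])
log_of_mean = math.log(statistics.mean(revenue))

# Back-transforming the mean of the logs recovers the geometric mean.
back_transformed = math.exp(mean_of_logs)

print(back_transformed, statistics.median(revenue), statistics.mean(revenue))
```

The match with the median is exact here only because the values are symmetric on the log scale; with real data it holds approximately when the log-scale errors are roughly symmetric.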

  9. The Following User Says Thank You to rogojel For This Useful Post:

    vikash_124124 (03-11-2016)

  10. #7
    noetsi (Fortran must die)

    I don't really agree with the logic that you should solve a univariate relationship first before you run a multivariate model. Something in the multivariate model might show why the univariate relationship is performing as it is, and commonly univariate models are meaningless in the context of a multivariate approach. You could find a positive slope for a univariate model and find the relationship is negative in the multivariate approach.
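
The sign reversal mentioned above can be reproduced with a small constructed data set. Everything here is hypothetical: `segment` stands in for some second variable, the numbers are invented so that revenue falls with cost within each segment, and `solve` and `ols` are hand-written helpers, not library calls.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Constructed data: revenue = 5 + 20 * segment - cost, and cost rises with segment.
segment = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
cost = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 11.0, 12.0, 13.0]
revenue = [5.0 + 20.0 * s - c for s, c in zip(segment, cost)]

univariate = ols([cost], revenue)             # [intercept, cost slope]
multivariate = ols([cost, segment], revenue)  # [intercept, cost slope, segment slope]

print("cost slope alone:", univariate[1])
print("cost slope with segment:", multivariate[1])
```

The cost slope is positive when cost is the only predictor and negative once the second variable is included, because cost is acting as a proxy for that variable in the one-predictor fit.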
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  11. The Following User Says Thank You to noetsi For This Useful Post:

    vikash_124124 (03-11-2016)

  12. #8
    vikash_124124

    Hi noetsi,
    I agree with you. But isn't it true that the majority of the variation in Revenue should be explained by the Cost variable alone, rather than by a mix of Cost and other variables?
    I mean, in any industry and for any company, the cost to the company (in this case, say, "marketing spend") should be the first parameter, and it should explain the majority of the variability in Revenue. Please correct me if I am wrong.
    Regards

  13. #9
    noetsi (Fortran must die)

    That is a substantive question I am not qualified to answer. But consider two things. First, it really does not hurt to check that cost is in fact driving revenue. Look at the R-squared value. If it's 15 percent, then does that really show cost is driving revenue? Second, and related to the first, industry assumptions are often wrong. People believe things that turn out, when inspected, to be badly wrong. There could also be interaction effects, that is, some other variable may influence the impact of cost on revenue, or the relationship may be nonlinear, rising quickly at some level of the predictor or of some other variable, and so on.

    It does not hurt to look. You might discover the common wisdom is incomplete.
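
A minimal sketch of the R-squared check suggested above, for a cost-only straight-line fit. The data are invented for illustration, and `r_squared` is a hypothetical helper name:

```python
def r_squared(x, y):
    """R-squared of the least-squares line y = a + b*x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data in which cost says very little about revenue.
cost = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
revenue = [50.0, 10.0, 40.0, 20.0, 60.0, 30.0]

r2 = r_squared(cost, revenue)
print(round(r2, 4))
```

A value this close to zero would mean cost alone explains almost none of the variation, which is the situation where the assumption that cost drives revenue deserves a second look.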
