
Thread: How to optimally split a continuous variable for ANOVA?

  1. #1
    Points: 2,867, Level: 32
    Level completed: 78%, Points required for next Level: 33

    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    How to optimally split a continuous variable for ANOVA?



    Hi,

    Which segmentation method could I use to optimally (*) split a continuous variable into two groups in order to run an ANOVA?
    (Besides exploratory approaches such as boxplots and so on.)

    (*) That is to say, so as to maximize the between-group variance.

    vincent

  2. #2
    Bhoot
    Points: 1,270, Level: 19
    Level completed: 70%, Points required for next Level: 30

    Posts
    1,758
    Thanks
    40
    Thanked 124 Times in 106 Posts
    If the continuous variable is the independent variable (IV), then you can choose the split from a scatter plot of IV against DV. If you can see two clusters in the graph, the split is easy to find.

    Or split the IV at mean(IV).
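
A minimal sketch of the mean split in Python (the thread asks about SAS or R; the same few lines port directly). The data values and the function name `split_at_mean` are made up for illustration and are not from the thread.

```python
from statistics import mean

def split_at_mean(iv, dv):
    """Split the DV values into two groups at the mean of the IV."""
    cut = mean(iv)
    low = [y for x, y in zip(iv, dv) if x <= cut]   # IV at or below the mean
    high = [y for x, y in zip(iv, dv) if x > cut]   # IV above the mean
    return cut, low, high

# Hypothetical data: firm size (employees) and an economic ratio.
size = [5, 8, 12, 20, 150, 300, 420, 800]
ratio = [0.10, 0.12, 0.11, 0.15, 0.30, 0.28, 0.35, 0.33]

cut, low, high = split_at_mean(size, ratio)
print(cut, len(low), len(high), mean(low), mean(high))
```

With a right-skewed IV such as firm size, the mean is pulled upward by the largest firms, so the two groups can come out quite unequal in size.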
    In the long run, we're all dead.

  3. #3

    Thanks, but how can I do it without a graphical tool?

    I would like to avoid an approximate graphical solution, because I need to apply the method automatically to many different samples. So I am looking for an algorithmic solution that gives me threshold values.

  4. #4
    Bhoot
    I am not clear about your objective. If you want an algorithmic solution, you can use a clustering technique: k-means clustering with k=2.
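
For a single variable, k-means with k=2 reduces to finding one threshold halfway between two centroids. The sketch below is a plain-Python version of Lloyd's iteration under that assumption; the function name `two_means_1d`, the starting centroids (minimum and maximum) and the data are illustrative choices, not something specified in the thread. Note that it clusters the IV alone and never looks at the DV.

```python
def two_means_1d(values, max_iter=100):
    """k-means with k=2 on one variable; returns (threshold, (c_low, c_high))."""
    xs = sorted(values)
    c_low, c_high = xs[0], xs[-1]            # start from the two extremes
    for _ in range(max_iter):
        mid = (c_low + c_high) / 2           # boundary between the clusters
        low = [x for x in xs if x <= mid]
        high = [x for x in xs if x > mid]
        if not low or not high:              # degenerate case: all values equal
            break
        new_low = sum(low) / len(low)
        new_high = sum(high) / len(high)
        if (new_low, new_high) == (c_low, c_high):
            break                            # assignments no longer change
        c_low, c_high = new_low, new_high
    return (c_low + c_high) / 2, (c_low, c_high)

# Hypothetical firm sizes (number of employees).
size = [5, 8, 12, 20, 150, 300, 420, 800]
threshold, centroids = two_means_1d(size)
print(threshold, centroids)
```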

  5. #5

    My objective

    My objective is simpler than what a cluster analysis would provide.

    My population consists of observations of firms.
    My DV is an economic ratio of these individual firms.
    My IV is the size of these firms (the unit is the number of employees).

    I have good reasons to think (theory, looking at boxplots) that the economic ratio is related to the size of the firms. But because of heteroscedasticity, outliers and other specific needs, I would prefer to run an ANOVA after partitioning the IV into two classes with a cut-off point.

    I would like to use an algorithm to choose the cut-off point that optimally divides my population so as to maximise the between-group variance (*), in SAS or R so that I can program it.


    Gratefully yours,
    vincent

    (*) Empirically, I have already sorted my observations in ascending order of the IV and calculated sliding means as each observation moves from one group to the other, in order to identify the maximal gap between the two group means. But this is neither statistical enough nor automatic.
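
The sliding-means procedure described in the footnote can be automated as an exhaustive scan over cut-off points. The sketch below is one possible version in Python (the logic carries over to R or SAS). It is an assumption that "optimal" means maximizing the between-group sum of squares of the DV, which for two groups equals n1*n2/n * (mean1 - mean2)^2. This weights the gap between the means by the group sizes, so it is slightly different from maximizing the raw gap. The function name `best_cutoff` and the data are hypothetical.

```python
def best_cutoff(iv, dv):
    """Try every cut between consecutive sorted IV values and return
    (cut, ssb) for the cut with the largest between-group sum of squares."""
    pairs = sorted(zip(iv, dv))              # sort observations by IV
    n = len(pairs)
    total = sum(y for _, y in pairs)
    best_cut, best_ssb = None, -1.0
    left_sum = 0.0
    for i in range(1, n):                    # i = size of the lower group
        left_sum += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue                         # no cut between tied IV values
        mean_low = left_sum / i
        mean_high = (total - left_sum) / (n - i)
        # Between-group sum of squares for two groups.
        ssb = i * (n - i) / n * (mean_low - mean_high) ** 2
        if ssb > best_ssb:
            best_ssb = ssb
            # Report the midpoint between the two neighbouring IV values.
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut, best_ssb

# Hypothetical data: firm size (employees) and an economic ratio.
size = [5, 8, 12, 20, 150, 300, 420, 800]
ratio = [0.10, 0.12, 0.11, 0.15, 0.30, 0.28, 0.35, 0.33]

cut, ssb = best_cutoff(size, ratio)
print(cut, ssb)
```

Because the total sum of squares is fixed, the cut that maximizes the between-group sum of squares is also the one that minimizes the within-group sum of squares, and the one that maximizes the ANOVA F statistic for a two-group split.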

  6. #6
    Bhoot

    I don't think that splitting the DV is a good idea. When you put it into two groups (assigning an indicator variable of 0 or 1), you reduce the information in the DV. That is my personal opinion. If you find this method useful, please let me know.

    You can't use discriminant analysis, because the groups are not pre-determined.
    I guess it is one-time work. I still feel the k-means algorithm will help you.
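
The information-loss concern can be illustrated with a small Python sketch that compares how much of the variance in one variable is explained by a continuous variable versus its 0/1 dichotomized version. Everything here is an assumption for illustration: the data are synthetic and roughly linear, the split is at the median, and `r_squared` is a hypothetical helper computing the squared Pearson correlation (which, for a 0/1 indicator, is the share of variance explained by a two-group ANOVA).

```python
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Synthetic data: y is roughly linear in x, with a small alternating error.
x = list(range(1, 21))
y = [2 * v + (1 if v % 2 else -1) for v in x]

# Dichotomized version of x: 0 for the lower half, 1 for the upper half.
indicator = [1 if v > 10 else 0 for v in x]

r2_continuous = r_squared(x, y)
r2_split = r_squared(indicator, y)
print(r2_continuous, r2_split)
```

On data like these, the continuous variable explains more of the variance than the two-group indicator does. With a strongly non-linear or step-like relationship the gap can be much smaller.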
