
Thread: Representing a distance measure with math equation

  1. #1
    ggplot2orBust
    Points: 71,220, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    User with most referrers
    trinker's Avatar
    Location
    Buffalo, NY
    Posts
    4,417
    Thanks
    1,811
    Thanked 928 Times in 809 Posts

    Representing a distance measure with math equation




    I'm working on presenting a distance measure at a conference but don't have a publication-quality math formula for what I'm doing in R. Below I have the problem laid out in its simplest form, and I would like help representing it as a mathematical expression of the type expected for publication (i.e., you can use LaTeX to show it).

    I have data with two codes that run along intervals across time, as seen in the data frame below and represented in the Gantt plot as well:



    Code: 
      code start end    Start      End variable
    1    A   159 200 00:02:39 00:03:20        x
    2    A   391 420 00:06:31 00:07:00        x
    3    A   539 580 00:08:59 00:09:40        x
    4    A   599 660 00:09:59 00:11:00        x
    5    A   763 796 00:12:43 00:13:16        x
    6    B   180 225 00:03:00 00:03:45        x
    7    B   300 339 00:05:00 00:05:39        x
    8    B   599 600 00:09:59 00:10:00        x
    9    B   719 781 00:11:59 00:13:01        x

    For every A interval (any location along A) we will find the nearest B interval (any location along B) and calculate the distance between them.

    So...

    Code: 
    A1 = overlap     =  0
    A2 = |391 - 339| = 52
    A3 = |580 - 599| = 19
    A4 = overlap     =  0
    A5 = overlap     =  0
    In this related post Bryan discusses interval arithmetic, but I'm not sure how to write, in a journal-appropriate way, that for every A code I want the minimum distance to the nearest B code.

    I'm thinking something like:



    The data:

    Code: 
    dat <- structure(list(code = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 
        2L, 2L), .Label = c("A", "B"), class = "factor"), start = c(159, 
        391, 539, 599, 763, 180, 300, 599, 719), end = c(200, 420, 580, 
        660, 796, 225, 339, 600, 781), Start = structure(c(0.00184027777777778, 
        0.00452546296296296, 0.00623842592592593, 0.00693287037037037, 
        0.00883101851851852, 0.00208333333333333, 0.00347222222222222, 
        0.00693287037037037, 0.00832175925925926), format = "h:m:s", class = "times"), 
            End = structure(c(0.002314815, 0.00486111111111111, 
            0.00671296296296296, 0.00763888888888889, 0.00921296296296296, 
            0.00260416666666667, 0.00392361111111111, 0.00694444444444444, 
            0.00903935185185185), format = "h:m:s", class = "times"), 
            variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
            ), .Label = "x", class = "factor")), .Names = c("code", "start", 
        "end", "Start", "End", "variable"), row.names = c(NA, -9L), class = c("cmspans", 
        "cmtime", "cmtime2long", "vname_variable", "data.frame", "spans_1500")
    )
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  2. #2
    Dark Knight
    Points: 6,762, Level: 54
    Level completed: 6%, Points required for next Level: 188
    vinux's Avatar
    Posts
    2,011
    Thanks
    52
    Thanked 241 Times in 205 Posts

    Re: Representing a distance measure with math equation

    Here is the code I wrote based on my understanding. Please let me know if you are looking for something else.


    Code: 
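    # Assumption: ddat is the data frame posted in #1 (named dat there)
    ddat <- dat

    # Label each interval, e.g. "A[159-200]"
    ddat$dur <- paste(ddat$code, "[", ddat$start, "-", ddat$end, "]", sep="")

    datA <- subset(ddat, code == "A")
    datB <- subset(ddat, code == "B")

    # Cross join: every A interval paired with every B interval
    datC <- merge(datA, datB, all = TRUE, by = NULL)

    # Gap between intervals x and y; 0 when they overlap or touch
    gap <- function(start.x, end.x, start.y, end.y) {
      s1 <- start.x - start.y
      s2 <- start.x - end.y
      e1 <- end.x - start.y
      e2 <- end.x - end.y

      # TRUE only when no endpoint of either interval lies inside the other
      nonoverlap <- (s1 * s2 > 0) & (e1 * e2 > 0) & (s1 * e1 > 0) & (s2 * e2 > 0)

      # Smallest endpoint-to-endpoint distance, zeroed for overlapping pairs
      dist <- nonoverlap * apply(abs(cbind(s1, s2, e1, e2)), 1, min)

      return(dist)
    }

    datC$dist <- gap(datC$start.x, datC$end.x, datC$start.y, datC$end.y)

    # A-by-B distance matrix, then the minimum over B for each A
    distM <- xtabs(datC$dist ~ datC$dur.x + datC$dur.y)
    distM
    apply(distM, 1, min)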
    Output
    Code: 
    >  ddat$dur <- paste(ddat$code, "[", ddat$start, "-", ddat$end, "]", sep="")
    >  datA <- subset(ddat, code=='A')
    >  datB <- subset(ddat, code=='B')
    >  datC <- merge(datA, datB, all=T, by=NULL)
    >  gap<-function(start.x, end.x, start.y, end.y){
    +    s1 <- start.x - start.y 
    +    s2 <- start.x - end.y 
    +    e1 <- end.x - start.y
    +    e2 <- end.x - end.y
    +    nonoverlap <- (s1*s2 > 0) & (e1*e2>0) &(s1*e1 > 0) & (s2*e2>0)
    +    dist <- nonoverlap * apply(abs(cbind(s1,s2,e1,e2)),1,min)
    +    return(dist)
    +  }
    >  datC$dist <- gap(datC$start.x, datC$end.x,datC$start.y,datC$end.y)
    >  distM <- xtabs(datC$dist ~ datC$dur.x + datC$dur.y)
    >  distM
                datC$dur.y
    datC$dur.x   B[180-225] B[300-339] B[599-600] B[719-781]
      A[159-200]          0        100        399        519
      A[391-420]        166         52        179        299
      A[539-580]        314        200         19        139
      A[599-660]        374        260          0         59
      A[763-796]        538        424        163          0
    >  apply(distM, 1, min)
    A[159-200] A[391-420] A[539-580] A[599-660] A[763-796] 
             0         52         19          0          0 
    >
    It's a lengthy one and can be optimized further.
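    One possible shortening, offered as a sketch rather than a drop-in replacement: the gap between two intervals is the larger of the two signed endpoint gaps, floored at zero. The helper name interval_gap and the vectors a_start, a_end, b_start, b_end are hypothetical; their values are copied from the data frame in post #1.

    ```r
    # Hypothetical helper (not from the thread): gap between [s1, e1] and
    # [s2, e2]; returns 0 when the intervals overlap or touch.
    interval_gap <- function(s1, e1, s2, e2) {
      pmax(s1 - e2, s2 - e1, 0)
    }

    # Start and end points copied from the data frame in post #1.
    a_start <- c(159, 391, 539, 599, 763)
    a_end   <- c(200, 420, 580, 660, 796)
    b_start <- c(180, 300, 599, 719)
    b_end   <- c(225, 339, 600, 781)

    # 5 x 4 matrix of gaps: rows are A intervals, columns are B intervals.
    gaps <- outer(
      seq_along(a_start), seq_along(b_start),
      function(i, j) interval_gap(a_start[i], a_end[i], b_start[j], b_end[j])
    )

    # Distance from each A interval to its nearest B interval.
    nearest <- apply(gaps, 1, min)
    nearest
    ```

    The row minima should agree with the apply(distM, 1, min) line in the output above, since both approaches assign 0 to overlapping pairs.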
    In the long run, we're all dead.

  3. #3
    TS Contributor
    Points: 22,410, Level: 93
    Level completed: 6%, Points required for next Level: 940

    Posts
    3,020
    Thanks
    12
    Thanked 565 Times in 537 Posts


    Assume you have intervals in the form of [a_i, b_i] in which you already know a_i \leq b_i.

    It seems that you want to define the "distance" between two intervals [a_i, b_i] and [a_j, b_j] to be

    \max\{a_i-b_j, a_j - b_i, 0\}
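
    As a sanity check, here is the formula above applied to three pairs from post #1, writing each A interval as [a_i, b_i] and each B interval as [a_j, b_j]. The symbols d and D are hypothetical labels introduced only for this illustration, and the last line, which takes the minimum over all B intervals, is an assumed extension to match the "nearest B" wording of the original question.

    ```latex
    \begin{align*}
    d\big([391,420],[300,339]\big) &= \max\{391-339,\ 300-420,\ 0\} = \max\{52,\ -120,\ 0\} = 52\\
    d\big([539,580],[599,600]\big) &= \max\{539-600,\ 599-580,\ 0\} = \max\{-61,\ 19,\ 0\} = 19\\
    d\big([159,200],[180,225]\big) &= \max\{159-225,\ 180-200,\ 0\} = \max\{-66,\ -20,\ 0\} = 0\\[4pt]
    D(A_i) &= \min_{j}\ \max\{a_i-b_j,\ a_j-b_i,\ 0\}
    \end{align*}
    ```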

  4. The Following User Says Thank You to BGM For This Useful Post:

    trinker (11-11-2013)

  5. #4
    trinker

    @Vinux, I'm sorry I wasn't clear. I'm looking to write an equation, not to do it in R (I can do it with words and with computer code, but not as an equation).

    @BGM, I think that's it. Thank you. I'm looking at stuff online to learn more now. Thanks.

  6. #5
    trinker

    May I ask: if I want to represent the number of a vectors, can I use n_a, or do I have to use n_{a_i}?

  7. #6
    vinux

    Let A= \{ [As_i, Ae_i],   i \in  I_A\} and B= \{ [Bs_j, Be_j ],  j \in  I_B\}

    For a non-overlapping interval, the distance measure \mu() of [As_i, Ae_i] w.r.t. B is
    \mu([As_i,Ae_i ], B) = \min_j  \{ |As_i - Bs_j|,|As_i - Be_j|,|Ae_i - Bs_j|,|Ae_i - Be_j|   \}

  8. The Following User Says Thank You to vinux For This Useful Post:

    trinker (11-11-2013)

  9. #7
    Devorador de queso
    Points: 95,754, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    Posting AwardCommunity AwardDiscussion EnderFrequent Poster
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,932
    Thanks
    307
    Thanked 2,629 Times in 2,245 Posts


    Not quite. Yours only gives a distance of 0 if any of the endpoints coincide. You could use your notation (although I think the subscripts are a little more complex than necessary), but consider using BGM's already-defined distance.
    I don't have emotions and sometimes that makes me very sad.

  10. #8
    trinker

    Wow, I'm actually getting this math (not sure if I could apply it to a new situation yet). From what I read, Vinu's answer and BGM's are equivalent.

    That is \min_{j}\left \{|A_{s_i} - B_{s_j}|, |A_{s_i} - B_{e_j}|, |A_{e_i} - B_{s_j}|, |A_{e_i} - B_{e_j}|\right \} = \max{\left \{ a_i-b_j,a_j-b_i, 0 \right \}}

    If the intervals are non-overlapping.

  11. #9
    vinux

    @trinker, are you looking for a new measure or a measure that matches
    A1 = overlap = 0
    A2 = |391 - 339| = 52
    A3 = |580 - 599| = 19
    A4 = overlap = 0
    A5 = overlap = 0
    ?

  12. #10
    Dason

    You could try something like this:
    Let B_j be the jth interval in B then define
    B = \bigcup_j B_j

    Now

    \text{dist}(A_i, B) = \inf\{|x - y| \text{ such that } x \in A_i, y \in B\}

    One of the other methods might be better for your audience though?
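
    To make the infimum concrete: for closed intervals like these the infimum is attained, so it is simply the smallest gap between a point of A_i and a point of B. A worked case using the intervals from post #1 (the labels A_1 and A_2 are an assumption that follows the row order of the data):

    ```latex
    \begin{align*}
    B &= [180,225] \cup [300,339] \cup [599,600] \cup [719,781]\\
    \text{dist}(A_2, B) &= \inf\{|x-y| : x \in [391,420],\ y \in B\} = |391 - 339| = 52\\
    \text{dist}(A_1, B) &= 0 \quad \text{since } [159,200] \cap [180,225] \neq \emptyset
    \end{align*}
    ```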

  13. The Following User Says Thank You to Dason For This Useful Post:

    trinker (11-11-2013)

  14. #11
    trinker

    I think the latter, vinu. I was showing what I did and now need to represent it with math notation.

  15. #12
    Dason

    Quote Originally Posted by trinker View Post
    If the intervals are non overlapping.
    If they don't overlap then that's true. But vinux's method gives a positive distance if there is overlap (and the endpoints don't exactly match).

  16. #13
    trinker

    Quote Originally Posted by Dason View Post
    You could try something like this:
    Let B_j be the jth interval in B then define
    B = \bigcup_j B_j

    Now

    \text{dist}(A_i, B) = \inf\{|x - y| \text{ such that } x \in A_i, y \in B\}

    One of the other methods might be better for your audience though?
    Yeah, I no longer get it. I'm guessing BGM's is the most understandable to my audience. I had to look stuff up for vinu's approach, and in Dason's there are symbols I don't know yet (but I will try to learn them by the end of the day).

  17. #14
    vinux

    Quote Originally Posted by vinux View Post
    Let A= \{ [As_i, Ae_i],   i \in  I_A\} and B= \{ [Bs_j, Be_j ],  j \in  I_B\}

    For a non-overlapping interval, the distance measure \mu() of [As_i, Ae_i] w.r.t. B is
    \mu([As_i,Ae_i ], B) = \min_j  \{ |As_i - Bs_j|,|As_i - Be_j|,|Ae_i - Bs_j|,|Ae_i - Be_j|   \}
    I was translating the R code into math language. I have defined the distance for non-overlapping intervals using the terms above.

    For non-overlapping intervals it can be further simplified:

    \mu([As_i,Ae_i ], B) = \min_j  \{ |As_i - Be_j|,|Ae_i - Bs_j|   \}
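
    As a check of the simplified form, take the interval [391, 420] from post #1, which overlaps none of the B intervals [180, 225], [300, 339], [599, 600], [719, 781]:

    ```latex
    \begin{align*}
    \mu([391,420], B) &= \min\{\,|391-225|,\ |420-180|,\ |391-339|,\ |420-300|,\\
    &\qquad\quad |391-600|,\ |420-599|,\ |391-781|,\ |420-719|\,\}\\
    &= \min\{166,\ 240,\ 52,\ 120,\ 209,\ 179,\ 390,\ 299\} = 52
    \end{align*}
    ```

    This agrees with the value of 52 worked out by hand for the second A interval in post #1.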

  18. The Following User Says Thank You to vinux For This Useful Post:

    trinker (11-11-2013)

  19. #15
    trinker


    Now one step further: what if I wanted to say that the beginning of the A code interval must precede the B code? Would this be represented as:

    \max{\left \{ a_i-b_j,a_j-b_i, 0 \right \}}

    Where:

    a_i \geq b_i
    Or is there a better way to show this?
