# Thread: Representing a distance measure with math equation

1. ## Representing a distance measure with math equation

I'm working on presenting a distance measure at a conference but don't have a publication-quality math formula for representing what I'm doing in R. Below I have the problem laid out in its simplest form and would like help representing it as a mathematical expression of the type expected for publication (i.e., you can use LaTeX to show it).

I have data with two codes that run along intervals across time, as seen below in the data frame and represented in the Gantt plot as well:

Code:
  code start end    Start      End variable
1    A   159 200 00:02:39 00:03:20        x
2    A   391 420 00:06:31 00:07:00        x
3    A   539 580 00:08:59 00:09:40        x
4    A   599 660 00:09:59 00:11:00        x
5    A   763 796 00:12:43 00:13:16        x
6    B   180 225 00:03:00 00:03:45        x
7    B   300 339 00:05:00 00:05:39        x
8    B   599 600 00:09:59 00:10:00        x
9    B   719 781 00:11:59 00:13:01        x

For every A interval (any location along A) we will find the nearest B interval (any location along B) and calculate the distance between them.

So...

Code:
A1 = overlap     =  0
A2 = |391 - 339| = 52
A3 = |580 - 599| = 19
A4 = overlap     =  0
A5 = overlap     =  0
In this related post Bryan discusses interval arithmetic algebra, but I'm not sure how to write, in a journal-appropriate way, that for every A code I want the minimum distance to the nearest B code.

I'm thinking something like:

The data:

Code:
dat <- structure(list(code = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L), .Label = c("A", "B"), class = "factor"), start = c(159,
391, 539, 599, 763, 180, 300, 599, 719), end = c(200, 420, 580,
660, 796, 225, 339, 600, 781), Start = structure(c(0.00184027777777778,
0.00452546296296296, 0.00623842592592593, 0.00693287037037037,
0.00883101851851852, 0.00208333333333333, 0.00347222222222222,
0.00693287037037037, 0.00832175925925926), format = "h:m:s", class = "times"),
End = structure(c(0.002314815, 0.00486111111111111,
0.00671296296296296, 0.00763888888888889, 0.00921296296296296,
0.00260416666666667, 0.00392361111111111, 0.00694444444444444,
0.00903935185185185), format = "h:m:s", class = "times"),
variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = "x", class = "factor")), .Names = c("code", "start",
"end", "Start", "End", "variable"), row.names = c(NA, -9L), class = c("cmspans",
"cmtime", "cmtime2long", "vname_variable", "data.frame", "spans_1500")
)

2. ## Re: Representing a distance measure with math equation

I wrote this code based on my understanding of the problem. Please let me know if you are looking for something else.

Code:
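# Assumption: ddat is a working copy of the data frame dat from the first post.
ddat <- dat

# Label each interval, e.g. "A[159-200]".
ddat$dur <- paste(ddat$code, "[", ddat$start, "-", ddat$end, "]", sep = "")

datA <- subset(ddat, code == "A")
datB <- subset(ddat, code == "B")

# Cross join: every A interval is paired with every B interval.
datC <- merge(datA, datB, all = TRUE, by = NULL)

# Distance between intervals x and y: 0 if they overlap or touch,
# otherwise the smallest absolute difference between their endpoints.
gap <- function(start.x, end.x, start.y, end.y) {
  s1 <- start.x - start.y
  s2 <- start.x - end.y
  e1 <- end.x - start.y
  e2 <- end.x - end.y

  # TRUE only when no endpoint of either interval lies inside the other.
  nonoverlap <- (s1 * s2 > 0) & (e1 * e2 > 0) & (s1 * e1 > 0) & (s2 * e2 > 0)

  dist <- nonoverlap * apply(abs(cbind(s1, s2, e1, e2)), 1, min)
  return(dist)
}

datC$dist <- gap(datC$start.x, datC$end.x, datC$start.y, datC$end.y)
distM <- xtabs(datC$dist ~ datC$dur.x + datC$dur.y)
distM
apply(distM, 1, min)  # nearest-B distance for each A interval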
Output
Code:
>  ddat$dur <- paste(ddat$code, "[", ddat$start, "-", ddat$end, "]", sep="")
>  datA <- subset(ddat, code=='A')
>  datB <- subset(ddat, code=='B')
>  datC <- merge(datA, datB, all=T, by=NULL)
>  gap<-function(start.x, end.x, start.y, end.y){
+    s1 <- start.x - start.y
+    s2 <- start.x - end.y
+    e1 <- end.x - start.y
+    e2 <- end.x - end.y
+    nonoverlap <- (s1*s2 > 0) & (e1*e2>0) &(s1*e1 > 0) & (s2*e2>0)
+    dist <- nonoverlap * apply(abs(cbind(s1,s2,e1,e2)),1,min)
+    return(dist)
+  }
>  datC$dist <- gap(datC$start.x, datC$end.x,datC$start.y,datC$end.y)
>  distM <- xtabs(datC$dist ~ datC$dur.x + datC$dur.y)
>  distM
            datC$dur.y
datC$dur.x   B[180-225] B[300-339] B[599-600] B[719-781]
  A[159-200]          0        100        399        519
  A[391-420]        166         52        179        299
  A[539-580]        314        200         19        139
  A[599-660]        374        260          0         59
  A[763-796]        538        424        163          0
>  apply(distM, 1, min)
A[159-200] A[391-420] A[539-580] A[599-660] A[763-796]
         0         52         19          0          0
It's a lengthy solution and can be optimized further.
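As a rough sketch of what a shorter version might look like, the same nearest-B distances can be computed with `outer()` and a clamped gap. The helper name `interval_gap` and the plain start/end vectors are assumptions made for illustration (the values are copied from the data frame in the first post); this is not the code from the post above.

```r
# Start/end points copied from the data frame in the first post.
a_start <- c(159, 391, 539, 599, 763)
a_end   <- c(200, 420, 580, 660, 796)
b_start <- c(180, 300, 599, 719)
b_end   <- c(225, 339, 600, 781)

# interval_gap() is a hypothetical helper, not from the thread:
# gap between [s1, e1] and [s2, e2], clamped to 0 when they overlap or touch.
interval_gap <- function(s1, e1, s2, e2) pmax(0, s2 - e1, s1 - e2)

# Rows index the A intervals, columns index the B intervals.
gap_matrix <- outer(
  seq_along(a_start), seq_along(b_start),
  function(i, j) interval_gap(a_start[i], a_end[i], b_start[j], b_end[j])
)

# Nearest-B distance for each A interval.
nearest_b <- apply(gap_matrix, 1, min)
nearest_b
```

For the example data, `nearest_b` should agree with the `apply(distM, 1, min)` result shown in the output above.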

3. ## Re: Representing a distance measure with math equation

Assume you have intervals in the form of [formula missing], in which you already know [formula missing].

It seems that you want to define the "distance" between two intervals [formula missing] and [formula missing] to be [formula missing].
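The formula in this post did not survive in this copy of the thread. As a hedged sketch, one standard way to write a distance that is zero for overlapping intervals and equal to the gap otherwise is given below. The symbols $I_1 = [a_1, b_1]$, $I_2 = [a_2, b_2]$, $d$ and $D$ are assumed notation and not necessarily what BGM wrote.

```latex
% Assumed notation: I_1 = [a_1, b_1] and I_2 = [a_2, b_2], with a_k \le b_k.
\[
  d(I_1, I_2) = \max\{\, 0,\; a_2 - b_1,\; a_1 - b_2 \,\}
\]
% Distance from the i-th A interval to its nearest B interval.
\[
  D(A_i) = \min_{j} \, d(A_i, B_j)
\]
```

For example, with $A_2 = [391, 420]$ and $B_2 = [300, 339]$ this gives $\max\{0,\ 300 - 420,\ 391 - 339\} = 52$, which matches the value for A2 in the first post.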

4. ## The Following User Says Thank You to BGM For This Useful Post:

trinker (11-11-2013)

5. ## Re: Representing a distance measure with math equation

@Vinux, I'm sorry I wasn't clear. I'm looking to write an equation, not to do it in R (I can do it with words and with computer code, but not as an equation).

@BGM, I think that's it. Thank you. I'm looking at stuff online to learn more now. Thanks.

6. ## Re: Representing a distance measure with math equation

May I ask: if I want to represent the number of a vectors, can I use [formula missing], or do I have to use [formula missing]?

7. ## Re: Representing a distance measure with math equation

Let [formula missing] and [formula missing].

For a non-overlapping interval, the distance measure of A[s_i, e_i] w.r.t. B is [formula missing].

8. ## The Following User Says Thank You to vinux For This Useful Post:

trinker (11-11-2013)

9. ## Re: Representing a distance measure with math equation

Not quite. Yours only gives a distance of 0 if any of the endpoints coincide. You could use your notation (although I think it's a little more complex with the subscripts than necessary), but consider using BGM's already-defined distance.
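A small numeric check may make this point concrete. The two functions below are assumed readings of the competing definitions (the formulas themselves are not preserved in this copy of the thread): `endpoint_min` takes the smallest absolute difference between endpoints, and `clamped_gap` returns 0 whenever the intervals overlap. Both names are hypothetical.

```r
# Assumed reading of the endpoint-based definition:
# smallest absolute difference between any pair of endpoints.
endpoint_min <- function(s1, e1, s2, e2) {
  min(abs(s1 - s2), abs(s1 - e2), abs(e1 - s2), abs(e1 - e2))
}

# Assumed reading of the overlap-aware definition:
# the gap between the intervals, clamped to 0 when they overlap.
clamped_gap <- function(s1, e1, s2, e2) max(0, s2 - e1, s1 - e2)

# A1 = [159, 200] and B1 = [180, 225] overlap, but no endpoints coincide.
endpoint_min(159, 200, 180, 225)  # 20
clamped_gap(159, 200, 180, 225)   # 0

# A2 = [391, 420] and B2 = [300, 339] are disjoint: both give the gap.
endpoint_min(391, 420, 300, 339)  # 52
clamped_gap(391, 420, 300, 339)   # 52
```

The two definitions agree on disjoint intervals and differ only when the intervals overlap without sharing an endpoint.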

10. ## Re: Representing a distance measure with math equation

Wow, I'm actually getting this math (not sure if I could apply it to a new situation yet). From what I read, Vinu's answer and BGM's are equivalent.

That is

[formula missing]

if the intervals are non-overlapping.
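The equality claimed here is not preserved in this copy of the thread. As a sketch under assumed notation ($I_1 = [a_1, b_1]$ lying entirely to the left of $I_2 = [a_2, b_2]$), the argument for the disjoint case could run as follows. It compares an endpoint-based minimum with a clamped gap; both forms are assumed readings of the two answers, and `align*` requires the `amsmath` package.

```latex
% Assumed notation: I_1 = [a_1, b_1], I_2 = [a_2, b_2], a_k \le b_k.
% Disjoint case with I_1 to the left of I_2: a_1 \le b_1 < a_2 \le b_2.
\begin{align*}
  |b_1 - a_2| &= a_2 - b_1, \\
  |a_1 - a_2| &= a_2 - a_1 \ge a_2 - b_1, \\
  |b_1 - b_2| &= b_2 - b_1 \ge a_2 - b_1, \\
  |a_1 - b_2| &= b_2 - a_1 \ge a_2 - b_1,
\end{align*}
% so the smallest endpoint difference is the gap a_2 - b_1.
% Since a_1 - b_2 < 0 < a_2 - b_1, the clamped form gives the same value:
\[
  \min\{|a_1 - a_2|,\, |a_1 - b_2|,\, |b_1 - a_2|,\, |b_1 - b_2|\}
  = a_2 - b_1
  = \max\{0,\; a_2 - b_1,\; a_1 - b_2\}.
\]
```

The mirror-image case, with $I_2$ to the left of $I_1$, follows by swapping the roles of the two intervals.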

11. ## Re: Representing a distance measure with math equation

@trinker, are you looking for a new measure or a measure that matches
A1 = overlap = 0
A2 = |391 - 339| = 52
A3 = |580 - 599| = 19
A4 = overlap = 0
A5 = overlap = 0
?

12. ## Re: Representing a distance measure with math equation

You could try something like this:
Let [formula missing] be the jth interval in B, then define [formula missing].

Now [formula missing].

One of the other methods might be better for your audience though?
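The definitions in this post are missing from this copy of the thread. For readers unfamiliar with set-based notation, a common way to write a point-to-set distance and build an interval distance from it is sketched below. The symbols and the use of $\inf$ are assumptions and may differ from what Dason actually wrote.

```latex
% Assumed notation: A_i and B_j are closed intervals, regarded as sets of points.
% Distance from a single point x to the interval B_j:
\[
  d(x, B_j) = \inf_{y \in B_j} |x - y|
\]
% Distance between the intervals A_i and B_j (0 whenever they share a point):
\[
  d(A_i, B_j) = \inf_{x \in A_i} d(x, B_j)
\]
% Distance from A_i to its nearest B interval:
\[
  D(A_i) = \min_{j} \, d(A_i, B_j)
\]
```

Here $\inf$ (infimum) is the greatest lower bound; for closed, bounded intervals it is attained, so it can be read as a minimum.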

13. ## The Following User Says Thank You to Dason For This Useful Post:

trinker (11-11-2013)

14. ## Re: Representing a distance measure with math equation

I think the latter, vinu. I was showing what I did and now need to represent it with math notation.

15. ## Re: Representing a distance measure with math equation

Originally Posted by trinker
If the intervals are non overlapping.
If they don't overlap, then that's true. But vinux's method gives a positive distance if there is overlap (and the endpoints don't exactly match).

16. ## Re: Representing a distance measure with math equation

Originally Posted by Dason
You could try something like this:
Let [formula missing] be the jth interval in B, then define [formula missing].

Now [formula missing].

One of the other methods might be better for your audience though?
Yeah, I no longer get it. I'm guessing BGM's is the most understandable to my audience. I had to look stuff up for vinu's approach, and in Dason's there are symbols I don't know yet (but I will try to learn them by the end of the day).

17. ## Re: Representing a distance measure with math equation

Originally Posted by vinux
Let [formula missing] and [formula missing].

For a non-overlapping interval, the distance measure of A[s_i, e_i] w.r.t. B is [formula missing].
I was translating the R code into math language. I have defined the non-overlapping intervals using the terms above.

For non-overlapping intervals it can be further simplified: [formula missing]

18. ## The Following User Says Thank You to vinux For This Useful Post:

trinker (11-11-2013)

19. ## Re: Representing a distance measure with math equation

Now one step further: what if I wanted to say that the beginning of the A code interval must precede the B code? Would this be represented as: [formula missing]

Where: [formula missing]

Or is there a better way to show this?
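One possible way to express such a precedence constraint, offered only as a sketch, is to restrict the minimum to an index set of admissible B intervals. The notation below ($s_i^A$, $s_j^B$, $J_i$) is assumed, and whether the inequality should be strict depends on what "precede" is meant to include.

```latex
% Assumed notation: A_i = [s_i^A, e_i^A], B_j = [s_j^B, e_j^B];
% d(A_i, B_j) is whichever interval distance from this thread is adopted.
% J_i collects the B intervals that start no earlier than A_i starts.
\[
  J_i = \{\, j : s_i^A \le s_j^B \,\}, \qquad
  D(A_i) = \min_{j \in J_i} d(A_i, B_j)
\]
% D(A_i) is left undefined (or set to \infty) when J_i is empty.
```

With the example data, the fifth A interval starts at 763, after every B interval has started, so $J_5$ would be empty and a convention for that case is needed.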