Let's say I have a sequence X of N non-integer values and another sequence Y of M non-integer values, where N >> M. I slide Y along X and use a metric (say MSE) to find the closest alignment of X and Y. The closest alignment gives me a score S. I would like to estimate the probability that this alignment is the correct one (i.e. that it is a true match). Using Bayes' rule:
P(match|S) = P(S|match) P(match) / [P(S|match) P(match) + P(S|no match) P(no match)]
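The Bayes' rule computation for a single score S can be sketched as below. This is only a minimal illustration: the function name `posterior_match` and the numeric likelihood values are hypothetical, and the prior is left as a parameter precisely because its value is the open question.

```python
def posterior_match(lik_match, lik_no_match, prior_match):
    """Return P(match | S) given the two likelihoods evaluated at S.

    lik_match    -- P(S | match), e.g. read off a density fitted to synthesized data
    lik_no_match -- P(S | no match), obtained the same way
    prior_match  -- P(match); P(no match) is taken as 1 - prior_match
    """
    if not 0.0 <= prior_match <= 1.0:
        raise ValueError("prior_match must lie in [0, 1]")
    numerator = lik_match * prior_match
    evidence = numerator + lik_no_match * (1.0 - prior_match)
    if evidence == 0.0:
        raise ValueError("both likelihoods are zero at this score")
    return numerator / evidence


# Hypothetical numbers, for illustration only.
N, M = 10_000, 100
uniform_prior = 1.0 / (N - M + 1)  # one true shift among N - M + 1 candidates
print(posterior_match(0.9, 0.001, uniform_prior))
print(posterior_match(0.9, 0.001, 0.5))  # same likelihoods, even prior
```

Running it with the same likelihoods and two different priors shows how strongly the result depends on P(match) when the likelihood ratio is only moderate.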
I have estimated P(S|match) and P(S|no match) from synthesized data. The issue is how to determine P(match) and P(no match). If I use the fact that there is only one true alignment and treat every possible shift as equally likely, then P(match) is extremely small (1/(N-M+1), since there are N-M+1 possible shifts) and the metric begins to appear very ineffective. I could group the shifts and come up with a probability of a match "within Z shifts", but this seems very arbitrary. Another issue that will affect P(match) and P(no match) is that both sequences (X and Y) contain some amount of serial correlation. At this point, I'm not sure how to estimate that correlation or how to account for it.
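For the serial correlation, one standard way to quantify it is the sample autocorrelation at a given lag. The sketch below is plain Python with a hypothetical function name (`sample_autocorr`). It only measures the correlation; it does not by itself say how the prior should be adjusted.

```python
def sample_autocorr(x, lag):
    """Sample autocorrelation of sequence x at the given non-negative lag.

    Uses the common estimator that divides the lagged sum of products of
    deviations by the total sum of squared deviations.
    """
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    total = sum((v - mean) ** 2 for v in x)
    if total == 0:
        raise ValueError("sequence is constant; autocorrelation is undefined")
    lagged = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return lagged / total


# Hypothetical data: a slowly varying sequence has high lag-1 autocorrelation.
x = [0.1, 0.3, 0.4, 0.6, 0.7, 0.6, 0.4, 0.3, 0.1, 0.0]
print([round(sample_autocorr(x, k), 3) for k in range(4)])
```

If the autocorrelation decays slowly, scores at neighbouring shifts will be far from independent, which is one reason treating all shifts as equally likely independent alternatives may be too pessimistic.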
I have searched the web for possible solutions, but I haven't found any. I appreciate any comments.