# weighting of aligned subsequences by their length

#### tevang

##### New Member
I have a pool of subsequences, each associated with a probability (and with other properties that are not relevant here) that quantifies how frequently that subsequence occurs. To select the subsequences that may be part of a longer sequence (the template sequence), I align them to the template as shown below:
Code:
SLQQWKVGDKCSA  :template sequence
 LQQ           :sA, probA, other properties
  QQWKV        :sB, probB, other properties
SLQQ           :sC, probC, other properties
      VGDKCSA  :sD, probD, other properties
Each of the subsequences sA, sB, sC and sD is associated with a probability (probA, probB, probC, probD). I now want to weight these probabilities by the length of the subsequence. For instance, sA is shorter than sB and so should get a lower weight, because the shorter the subsequence, the higher the chance of a false positive (that is, of it aligning to that part of the template sequence purely by chance).

The reason I am doing this is to select a set of representative properties for each element (column) of the template sequence. So any ideas for weighting schemes would be very welcome!
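
To make the setup concrete, here is a minimal Python sketch of one possible scheme, not a recommended or validated one. It assumes a uniform 20-letter alphabet, ungapped exact matches, and independence between placements (all simplifications), and it down-weights each probability by the chance that a random subsequence of the same length matches somewhere on the template. The function names and the probability values in `hits` are hypothetical placeholders standing in for probA to probD.

```python
ALPHABET_SIZE = 20  # assumption: 20 equally likely residues


def chance_match_prob(sub_len, template_len, alphabet_size=ALPHABET_SIZE):
    """Approximate probability that a random subsequence of length sub_len
    matches the template exactly at one or more ungapped positions."""
    positions = template_len - sub_len + 1
    if positions <= 0:
        return 0.0
    p_single = alphabet_size ** (-sub_len)  # exact match at one fixed position
    # Treats the placements as independent, which is only an approximation.
    return 1.0 - (1.0 - p_single) ** positions


def length_weight(sub_len, template_len):
    """Weight in [0, 1]; shorter subsequences get a lower weight."""
    return 1.0 - chance_match_prob(sub_len, template_len)


def column_weights(template, hits):
    """hits: list of (name, subsequence, offset, prob).
    Returns, for each template column, a list of (name, weighted_prob)."""
    columns = [[] for _ in template]
    for name, sub, offset, prob in hits:
        # Sanity check: the subsequence really aligns at the given offset.
        assert template[offset:offset + len(sub)] == sub
        weighted = prob * length_weight(len(sub), len(template))
        for i in range(offset, offset + len(sub)):
            columns[i].append((name, weighted))
    return columns


template = "SLQQWKVGDKCSA"
# Probabilities below are made-up placeholders, not real data.
hits = [
    ("sA", "LQQ", 1, 0.5),
    ("sB", "QQWKV", 2, 0.5),
    ("sC", "SLQQ", 0, 0.5),
    ("sD", "VGDKCSA", 6, 0.5),
]
columns = column_weights(template, hits)
for residue, entries in zip(template, columns):
    print(residue, entries)
```

With a template this short the correction is tiny, since even a 3-residue match is unlikely by chance at only 11 positions. The penalty would become more noticeable for longer templates, or if the chance term were scaled by the size of the subsequence pool, which is a further assumption not covered here.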