
Thread: Calculating the probability of finding pairs of n-letter substrings in DNA

  1. #1

    Probability of finding pairs of substrings in DNA


    I've been struggling a bit with this problem:

    Suppose I have a DNA sequence over the letters A, T, G, C with a length of, say, 3571 letters. What is the probability that it contains the sequence AAAA (a 4-mer) at two positions spaced s letters apart, where s is one of 10, 50, 100, or 200? Occurrences may not overlap, so AAAAAA contains only 1 occurrence of AAAA and AAAAAAAA contains 2. I also allow a tolerance of +/- two letters for each occurrence, but I've left that out of my attempt.

    I've attached my attempt, but I don't know how to calculate the probability of the second pattern, which can occur 10, 50, 100, or 200 letters later.
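
    A Monte Carlo estimate is one way to sanity-check whatever closed-form answer comes out of this. The sketch below is only an illustration under several assumptions that are not stated in the problem: the four bases are independent and equally likely, the spacing s is measured from the start of one AAAA to the start of the next, non-overlapping occurrences are found by a greedy left-to-right scan, and the tolerance is applied to the spacing. The function names are hypothetical.

```python
import random


def nonoverlapping_starts(seq, motif):
    """Start indices of non-overlapping motif occurrences (greedy, left to right)."""
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i)
        # Resume the search after the end of this occurrence, so that
        # AAAAAA yields one hit and AAAAAAAA yields two.
        i = seq.find(motif, i + len(motif))
    return starts


def has_spaced_pair(seq, motif, s, tol=0):
    """True if two occurrences start s letters apart, give or take tol.

    Assumption: s is a start-to-start distance.
    """
    starts = nonoverlapping_starts(seq, motif)
    start_set = set(starts)
    return any(
        a + s + d in start_set
        for a in starts
        for d in range(-tol, tol + 1)
    )


def estimate_probability(length, motif, s, tol=0, trials=2000, seed=0):
    """Fraction of random sequences that contain a spaced pair.

    Assumption: bases are i.i.d. and uniform over A, C, G, T.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choices("ACGT", k=length))
        hits += has_spaced_pair(seq, motif, s, tol)
    return hits / trials
```

    Calling `estimate_probability(3571, "AAAA", s, tol=2)` for each s in 10, 50, 100, 200 gives an empirical value to compare against; more trials narrow the sampling error at the cost of run time.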

    Any help would be appreciated :-)

    Best wishes,
    Attached Images  
    Last edited by James1984; 12-21-2016 at 10:51 PM. Reason: more succinct title
