Add PatternMatching function
parent f5609b5577
commit dc740e1c54
6
Code/PatternMatching.py
Normal file
@@ -0,0 +1,6 @@
def PatternMatching(Pattern, Genome):
    positions = []
    for i in range(len(Genome)-len(Pattern)+1):
        if Genome[i:i+len(Pattern)] == Pattern:
            positions.append(i)
    return positions
@@ -21,8 +21,14 @@
We're going to generate the reverse complement of a sequence: the complementary strand read in the same direction (5' -> 3'), which amounts to complementing each base and reversing the result.
In this case, we're going to use [[./Code/ReverseComplement.py][ReverseComplement]].
-After using our function on the Vibrio's Cholerae genome, we realize that some of the frequent k-mers are reverse complements of other frequent ones.
+After using our function on the /Vibrio cholerae/ genome, we realize that some of the frequent k-mers are reverse complements of other frequent ones.
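The linked ReverseComplement file is not shown here, so the following is only a minimal sketch of the idea, assuming the sequence contains nothing but the letters A, C, G and T. The name =reverse_complement= is hypothetical and may differ from the linked function.

```python
def reverse_complement(sequence):
    """Return the reverse complement of a DNA string (assumes only A, C, G, T)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    # Walk the sequence backwards and swap each base for its partner.
    return "".join(complement[base] for base in reversed(sequence))


print(reverse_complement("AAAACCCGGT"))  # ACCGGGTTTT

# Two 9-mers are reverse complements when one maps onto the other.
print(reverse_complement("ATGATCAAG") == "CTTGATCAT")  # True
```

A lookup table keeps the mapping explicit; an unexpected letter raises a =KeyError= instead of passing through silently.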
***** Exercise: Find a subsequence within a sequence
We're going to find every occurrence of a subsequence inside a sequence, and record the index at which each occurrence starts.
This time, we'll use [[./Code/PatternMatching.py][PatternMatching]].
After using our function on the /Vibrio cholerae/ genome, we find that the /9-mers/ with the highest frequency appear in clusters.
This is strong statistical evidence that these subsequences are /DnaA boxes/.
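As a rough illustration of what "appearing in clusters" means, the sketch below runs PatternMatching on a short toy string (not real genome data) and looks at the distance between consecutive start positions. The variable names and the gap calculation are assumptions added for this example; they are not part of the commit.

```python
def PatternMatching(Pattern, Genome):
    """Return every 0-based start index where Pattern occurs in Genome."""
    positions = []
    for i in range(len(Genome) - len(Pattern) + 1):
        if Genome[i:i + len(Pattern)] == Pattern:
            positions.append(i)
    return positions


# Toy sequence for illustration only, not real genome data.
genome = "GATATATGCATATACTT"
positions = PatternMatching("ATAT", genome)
print(positions)  # [1, 3, 9]

# Distance between consecutive start positions; small gaps hint at a cluster.
gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
print(gaps)  # [2, 6]
```

Overlapping occurrences are reported (positions 1 and 3 above), because the window advances one letter at a time.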
*** Vocabulary
- k-mer: a subsequence of length /k/ in a biological sequence
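
To make the definition concrete, here is a small sketch that lists every k-mer of a string by sliding a window of length /k/ along it. The helper name =kmers= is hypothetical and does not come from the repository.

```python
def kmers(sequence, k):
    """Return all substrings of length k, in order of their start position."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


print(kmers("ATGCA", 3))  # ['ATG', 'TGC', 'GCA']
```

A sequence of length /n/ has /n - k + 1/ such windows, which is the same loop bound PatternMatching uses.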