Add PatternMatching function

coolneng 2019-10-21 17:35:44 +02:00
parent f5609b5577
commit dc740e1c54
2 changed files with 13 additions and 1 deletions

Code/PatternMatching.py Normal file

@ -0,0 +1,6 @@
def PatternMatching(Pattern, Genome):
    positions = []
    for i in range(len(Genome)-len(Pattern)+1):
        if Genome[i:i+len(Pattern)] == Pattern:
            positions.append(i)
    return positions


@ -21,8 +21,14 @@
We're going to generate the reverse complement of a sequence, which is the complement of a sequence, read in the same direction (5' -> 3').
In this case, we're going to use [[./Code/ReverseComplement.py][ReverseComplement]]
After using our function on the Vibrio's Cholerae genome, we realize that some of the frequent k-mers are reverse complements of other frequent ones.
After using our function on the Vibrio cholerae genome, we find that some of the frequent k-mers are reverse complements of other frequent ones.
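The contents of ReverseComplement.py are not shown here, so the following is only a minimal sketch of the idea, assuming an uppercase DNA string over A, C, G, T. The names =COMPLEMENT= and =reverse_complement= are hypothetical and may not match the repository's function.

```python
# Hypothetical sketch; not the repository's ReverseComplement.py.
# Assumes the input only contains the uppercase bases A, C, G and T.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(sequence):
    # Walk the strand backwards and complement each base, so that the
    # result is the complementary strand written 5' -> 3'.
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


print(reverse_complement("ATGATCAAG"))  # CTTGATCAT
```

A base outside A, C, G, T raises a =KeyError= in this sketch, which makes unexpected input visible instead of silently passing it through.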
***** Exercise: Find a subsequence within a sequence
We're going to find the occurrences of a subsequence inside a sequence, and save the index at which each occurrence starts.
This time, we'll use [[./Code/PatternMatching.py][PatternMatching]]
After using our function on the Vibrio cholerae genome, we find out that the /9-mers/ with the highest frequency appear in clusters.
This is strong statistical evidence that our subsequences are /DnaA boxes/.
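As an illustration of the search described above, here is a small sketch that collects every start index, including overlapping occurrences. It is an alternative written with =str.find= rather than the slicing loop in PatternMatching.py; the name =pattern_matching= and the sample strings are assumptions made for this example.

```python
# Hypothetical alternative to PatternMatching.py, built on str.find.
def pattern_matching(pattern, genome):
    positions = []
    start = genome.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume one position later so overlapping matches are also found.
        start = genome.find(pattern, start + 1)
    return positions


# Indices are 0-based; the matches at 1 and 3 overlap.
print(pattern_matching("ATAT", "GATATATGCATATACTT"))  # [1, 3, 9]
```

Restarting the search at =start + 1= instead of after the end of the match is what keeps overlapping occurrences, which matters when looking for clustered k-mers.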
*** Vocabulary
- k-mer: a subsequence of length /k/ in a biological sequence