Add ReverseComplement function

coolneng 2019-10-21 17:35:30 +02:00
parent 4fa3efbb2b
commit f5609b5577
2 changed files with 30 additions and 9 deletions

Code/ReverseComplement.py Normal file

@ -0,0 +1,17 @@
def ReverseComplement(Pattern):
    Pattern = Reverse(Pattern)
    Pattern = Complement(Pattern)
    return Pattern


def Reverse(Pattern):
    reversed = Pattern[::-1]
    return reversed


def Complement(Pattern):
    compl = ""
    complement_letters = {"A": "T", "T": "A", "C": "G", "G": "C"}
    for char in Pattern:
        compl += complement_letters[char]
    return compl


@ -8,17 +8,21 @@
Locating an ori is key for gene therapy (e.g. viral vectors), to introduce a therapeutic gene.
**** Exercise: computational approach to find ori in bacteria
**** Exercises: computational approaches to find ori in Vibrio cholerae
We'll look for the *DnaA box* sequence using a sliding window; in that case our code would be the following:
***** Exercise: find Pattern
We'll look for the *DnaA box* sequence using a sliding window. We will use the function [[./Code/Replication.py][Replication]] to count how many times
a sequence appears in the genome.
For the second part, we're going to calculate the frequency map of the sequences of length /k/. For that purpose we'll use [[./Code/FrequentWords.py][FrequentWords]].
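As a rough illustration of the frequency-map idea, the sketch below counts every k-mer with a sliding window and then keeps the most frequent ones. It is not the content of FrequentWords.py, which this commit does not show; the names frequency_map and frequent_words are hypothetical and only used here.

```python
from collections import Counter


def frequency_map(text, k):
    """Count every k-mer in text using a sliding window of width k."""
    # Hypothetical helper: one window per start position 0 .. len(text) - k.
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def frequent_words(text, k):
    """Return the k-mers with the highest count, sorted alphabetically."""
    counts = frequency_map(text, k)
    if not counts:
        # text is shorter than k, so there is no window to count.
        return []
    top = max(counts.values())
    return sorted(kmer for kmer, n in counts.items() if n == top)
```

For instance, frequent_words(Text, 9) would list the most frequent 9-mers of a sequence stored in Text. Using a Counter keeps the whole map available, so less frequent k-mers can still be inspected afterwards.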
***** Exercise: Find the reverse complement of a sequence
We're going to generate the reverse complement of a sequence: the complementary strand read in the same direction (5' -> 3'), obtained by replacing each nucleotide with its complement and reversing the result.
In this case, we're going to use [[./Code/ReverseComplement.py][ReverseComplement]]
After using our function on the Vibrio cholerae genome, we realize that some of the frequent k-mers are reverse complements of other frequent ones.
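One way to check which k-mers in a list are reverse complements of each other is sketched below. It is an illustrative variant, not the code in ReverseComplement.py: it builds the reverse complement with str.translate, it assumes sequences contain only upper-case A, C, G and T, and the names reverse_complement and reverse_complement_pairs are hypothetical.

```python
# Translation table mapping each nucleotide to its complement.
# Assumption: sequences contain only upper-case A, C, G and T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(pattern):
    """Complement every nucleotide, then reverse the result."""
    return pattern.translate(COMPLEMENT)[::-1]


def reverse_complement_pairs(kmers):
    """Return sorted (kmer, reverse complement) pairs where both are in kmers.

    A k-mer that is its own reverse complement is paired with itself.
    """
    seen = set(kmers)
    pairs = set()
    for kmer in seen:
        rc = reverse_complement(kmer)
        if rc in seen:
            # Sort the two strings so each pair is recorded only once.
            pairs.add(tuple(sorted((kmer, rc))))
    return sorted(pairs)
```

Passing the list of frequent k-mers to reverse_complement_pairs would return only those whose reverse complement is also in the list.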
#+begin_src python
count = 0
for i in range(len(Text)-len(Pattern)+1):
    if Text[i:i+len(Pattern)] == Pattern:
        count = count+1
print(Pattern + ": " + str(count))
#+end_src
*** Vocabulary
- k-mer: a subsequence of length /k/ in a biological sequence