diff --git a/Notebook.org b/Notebook.org
index 5c47339..8df8c37 100644
--- a/Notebook.org
+++ b/Notebook.org
@@ -1,5 +1,7 @@
 * Biology Meets Programming: Bioinformatics for Beginners
+
 ** Week 1
+
 *** DNA replication
 
 **** Origin of replication (ori)
@@ -95,7 +97,6 @@ def PatternMatching(Pattern, Genome):
 We find out that the /9-mers/ with the highest frequency appear in cluster.
 There is strong statistical evidence that our subsequences are /DnaA boxes/.
 
-
 **** Computational approaches to find ori in any bacteria
 
 Now that we're pretty confident about the /DnaA boxes/ sequences that we found,
@@ -119,6 +120,7 @@ We have to try another computational approach, find clusters of /k-mers/
 repeated in a small interval.
 
 ** Week 2
+
 *** DNA replication (II)
 
 **** Replication process
@@ -257,7 +259,6 @@ def SkewArray(Genome):
     return Skew
 #+END_SRC
 
-
 **** Finding /DnaA boxes/
 
 When we look for /DnaA boxes/ in the minimal skew region,
@@ -383,6 +384,7 @@ def Count(Motifs):
 ***** Exercise: Form the most frequent sequence of nucleotides
 
 Finally, we can form a Consensus string, to get a candidate regulatory motif:
+
 #+BEGIN_SRC python
 def Consensus(Motifs):
     consensus = ""
@@ -493,8 +495,7 @@ def Pr(Text, Profile):
     return probability
 #+END_SRC
 
-Now we're finally ready to assemble all the pieces and implement a Greedy Motif
-Search Algorithm:
+Now we're finally ready to assemble all the pieces and implement a Greedy Motif Search Algorithm:
 
 #+BEGIN_SRC python
 def GreedyMotifSearch(Dna, k, t):
@@ -583,9 +584,7 @@ def Pr(Text, Profile):
 
 ***** Motifs in tuberculosis
 
-Tuberculosis is an infectious disease, caused by a bacteria called /Mycobacterium
-tuberculosis/. The bacteria can stay latent in the host for decades, in hypoxic
-environments.
+Tuberculosis is an infectious disease, caused by a bacteria called /Mycobacterium tuberculosis/. The bacteria can stay latent in the host for decades, in hypoxic environments.
Our Greedy Algorithm can help us identify a motif that might be involved in the process.