Difference between revisions of "Davidson Missouri W/Mathematical Modeling"

From GcatWiki
Jump to: navigation, search
Line 5: Line 5:
 
=Markov Chain Model For Flipping=
 
=Markov Chain Model For Flipping=
  
One of the first projects we worked on was to develop a mathematical model of graphs for the pancake problem. These graphs were used to find the percent of plasmids that solved problem based on the number of flips. We also developed graphs with biases to see if the length of the flips effect the probability of making the flip. In some of the graphs we saw bouncing because from a starting point you can only get to a solution by an odd number of flips or an even number of flips. If you get to a point on an even number of flips you can never get to an odd number. When the graph shows convergence, regardless whether the number of flips is even or odd you will have a chance to get to one of the solutions. The convergence occurs at .25 because there are 2/8 chances of getting to the solution after a high number of flips which reduces to ¼. We also did graphs using bias wieghts where it takes more flips for the graph to have the bouncing behavior. The biologist suspect that the length of the flips effects the probability of making the flips. Then you would compare mathematical graph to the biologist graph to see the bias.  
+
'''Does starting orientation matter?'''
 +
 
 +
Question: Will the order in which we place the edges in our E.coli computers affect the probability of us detecting a Hamiltonian Path?
 +
 
 +
Our mathematical reasoning relies on the fact that the Hin/HixC DNA recombination system makes a large number of flips and for any one flip there is some element of chance that dictates which Hix sites are chosen to flip.  
 +
 
 +
Answer: We developed a Markov Chain model that takes every signed permutation of edges as a state and the connections between these states are found and categorized by our Matlab programs by both the number of edges that need to be flipped to move between those states and categorized by which specific edge needed to be flipped. Further, our Matlab programs creates a transition matrix based on the weights a user enters to bias the probabilities of making different kinds of flips.
 +
 
 +
Once this transition matrix is generated we then compute, for each possible starting state, the probability of that starting state transitioning to any of the solved states after x flips given the biasing of different types of flips. When our Markov Chain model is run for more than 2 edges we always observe a convergence of these probabilities no matter what bias we apply. This shows that starting orientation does not matter since the probability of being solved after a large number of flips is the same for each starting orientation. 
 +
 
 +
This begs the question: what is a large number of flips? Mostly we observe that this convergence occurs in the first 20 flips for three edges and only for very extreme probabilities does this convergence occur after 60 flips. However, as the number of edges increase we can expect this convergence to occur later. This does not appear to be a problem. Our flipping mechanism flips edges very quickly, so quickly that it is hard to measure. Additionally, the diameter of our transition diagrams increases linearly with the number of edges so there is no reason to believe that the number of flips to convergence would increase at more than a linear pace.  
 +
 
  
 
MATLAB programs that we developed using Markov Chains
 
MATLAB programs that we developed using Markov Chains

Revision as of 16:41, 8 August 2007

Home | Background Information | Current Project: Solving the Hamiltonian Path Problem in vivo | Probability and Statistics | Gene Splitting | Controlling Expression | Traveling Salesperson Problem | Backwards Promotion and Read-Through Transcription | Resources and Citations

Markov Chain Model For Flipping

Does starting orientation matter?

Question: Will the order in which we place the edges in our E.coli computers affect the probability of us detecting a Hamiltonian Path?

Our mathematical reasoning relies on the fact that the Hin/HixC DNA recombination system makes a large number of flips and for any one flip there is some element of chance that dictates which Hix sites are chosen to flip.

Answer: We developed a Markov Chain model that takes every signed permutation of edges as a state and the connections between these states are found and categorized by our Matlab programs by both the number of edges that need to be flipped to move between those states and categorized by which specific edge needed to be flipped. Further, our Matlab programs creates a transition matrix based on the weights a user enters to bias the probabilities of making different kinds of flips.

Once this transition matrix is generated we then compute, for each possible starting state, the probability of that starting state transitioning to any of the solved states after x flips given the biasing of different types of flips. When our Markov Chain model is run for more than 2 edges we always observe a convergence of these probabilities no matter what bias we apply. This shows that starting orientation does not matter since the probability of being solved after a large number of flips is the same for each starting orientation.

This begs the question: what is a large number of flips? Mostly we observe that this convergence occurs in the first 20 flips for three edges and only for very extreme probabilities does this convergence occur after 60 flips. However, as the number of edges increase we can expect this convergence to occur later. This does not appear to be a problem. Our flipping mechanism flips edges very quickly, so quickly that it is hard to measure. Additionally, the diameter of our transition diagrams increases linearly with the number of edges so there is no reason to believe that the number of flips to convergence would increase at more than a linear pace.


MATLAB programs that we developed using Markov Chains

1.Flip Length

2.Pure Flip

3.Bias Weighter



True Positives / False Positives

MATLAB programs used to find the false positives:

1. Adelman's False Positives with 12 nodes

2. Adelman's False Positives with 14 nodes

3. Counter Program for Adelman's False Positives


Possion Model For the Number of Plasmids

Poisson

Using this statistical method we used to make a chart of the probability of finding true positives based on the number of plasmids.