Difference between revisions of "Davidson Missouri W/Mathematical Modeling"

From GcatWiki
Jump to: navigation, search
m
m (Probability and Statistics moved to Davidson Missouri W/Probability and Statistics)
(No difference)

Revision as of 19:34, 11 July 2007

Home | Background Information | Current Project: Solving the Hamiltonian Path Problem in vivo | Probability and Statistics | Gene Splitting | Controlling Expression | Traveling Salesperson Problem | Backwards promotion and Read-Through Transcription | Resources and Citations

False Positives

The false positive programs will tell how many false positives there are in Adelman’s graph without two edges (12) and the other is for the original graph with 14 edges. To develop the program we first made a list of all the different ways to make each number represented in the path and numbered them. We also, numbered each of the edges. we took the numbered edges and made vectors of the ways that each number can be represented at the top of the program.

First we made a loop to make sure that MATLAB picks up at least one representation of each number. Then the perms function is used to arrange all the representations of each number except the promoter in all the different ways and call it matrix g.Then you want to find all the rows where the last number of each row of the matrix ends with the number that represents the terminator. Then using the ord function all the different ways of putting the two promoters in the front of each row is developed. Now the program changes the label of the numbers from there number representation number to there edge number where the label goes from 1 to 2 and transposes them. The way that the program lays out the numbers need to be transposed again so we called that matrix sl. Now an if statements were made for the total number of repeats for each row equal to the number of repeats next to each other. The counter is used to find the total number of repeats that are next to each other. For all the number of edges left after all of the repeats are taken out we replaced those with zeros. Now we had to make a vector of ones that will be from 1 to the length of sl. What we call usl is where we get rid of one of the repeated numbers that’s next to each other because one number shows the back half and front half of the gene. Now the new usl is relabeled and includes the numbers for the edges with the promoter at the beginning of each and the terminator at the end of each with the zeros replacing the edges that are after the terminator.

If there is a row in matrix g that does not end with a terminator then we use the command elseif and add a terminator to each of those rows. Since there are two terminators then the program goes through the process one time with the first terminator and another time with the other terminator making these rows longer than if they originally had a terminator at the end. After this whole process all the rows start with a promoter and end with a terminator and have zeros after the terminator is represented.

At the end of the program we print out the total number of arrangements of false positives before the zeros after the terminator are replaced with the extra edges. Also, the rows that are not repeated are printed out. Now the program finds all the rows that have terminators starting in column 8 to 12. We found how many ways we can arrange the remaining edges depending on where the terminator is and take that number and multiply by the amount of times the terminator is in each spot and add them together for each column number. This number gives the number of false positives for a Hamiltonian path.

MATLAB programs used to find the false positives:

1. Adelman's False Positives with 12 nodes

2. Adelman's False Positives with 14 nodes

3. Counter Program for Adelman's False Positives

Projects Completed

Markov Chains

One of the first projects we worked on was to develop a mathematical model of graphs for the pancake problem. These graphs were used to find the percent of plasmids that solved problem based on the number of flips. We also developed graphs with biases to see if the length of the flips effect the probability of making the flip. In some of the graphs we saw bouncing because from a starting point you can only get to a solution by an odd number of flips or an even number of flips. If you get to a point on an even number of flips you can never get to an odd number. When the graph shows convergence, regardless whether the number of flips is even or odd you will have a chance to get to one of the solutions. The convergence occurs at .25 because there are 2/8 chances of getting to the solution after a high number of flips which reduces to ¼. We also did graphs using bias wieghts where it takes more flips for the graph to have the bouncing behavior. The biologist suspect that the length of the flips effects the probability of making the flips. Then you would compare mathematical graph to the biologist graph to see the bias.

MATLAB programs that we developed using Markov Chains

1.Flip Length

2.Pure Flip

3.Bias Weighter


Poisson

Using this statistical method we used to make a chart of the probability of finding true positives based on the number of plasmids.

P(missing all true positives)

We first looked at trying to find the probability of there not being a Hamiltonian Path(HP) and not seeing one.(if they don’t see anything what is the probability that nothing is there.) Knowing that the probability of seeing a HP given that there is not a HP is 0 and the probability of seeing a HP given that there is not one is .99.Since we did not know the probability of not seeing a HP nor do we not know the probability that there is not a HP we decided we need to change the wording of the question to what is the probability of missing all the true positives. We would be able to find this probability if we know the number of cells that can be seen. After talking to the biologist they said if there is a HP there is a 100% probability that they will see it. The answer to our original question of finding the probability for missing all the true positives is 0.

How many orderings of positively oriented edges are 2 flips away from being solved?

We started off stating that n=nodes, e=the number of edges and we needed to arrange n-1 edges. First, we proved that the number of positively orientated edges could not be 1 flip away. Flips are represented by ∧

Example 1

1 2∧ 3 4∧

1 2 -3 -4 New row

Since everything is starts positive we need 2 chunks and they must be in numerical order [1,….,a]….[a+1,…..,n-1]

Example 2

1 2 ∧ _ _ 3 4 ∧

1 2 ∧-4 -3 ∧ _ _

1 2 3 4

Example 3

1 2 3 ∧ _ 4 ∧ _

1 2 3 ∧ -4 ∧ _ _

1 2 3 4 _ _

Next we figured out a formula for how many different ways to have chunks. Number of edges after 1st chunk - number of edges in 2nd chunk

n-1

∑ (e-i) – (n-1-i) = Number of unimportant edges

i=0

n-1

∑ (e-n+1)

i=0

n(e-n+1)

The equation for the number of orderings of positively oriented edges 2 flips away from being solved is:

(e-(n-1) ! × (n(e-n+1))