This website was designed for an undergraduate course at Davidson College.
Information on Plotting Kyte-Doolittle Hydropathy using Perl
The engine behind the hydropathy analysis on this web page is a program written in the computer language Perl. Perl is often used to build tools for bioinformatics, genomics, and proteomics, and it was used extensively in the Human Genome Project. Programs written in Perl can perform crucial functions such as DNA and amino acid sequence analysis, translation of DNA into amino acid sequences, protein structure estimation, and complex protein analyses like Kyte-Doolittle hydropathy.
The specific Perl code used for the analysis and plotting on this website takes an input sequence, removes any numbers and spaces, capitalizes any lowercase letters, and then performs hydropathy analysis using the selected window size. Window size is a concept used in analyzing sequences of letters or numbers, such as DNA and amino acid sequences; it denotes how many units the Perl program analyzes at one time. For example, if an entered sequence is 20 amino acids long and the window size is 9 amino acids, the program looks at the first nine amino acids (1-9) and assigns each its Kyte-Doolittle hydropathy score. The program then averages these nine scores and plots the result as a point on the graph; the reading window then moves over by one amino acid and performs the same function on amino acids 2-10. The program repeats this procedure of assigning, averaging, and plotting scores until the window reaches the end of the sequence (amino acids 12-20 in this example).
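The cleaning and sliding-window steps described above can be sketched in a few lines. This is an illustrative Python sketch, not the Perl code that runs on this website; the names KD_SCALE, clean_sequence, and window_averages are hypothetical. The hydropathy values are the published Kyte-Doolittle scale. Where each averaged point is placed on the x-axis (window start or window center) is not stated on this page, so the sketch returns only the averages in order.

```python
# Kyte-Doolittle hydropathy values for the 20 common amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def clean_sequence(raw):
    """Drop digits and whitespace, then capitalize the remaining characters."""
    kept = (ch for ch in raw if not ch.isdigit() and not ch.isspace())
    return "".join(kept).upper()


def window_averages(sequence, window):
    """Return the mean hydropathy score of each window, moving one residue at a time."""
    # Raises KeyError if the sequence contains a letter outside the 20 amino acids.
    scores = [KD_SCALE[aa] for aa in sequence]
    last_start = len(scores) - window
    return [sum(scores[i:i + window]) / window for i in range(last_start + 1)]


if __name__ == "__main__":
    seq = clean_sequence("1 mkailvfgla 11 ilvmkrdeqn")
    averages = window_averages(seq, 9)
    # A 20-residue sequence with a window of 9 gives 12 points (windows 1-9 through 12-20).
    print(len(seq), len(averages))
```

A sequence of length n analyzed with window size w gives n - w + 1 points, so a sequence shorter than the window produces no points at all.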
The horizontal green line on the graph at y = 2 represents the threshold commonly used to evaluate whether or not a section of a protein is hydrophobic enough to be integral within a cell membrane.
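Reading the graph against the green line amounts to asking which window averages rise above 2. A minimal Python sketch of that comparison follows; the function name windows_above_threshold is hypothetical, and this is an illustration rather than part of the website's Perl program.

```python
def windows_above_threshold(averages, threshold=2.0):
    """Return the 0-based indices of window averages that exceed the threshold."""
    return [i for i, avg in enumerate(averages) if avg > threshold]


if __name__ == "__main__":
    # Hypothetical window averages; only values strictly above 2 are reported.
    print(windows_above_threshold([1.0, 2.5, 2.0, 3.1]))
```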
Troubleshooting
The Perl analysis program for this website has been written to handle lowercase letters, numbers, and spaces in the sequence; however, it recognizes only the 20 common amino acids and has no identification for the letters "J", "O", "U", and "X". If you experience problems when entering or analyzing your sequence, check it, if possible, for unrecognizable elements: unknown amino acids, punctuation, or non-letter symbols.
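If a sequence is long, checking it by eye is tedious. The Python sketch below lists every character that is not a digit, a space, or one of the 20 common one-letter amino acid codes, along with its position in the original input. The names RECOGNIZED and find_unrecognized are hypothetical, and treating every other character as a problem is an assumption about how the website's program behaves, not a description of its Perl code.

```python
# One-letter codes of the 20 common amino acids.
RECOGNIZED = set("ACDEFGHIKLMNPQRSTVWY")


def find_unrecognized(raw):
    """Return (position, character) pairs for characters the analysis cannot score.

    Positions are 1-based and refer to the raw input, so they can be
    located directly in the text that was pasted into the form.
    """
    problems = []
    for position, ch in enumerate(raw, start=1):
        if ch.isdigit() or ch.isspace():
            continue  # digits and spaces are stripped before analysis
        if ch.upper() not in RECOGNIZED:
            problems.append((position, ch))
    return problems


if __name__ == "__main__":
    print(find_unrecognized("mka x-LV"))
```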
When entering a value for window size, make sure you have entered only one integer written in digits, such as: 9. The Perl code that analyzes sequences cannot identify numbers written out as words, such as: nine.
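The rule for window size can be expressed as a small validation step. The Python sketch below accepts a value only if it consists of digits and is at least 1; the function name parse_window_size and the error messages are hypothetical, and the website's Perl program may handle bad input differently.

```python
import re


def parse_window_size(text):
    """Convert a window-size entry to an int, rejecting anything but plain digits."""
    text = text.strip()
    if not re.fullmatch(r"[0-9]+", text):
        raise ValueError("window size must be a single integer written in digits")
    size = int(text)
    if size < 1:
        raise ValueError("window size must be at least 1")
    return size


if __name__ == "__main__":
    print(parse_window_size("9"))
    try:
        parse_window_size("nine")
    except ValueError as err:
        print("rejected:", err)
```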
Email the authors: Amber Hartman, Peter Leese, Talbot Presley, and Lang Robertson