CHM-530: Protein Structure Prediction
Introduction
In the postgenomic era, much of the heavy lifting has shifted to predicting protein structure from sequence (DNA or the translated amino acid sequence). Many tools are available to the biochemist for this purpose. Although their output is only a prediction, these tools are invaluable for understanding molecular interactions within the cell. In this assignment, you will translate a given DNA sequence and then run the resulting protein sequence through a secondary structure prediction tool. You will also be asked to explain basic aspects of translation and the types of secondary structure that the prediction tool searches for in the amino acid sequence. Use the Lehninger Principles of Biochemistry textbook or other online sources to answer the questions.
Procedure
1. Starting with the DNA sequence listed below, go to the ExPASy Tool (https://web.expasy.org/translate/).
a. Cut and paste the sequence into the translate tool. The output shows different open reading frames.
b. There will be six frames giving all possible protein sequences for the offered DNA sequence. Select the longest continuous, red-highlighted sequence (open reading frame), and click on the red “M” to gain access to the amino acid sequence. Be sure it says that there are 503 amino acids.
c. Copy the single-letter amino acid code, which should start with M, in FASTA format, and paste it into the “Protein Structure Report” below. Make sure only the amino acid information for the open reading frame is copied and pasted into this worksheet. Answer the questions associated with Part 1.
2. Take that single-letter amino acid code and run a secondary structure prediction on the sequence using the SCRATCH Protein Predictor website (http://scratch.proteomics.ics.uci.edu/).
a. Paste in your translated amino acid sequence, and select the prediction option SSpro8: Secondary Structure (8 Class). You will be asked about the output of that selection. You may also choose any other predictor options for your own curiosity. You will receive an e-mail with your results, which may take up to 30 minutes.
The DNA sequence is listed below.
TGCTGACCCTATGATGTATCCTATGGTCATTTATTAAGATGTTATCCTAA
AAAGTATATAACGATTTATTATAGTGTGATAGTAATACCAGAACGAGAAA
TTAGAAAATTGTAAAAAAAGAATTTTAAAATATTATGCGGCTACTTTTCC
TACAGTTTCTGCAATTTTTGCTTCTTCTTCAGCAAATGCGCATAGCATTG
CTTCTATGTCGGCATATCCTTCTGCTTTTGCTGTCATTGCTATTTTGTTG
TATGTGGATATGTGCTCCTCTCCCTCTTTAGTAGCGAAATCAGATAGTAA
CTTCTTTACCTTTAATTCGGTGGCTACTTTTCCTACAGTTTCTGCAATTT
TTGCTTCTTCTTCAGCAAATGCGCATAGCATTGCTTCTATGTCGGCATAT
CCTTCTGCTTTTGCTGTCATTGCTATTTTGTTGTATGTGGATATGTGCTC
CTCTCCCTCTTTTATTGAGAATTCTTCTAATATTTTTTCTACTTTACTGG
ATATCATAGGTATTCGTAATGAATAATCGGAAGGCAAATATATAAGCATT
TGTTAAGCTTTTTTAATACTAAATATAATTAGCATTTTTGTATTTCAACA
AAGTTTGAGATTTTTGTATTACGGAACTAAAAATCCTCTAAAAAACTTAA
CTTGTATATAAAATTCTTTCGTATAATTTCTTTGCCTCTTCATACTTCTC
CTTTGATTTGGTTTCTAATTCGTTCTTCCTTTCAGGAAGTTTTTCAGCTA
ATAGTTTTGAGAGCATATAAACTGTAGCTGTAAGCATAAATTTCTCTTTT
AATTGCTCACTTTCCTTTTTATTTAACTCTTCGAACCAATTGGTTATATA
TTCTAACCCTAAATACTTTATCATTTTCTCTAATATTCCCCTAGCATGAC
CCAATTCAACTAAGGCTTTTTCCCTAATTTTTTCAGATTCTTCCTTTTTA
TTAACCTCCTCCAGCTTTTGAGAGGAGAACAATAGTAATAAATGGTCTTC
GGAGTTAGCCATAAAAAGCTCTTTTAATCCTATTTCAGTCTGTGTCCCCT
TCATCACCTTTAAATTGTATTCATAGCTAATATACTCTTGTTAAAATAAT
GATGACTAACTCCAATACTGACCAATGATGTCGTAACCCGAAACTGAATA
AAAGTAAAATCCTTCCCTACTGAGAATATTTGTATGATAACCTCAAAAAG
AATGAAAGCCCTTGAAATTAATAGCGAAGCATTAGGCGTGCCAACATTAC
TCTTGATGGAAAACGCAGGGAGAAGTGTAAAGGATGAAATAATGAAAAGA
CTGAATTTGGACTATTCTAAAAAGGTTGTAGTATTTGCAGGAACTGGTGG
AAAAGGAGGAGACGGATTAGTAGTAGCAAGGCACCTTGCCTCGGAAGGGT
CAGAGGTTCATGTTTTACTTTTAGGCGAGAACAAACATCCGGACGCAATC
ATTAACTTGAATGCAATATATGAAATGGATTATTCTATTAGAGAAGTTAA
ACTGATAAAAGATACTGACGAATTGCAACCAGTTAAAGCTGACGTGCTTA
TAGATGCCATGTTAGGCACGGGATTTTCTGGTAAAGTTAGAGAACCATTT
AGAACAGCTATTAGAGTATTTAATCAGAGCTCTGGTTTTAAGGTTTCTAT
AGATATACCCTCTGGGATAAATGCAGACGATGAAGAACAGCAGGGAGAAC
ACGTTATTCCCGACCTAATAGTCACCTTTCATGATCTTAAGCCAGGCTTA
AAAAAATTTGAGAGTAAAGTGGTCGTCAAGAAAATAGGTATTCCTAAAGA
GGCTGAAATATATGTTGGTCCCGGTGATGTCATTGTCAATGTGAAGAAAA
GAGAGTATAACACAAAGAAAGGAGATAATGGAAGAGTTTTGATCATTGGA
GGGAATTTTACATTTAGTGGAGCCCCAACTCTATCTGCTTTGGGAGCCTT
AAGGACGGGAGCAGATCTGGTATATGTCGCATCTCCAGAGGAGACAGCTA
AGGTCATCTCTAGCTTTTCCCCTGACCTTATATCTATTAAGCTTAAGGGA
AAGAATATATCTACAGACAATTTGGATGAGCTAAAACCATGGATTGATAA
AGCTGACGTCGTAGTTGTAGGACCTGGTATGGGACAAGAAAGGGAAACTG
TAGATGCTTCCATAGAGATAGTTAGATATCTGAAAGCAAAGAATAAACCT
TCAGTCATAGATGCTGATGCGTTAAAATCAGTGGCAGGTATGGAATTATT
CCCGAATGCAGTAATAACTCCTCATGCAGGAGAATTTAAGATATATTCAG
GGGTTCAGCCTGATTCGAACATGAGAAAAAGAATTGAGCAAGTGAAGGAG
TGCTCACTGAAATGTAATTGTGTAGTACTCCTTAAGGGTTATGTTGATAT
CATAGCAGAAAAGGAAGAATTTAAACTTAATAAGACAGGAAATCCTGGAA
TGGCAGTTGGCGGTACTGGGGATACATTGACAGGAATAATTGCCTCATTT
ATGGCTCAAAAACTATCTCCATTCACTTCTGCTTACTTGGGAGCATTCGT
TAATGGTTTAGCAGGGTCTATAGCATATGAAAAACTTGGCGCACATCTAG
TTGCAACAGATATAATAGAAAACATTCCTAAGGTAATTAATGAACCTTTA
GAAGTGTTCAAGAAAAAAGTGTACAAAAGGATTTTAGATACTTAGGTTTT
ACCCCTAATTCTTTTAATAATCTCAAGTGATTTGTTTGCATGTTCTTCTG
CATTTCCTAGACCGCTCAATACCTCTATAATTTTTCCGTTTTCGTCTATG
ATAAAGGTTACTCTCTGAGCACTTGAGCCTTTCTCGTTTAGAACACCGTA
TAATTTAGCTATTTGTTTATTTGAGTCAGAAACTATAGGAAATCTGGCAC
CGCATTTGTCTGCAAAACTCTTTTGAGTTGAAACTGTATCAACACTAACA
CCTATAACTTCAGCATTTAACTGTTTAAATTGGTCATAAAGTTGTCCAAA
TTTTATGGTCTCTCTAGTACAACCAGGTGTAAACGCCTTAGGATAGAAAT
ATAGTACAACTACAGATTTGCCTCTATATGAAGATAGTTTCAATTTTCCT
ATAGTTGAATCTCCTTCAAAATCAGGAGCTTCATTTCCTTTTTCTAAAGC
CATAGATTATCTGATATAAATATATTCAGTTATGGTTTTTAACCTCTTTT
TCGCTTATGCCTTACA
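As an optional cross-check of step 1, the six-frame translation can be sketched in a few lines of Python. This is a minimal sketch, not the ExPASy implementation: it assumes the standard genetic code, and the function names (`reverse_complement`, `translate`, `six_frames`, `longest_orf`) and frame labels are chosen here for illustration. It treats an open reading frame as a stretch that starts at M and ends at a stop codon, which may differ in detail from how the web tool highlights its output.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate three offsets on the given strand and three on its reverse complement."""
    dna = "".join(dna.split()).upper()  # drop line breaks and spaces
    frames = {}
    for strand, seq in (("5'3'", dna), ("3'5'", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand} Frame {offset + 1}"] = translate(seq[offset:])
    return frames


def longest_orf(protein):
    """Longest stretch that starts at M and is terminated by a stop codon."""
    best = ""
    # The last piece after the final '*' has no stop codon, so it is skipped.
    for segment in protein.split("*")[:-1]:
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


if __name__ == "__main__":
    toy = "ATGAAATAG"  # made-up sequence, not the assignment DNA
    for name, protein in six_frames(toy).items():
        print(name, protein, longest_orf(protein))
```

To use it on the assignment, pass the full DNA sequence above to `six_frames` and compare the length of the longest open reading frame with the value reported by the web tool.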
CHM-530: Protein Structure Report
Name_____________________________
Part 1: Translation
Paste the single letter amino acid code in the box below. (10 points)
Questions
1. Why are there six different frames from a single DNA sequence? (5 points)
2. Why does each red-highlighted region begin with “M”? (5 points)
Part 2: Secondary Structure Prediction
Paste the Predicted Secondary Structure in the box below. Do not paste amino acid code. (10 points)
1. What do the following secondary structure designations mean? What specifically is the difference between E and B? Use “SCRATCH; A Quick Description” (http://scratch.proteomics.ics.uci.edu/explanation.html) as a reference. (7.5 points)
H:
G:
I:
E:
B:
T:
S:
C:
2. Based on the results e-mailed from SCRATCH, what types and amounts of secondary structure did the prediction tool suggest? Specifically, explain how much alpha helix and beta sheet (bridge) was predicted. (7.5 points)
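The amounts asked for in question 2 can be tallied by hand or with a short script. The following is a minimal sketch that assumes the SSpro8 prediction has been copied as a plain string of the eight class letters listed in question 1 (one letter per residue); the function name `ss_composition` is hypothetical and is not part of SCRATCH.

```python
from collections import Counter

# The eight class letters listed in question 1.
SS8_LETTERS = "HGIEBTSC"


def ss_composition(prediction):
    """Return {letter: (count, percent)} for an 8-class prediction string."""
    prediction = "".join(prediction.split()).upper()  # drop line breaks and spaces
    if not prediction:
        raise ValueError("empty prediction string")
    unknown = set(prediction) - set(SS8_LETTERS)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    counts = Counter(prediction)
    total = len(prediction)
    return {
        letter: (counts[letter], 100.0 * counts[letter] / total)
        for letter in SS8_LETTERS
    }


if __name__ == "__main__":
    toy = "HHHHEECC"  # made-up string, not a real SCRATCH result
    for letter, (count, percent) in ss_composition(toy).items():
        print(f"{letter}: {count} residues ({percent:.1f}%)")
```

Reporting both the residue count and the percentage for each class makes it easy to state how much of the sequence falls into each letter category.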