How to Determine a Protein Structure by DNA Sequencing

Determining a protein's three-dimensional structure from DNA sequence data is known as protein structure prediction. It starts with translating the gene's DNA sequence into the protein's amino acid sequence. The process requires a solid understanding of molecular, cell and protein biology, as well as familiarity with computational biology methods and the UNIX command line. It also requires extensive training in the software packages or source code used for fold prediction and sequence analysis.
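
As a rough illustration of the translation step, the sketch below converts a coding DNA sequence into one-letter amino acid codes using the standard genetic code. It assumes the sequence is already in the correct reading frame and contains no introns; the function name `translate` and the example sequence are invented for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate coding DNA (frame 0) into a one-letter protein sequence."""
    dna = dna.upper().replace("U", "T")  # tolerate RNA input
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X = unrecognized codon
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Made-up example sequence, not taken from any database.
print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))  # MAIVMGR
```

Real genes may need intron removal and reading-frame detection first, which this sketch does not attempt.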

Things You'll Need

  • Several computers running different operating systems (UNIX, Windows, etc.)
  • Access to protein sequence databases
  • Access to your chosen software packages or to those listed in this article

Instructions

    • 1

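      Obtain sequences -- Many databases collect, annotate and assemble the data produced by hundreds of sequencing experiments. These include TrEMBL and Swiss-Prot (both now part of UniProtKB), which hold protein sequences translated from sequenced DNA. Use them to obtain the amino acid sequence of the protein of interest, typically as a FASTA file. Experimentally determined structures, by contrast, are distributed as Protein Data Bank (PDB) files.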
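
Sequence databases commonly offer downloads in FASTA format, where a header line starting with ">" is followed by one or more lines of sequence. A minimal sketch of reading such data is shown here; the function name `read_fasta` and the example records are invented for illustration, and a library such as Biopython would be the more usual choice in practice.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {record_id: sequence} from FASTA-formatted lines."""
    chunks: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The record ID is the first word after ">".
            words = line[1:].split()
            current = words[0] if words else ""
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


# Made-up records; a real file would be opened with open(path).
example = [
    ">example_protein_1 hypothetical description",
    "MKTAYIAK",
    "QRQISFVK",
    ">example_protein_2",
    "GGSLLV",
]
for name, seq in read_fasta(example).items():
    print(name, len(seq))
```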

    • 2

      Interrogate sequences -- Look for homology between the protein of interest and proteins with known, experimentally validated structures. The FASTA and PSI-BLAST software packages can be used for this. Compare the parameters reported for each match, such as the E-value, which indicates the statistical significance of the match: the lower the E-value, the less likely it is that the match arose by chance.
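
Search results can be screened by E-value with a few lines of code. The sketch below assumes the hits were saved in the 12-column tab-separated layout produced by BLAST+ with `-outfmt 6` (subject ID in column 2, percent identity in column 3, E-value in column 11, bit score in column 12). The cutoff of 1e-5, the `Hit` type, the function name and the example rows are all illustrative assumptions.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    subject: str
    identity: float
    evalue: float
    bitscore: float


def significant_hits(rows: Iterable[str], max_evalue: float = 1e-5) -> List[Hit]:
    """Keep hits at or below the E-value cutoff, most significant first."""
    hits = []
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue  # skip blank and comment lines
        f = row.rstrip("\n").split("\t")
        hit = Hit(f[1], float(f[2]), float(f[10]), float(f[11]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return sorted(hits, key=lambda h: h.evalue)


# Made-up rows in the assumed tabular layout.
rows = [
    "\t".join(["query1", "templateB", "31.0", "180", "120", "4",
               "3", "182", "10", "189", "2e-08", "55.1"]),
    "\t".join(["query1", "templateA", "62.5", "200", "75", "0",
               "1", "200", "5", "204", "1e-50", "180"]),
    "\t".join(["query1", "templateC", "24.0", "60", "45", "1",
               "20", "79", "33", "92", "3.2", "28.5"]),
]
for hit in significant_hits(rows):
    print(hit.subject, hit.evalue)
```

With these rows, `templateC` is dropped because its E-value of 3.2 is above the cutoff.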

    • 3

      Structural alignment -- Align the sequence of the protein of interest with several of the proteins identified in Step 2. Because these proteins have experimentally determined structural features, such as alpha helices or beta sheets, the features are annotated in their sequences and can be used to identify identical or similar regions in the protein of interest. Determine all the secondary structures present in the sequence of interest and annotate these regions.
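
One simple way to carry annotations across an alignment is to copy the template's secondary-structure label at every aligned column. The sketch below assumes a pairwise alignment written with "-" for gaps and a template annotation string with one character per template residue (H for helix, E for strand, C for coil); the function name, the label alphabet and the example sequences are illustrative assumptions.

```python
def transfer_annotation(aligned_target: str, aligned_template: str,
                        template_ss: str) -> str:
    """Label each target residue with the aligned template residue's state.

    Target residues aligned to a gap in the template get "?".
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    if len(template_ss) != len(aligned_template.replace("-", "")):
        raise ValueError("need one annotation character per template residue")

    labels = []
    ss_index = 0  # position in the ungapped template annotation
    for target_res, template_res in zip(aligned_target, aligned_template):
        state = None
        if template_res != "-":
            state = template_ss[ss_index]
            ss_index += 1
        if target_res != "-":
            labels.append(state if state is not None else "?")
    return "".join(labels)


# Made-up alignment and annotation.
target = "MK-TAYIAK"
template = "MKQTA-IAR"
template_ss = "CCHHHHEE"
print(transfer_annotation(target, template, template_ss))  # CCHH?HEE
```

Regions marked "?" have no template counterpart and would need a separate prediction method.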

    • 4

      Model construction -- Build a template model by overlaying the sequences of the experimentally validated protein and the target protein, using the former as a guide to where key features such as catalytic domains or other important structural elements are located.
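
Using the template as a guide amounts to translating residue numbers from the template into the target through the alignment. The sketch below shows one way to do that, assuming a gapped pairwise alignment and 1-based residue numbering; the function name and the example positions are hypothetical.

```python
from typing import Dict, Iterable, Optional


def map_positions(aligned_template: str, aligned_target: str,
                  template_positions: Iterable[int]) -> Dict[int, Optional[int]]:
    """Map 1-based template residue numbers to 1-based target numbers.

    A template residue aligned to a gap in the target maps to None.
    """
    wanted = set(template_positions)
    mapping: Dict[int, Optional[int]] = {}
    template_no = target_no = 0
    for template_res, target_res in zip(aligned_template, aligned_target):
        if template_res != "-":
            template_no += 1
        if target_res != "-":
            target_no += 1
        if template_res != "-" and template_no in wanted:
            mapping[template_no] = target_no if target_res != "-" else None
    return mapping


# Made-up alignment; positions 3, 4 and 7 stand in for key template residues.
print(map_positions("MKQTA-IAR", "MK-TAYIAK", [3, 4, 7]))
# {3: None, 4: 3, 7: 7}
```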

    • 5

      Identify folds -- Using a computational biology program such as THREADER or other UNIX-based modeling software, import the annotated sequence of interest and run the fold-prediction algorithm. These programs assign a score to each prediction to indicate which folds are the most likely, based on factors such as how well the sequence fits the fold or how stable the resulting structure would be.
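
Once a program has scored the candidate folds, the scores can be compared by expressing each as a Z-score, the number of standard deviations it lies from the mean of all candidates. The sketch below is a generic ranking helper, not the output format of any particular program. It assumes higher raw scores are better (some programs report energies, where lower is better), and the fold names and scores are invented.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def rank_by_zscore(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (fold, z-score) pairs, best candidate first."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [(name, 0.0) for name in scores]  # all scores identical
    ranked = [(name, (s - mu) / sigma) for name, s in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Made-up fold names and raw scores.
candidates = {"fold_a": 12.0, "fold_b": 30.5, "fold_c": 14.2, "fold_d": 11.3}
for name, z in rank_by_zscore(candidates):
    print(f"{name}\t{z:+.2f}")
```

A top candidate that stands well clear of the rest is more convincing than one that only narrowly leads.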

Learnify Hub © www.0685.com All Rights Reserved