Obtain the protein sequence -- Many databases collect, annotate and assemble the data produced by hundreds of sequencing experiments. These include TrEMBL and Swiss-Prot (the unreviewed and reviewed sections of UniProtKB), which should be used to obtain the amino acid sequence of the protein of interest, usually as a FASTA file. Experimentally determined structures are stored separately, in Protein Data Bank (PDB) format.
Interrogate sequences -- Look for homology between the protein of interest and proteins with known, experimentally validated structures. The FASTA and PSI-BLAST software packages can be used for this. Compare the statistics reported for each hit, such as the E-value, which estimates how many hits with an equal or better score would be expected by chance in a database of that size; the lower the E-value, the more significant the match.
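The E-value comparison described above can be sketched as a small filter. This is a minimal illustration, assuming the search results were exported in the 12-column BLAST tabular layout (-outfmt 6); the hit identifiers, the numbers in SAMPLE_HITS and the 1e-5 cutoff are made-up placeholders, not recommendations.

```python
import csv
import io

# Column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Hypothetical hits; identifiers and values are invented for illustration.
SAMPLE_HITS = (
    "target\ttmplA\t62.5\t240\t90\t2\t1\t240\t5\t243\t3e-85\t310\n"
    "target\ttmplB\t31.0\t200\t138\t4\t10\t205\t22\t219\t2e-12\t71.2\n"
    "target\ttmplC\t24.1\t58\t44\t1\t100\t157\t3\t60\t4.7\t28.1\n"
)


def significant_hits(text, max_evalue=1e-5):
    """Return (subject id, E-value, percent identity) for hits at or below
    the cutoff, most significant (lowest E-value) first."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS,
                            delimiter="\t")
    hits = []
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= max_evalue:
            hits.append((row["sseqid"], evalue, float(row["pident"])))
    return sorted(hits, key=lambda hit: hit[1])


for subject, evalue, identity in significant_hits(SAMPLE_HITS):
    print(f"{subject}\tE={evalue:g}\tidentity={identity:.1f}%")
```

Hits that pass the cutoff are candidates for the alignment step; the right threshold depends on the database size and on how distant a homolog is acceptable.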
Structural alignment -- Align the sequence of the protein of interest with several of the homologous proteins identified in Step 2. Because these proteins have experimentally determined structural features, e.g. alpha helices or beta sheets, annotated along their sequences, the alignment can be used to identify identical or similar regions in the protein of interest. Determine the secondary structure elements present in the sequence of interest and annotate these regions.
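One way to picture the annotation transfer in this step is to walk a pairwise alignment column by column and copy the template's secondary-structure label onto each aligned target residue. The sketch below assumes the alignment has already been computed elsewhere, that gaps are written as "-", and that the template annotation uses one letter per residue (H for helix, E for strand, C for coil); the function name and the toy sequences are hypothetical.

```python
def transfer_annotation(aligned_target, aligned_template, template_ss):
    """Copy per-residue secondary-structure labels from a template onto a
    target through a gapped pairwise alignment.

    Target residues aligned to a template gap have no label to inherit and
    are marked "?".
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    if len(template_ss) != len(aligned_template.replace("-", "")):
        raise ValueError("need one label per ungapped template residue")

    ss_index = 0  # position in the ungapped template annotation
    target_ss = []
    for target_res, template_res in zip(aligned_target, aligned_template):
        label = None
        if template_res != "-":
            label = template_ss[ss_index]
            ss_index += 1
        if target_res != "-":
            target_ss.append(label if label is not None else "?")
    return "".join(target_ss)


# Toy example with invented sequences and labels.
annotation = transfer_annotation(
    aligned_target="MKT-LLVAGG",
    aligned_template="MKSALL--GG",
    template_ss="CHHHHHCC",
)
print(annotation)  # CHHHH??CC
```

Residues marked "?" fall in insertions relative to the template, so their secondary structure has to come from another template or from a prediction method.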
Model construction -- Build a template model by overlaying the sequence of the target protein on that of the experimentally validated protein, using the validated protein as a guide to where key features such as catalytic domains or other important structural elements are located.
Identify folds -- Using a computational biology program such as Threader or other UNIX-based modeling software, import the annotated sequence of interest and run the fold-prediction algorithm. These programs assign a score to each prediction to indicate which candidate folds rank best by criteria such as expected accuracy, likelihood, or stability.
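Comparing the scores from a fold-prediction run can be illustrated by converting them to Z-scores, so that each candidate is expressed in standard deviations above or below the mean of all candidates. This is a generic normalization, not the scoring scheme of Threader or any other specific program; the fold names and raw scores are invented, and the sketch assumes that a higher raw score means a better fit.

```python
from statistics import mean, stdev


def rank_folds(raw_scores):
    """Rank candidate folds by Z-score, best first.

    raw_scores maps a fold name to its raw score; higher is assumed better.
    """
    values = list(raw_scores.values())
    if len(values) < 2:
        raise ValueError("need at least two candidates to compute a Z-score")
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("all scores are identical; ranking is undefined")
    z_scores = {fold: (score - mu) / sigma
                for fold, score in raw_scores.items()}
    return sorted(z_scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores for three candidate folds.
ranked = rank_folds({"fold_a": 10.0, "fold_b": 20.0, "fold_c": 30.0})
for fold, z in ranked:
    print(f"{fold}\tZ={z:+.2f}")
```

A candidate whose Z-score stands well clear of the rest is a more convincing assignment than one that only narrowly leads the list.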