SIFT takes a query sequence and uses multiple alignment information to predict tolerated and deleterious substitutions for every position of the query sequence. It does this in four steps:
- searches for similar sequences (PSI-BLAST)
- chooses closely related sequences that may share similar function to the query sequence
- obtains the alignment of these chosen sequences
- calculates normalized probabilities for all possible substitutions from the alignment.
Positions with normalized probabilities less than 0.05 are predicted to be deleterious; those greater than or equal to 0.05 are predicted to be tolerated.
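The scoring step can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not SIFT's actual implementation: it uses raw residue frequencies from a single alignment column (SIFT itself applies sequence weighting and pseudocounts), and it normalizes by dividing each probability by that of the most frequent amino acid. The names `normalized_probabilities`, `predict` and `THRESHOLD` are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
THRESHOLD = 0.05  # cutoff quoted in the text above


def normalized_probabilities(column):
    """Score every amino acid at one alignment position.

    `column` is a string holding the residue each aligned sequence has at
    this position; gaps ('-') and unknown characters are ignored.
    Assumption: plain frequencies stand in for SIFT's weighted estimates.
    """
    residues = [aa for aa in column.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("column contains no amino acids")
    counts = Counter(residues)
    probs = {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}
    # Normalize so the most frequent amino acid scores exactly 1.0.
    top = max(probs.values())
    return {aa: p / top for aa, p in probs.items()}


def predict(column):
    """Label each possible substitution as 'deleterious' or 'tolerated'."""
    scores = normalized_probabilities(column)
    return {
        aa: "deleterious" if score < THRESHOLD else "tolerated"
        for aa, score in scores.items()
    }


if __name__ == "__main__":
    column = "L" * 38 + "I" * 2  # toy column: 38 Leu, 2 Ile
    calls = predict(column)
    for aa in "LIW":
        print(aa, calls[aa])
```

With the toy column, leucine and isoleucine stay above the cutoff while an unobserved residue such as tryptophan scores 0 and falls below it. In practice unobserved residues would rarely score exactly 0, because pseudocounts spread some probability over them.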
- SIFT Score: Ranges from 0 to 1. The amino acid substitution is predicted damaging if the score is <= 0.05, and tolerated if the score is > 0.05.
- Median Info: Ranges from 0 to 4.32 (log2 of 20, the number of amino acids); ideally the number would be between 2.75 and 3.5. This measures the diversity of the sequences used for prediction. A warning is issued if it is greater than 3.25, because this indicates that the prediction was based on closely related sequences.
- Seqs at Position: The number of sequences that have an amino acid at the position of prediction. SIFT automatically chooses the sequences for you, but if the substitution is located at the beginning or end of the protein, there may be only a few sequences represented at that position, and this column indicates this.
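To make the last two columns concrete, here is a rough Python sketch of how they could be computed from alignment columns. It assumes that the information at a position is log2(20) minus the Shannon entropy of the observed residues, which is consistent with the stated 0 to 4.32 range but is not taken from SIFT's source code. The function names and the choice to score an all-gap column as 0 are hypothetical.

```python
import math
from statistics import median

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_INFO = math.log2(len(AMINO_ACIDS))  # about 4.32 bits
WARN_ABOVE = 3.25  # warning cutoff quoted in the text above


def seqs_at_position(column):
    """Count the sequences that have an amino acid (not a gap) here."""
    return sum(1 for aa in column.upper() if aa in AMINO_ACIDS)


def column_info(column):
    """Information content of one column, in bits.

    Assumption: log2(20) minus the Shannon entropy of raw frequencies.
    A fully conserved column gives the maximum; an all-gap column gives 0.
    """
    residues = [aa for aa in column.upper() if aa in AMINO_ACIDS]
    if not residues:
        return 0.0
    entropy = 0.0
    for aa in set(residues):
        p = residues.count(aa) / len(residues)
        entropy -= p * math.log2(p)
    return MAX_INFO - entropy


def median_info(columns):
    """Median information content over all alignment columns."""
    return median(column_info(col) for col in columns)


def too_closely_related(columns):
    """True when the median exceeds the warning cutoff."""
    return median_info(columns) > WARN_ABOVE


if __name__ == "__main__":
    columns = ["LLLL", "LI--", "LLII"]  # toy alignment, one string per column
    print("median info:", round(median_info(columns), 2))
    print("warning:", too_closely_related(columns))
    print("seqs at position 2:", seqs_at_position(columns[1]))
```

A high median means most columns are nearly invariant, so the chosen sequences add little evidence about which substitutions are tolerated. A low count from `seqs_at_position` flags the same weakness for a single position.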
- SWISS-PROT: small but high quality. Less likely to time out, but may not have enough sequences for prediction.
- SWISS-PROT/TrEMBL: larger than SWISS-PROT, good quality (the author's preferred choice).