Abstract: MATH/CHEM/COMP 2002, Dubrovnik,
June 24-29, 2002
Stochastic Pairwise Alignments
Ulrike Mueckstein, Ivo L. Hofacker, and Peter F. Stadler
Institute for Theoretical Chemistry & Molecular Structural Biology, University of Vienna, Waehringerstrasse 17, A-1090 Vienna, Austria

The level of sequence conservation between
related nucleic acids or proteins often varies considerably along the
sequence. Both regions with high variability (mutational hot-spots) and
regions of almost perfect sequence identity may occur in the same pair of
molecules. The reliability of an alignment therefore strongly depends on the
level of local sequence similarity. Especially in regions of high
variability, many alignments of almost equal quality exist, and the optimal
alignment is highly arbitrary. We discuss two approaches that deal with the
inherent ambiguity of the alignment problem, both based on the computation of the
partition function over all canonical pairwise alignments. The ensemble of
possible alignments can be described by the probabilities of a match between
position i in the first and position j in the
second sequence. Alternatively, we introduce a probabilistic
backtracking procedure that generates ensembles of suboptimal alignments with
correct statistical weights. A comparison between structure-based alignments
and large samples of stochastic alignments shows that the ensemble contains
correct alignments with significant probabilities even though the optimal
alignment deviates significantly from the structural alignment. Ensembles of
suboptimal alignments obtained by stochastic backtracking can be used as
input to any bioinformatics method based on pairwise alignment in order to
gain reliability information not available from a single optimal alignment.
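
The two ingredients named above, a partition function over canonical alignments and a probabilistic backtracking step, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The scoring scheme (match, mismatch, and a linear gap penalty), the temperature-like parameter T, and all function names are assumptions made for illustration. "Canonical" is assumed here to mean that, of two adjacent gap columns, the gap in the second sequence always comes first, so each set of matched positions is counted exactly once.

```python
import math
import random

def partition_tables(a, b, match=1.0, mismatch=-1.0, gap=-2.0, T=1.0):
    """Forward tables of Boltzmann-weighted sums over alignments of a[:i], b[:j].

    ZM: last column aligns a[i-1] with b[j-1]
    ZX: last column aligns a[i-1] with a gap
    ZY: last column aligns a gap with b[j-1]
    Scores and T are illustrative assumptions, not values from the abstract.
    """
    n, m = len(a), len(b)
    wg = math.exp(gap / T)
    ZM = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZX = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZY = [[0.0] * (m + 1) for _ in range(n + 1)]
    ZM[0][0] = 1.0  # the empty alignment
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                s = match if a[i - 1] == b[j - 1] else mismatch
                prev = ZM[i - 1][j - 1] + ZX[i - 1][j - 1] + ZY[i - 1][j - 1]
                ZM[i][j] = math.exp(s / T) * prev
            if i > 0:
                # Assumed canonical rule: an X column never directly follows a Y column.
                ZX[i][j] = wg * (ZM[i - 1][j] + ZX[i - 1][j])
            if j > 0:
                ZY[i][j] = wg * (ZM[i][j - 1] + ZX[i][j - 1] + ZY[i][j - 1])
    return ZM, ZX, ZY

def partition_function(tables):
    """Total weight Z of all canonical alignments of the two full sequences."""
    return sum(t[-1][-1] for t in tables)

def sample_alignment(a, b, tables, rng=random):
    """Draw one alignment with probability proportional to its Boltzmann weight."""
    table = dict(zip("MXY", tables))
    i, j = len(a), len(b)
    allowed = "MXY"  # states permitted for the column currently being drawn
    cols = []
    while i > 0 or j > 0:
        weights = [table[s][i][j] for s in allowed]
        state = rng.choices(allowed, weights=weights)[0]
        if state == "M":
            cols.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
            allowed = "MXY"
        elif state == "X":
            cols.append((a[i - 1], "-"))
            i -= 1
            allowed = "MX"  # the column before an X column must not be Y
        else:
            cols.append(("-", b[j - 1]))
            j -= 1
            allowed = "MXY"
    cols.reverse()
    return "".join(c[0] for c in cols), "".join(c[1] for c in cols)
```

Calling `sample_alignment` repeatedly on the same tables yields an ensemble of suboptimal alignments. The relative frequency with which positions i and j appear in the same column across such a sample is a Monte Carlo estimate of the corresponding match probability. For long sequences the raw weights can underflow, so a practical version would need rescaling or log-space arithmetic, which this sketch omits.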