Invited Talk
Special Topics in Computer Science
Department of Computer Science and Information Engineering
National Chung Cheng University
January 6, 2003
Efficient Algorithms for Locating the Length-Constrained Heaviest Segments,
with Applications to Biomolecular Sequence Analysis
林耀鈴教授 (Prof. Yaw-Ling Lin)
靜宜大學資管系 (Dept CSIM, Providence University)
Abstract
We study two fundamental problems concerning the search for
interesting regions in sequences: (i) given a sequence of real
numbers of length $n$ and an upper bound $U$, find a consecutive
subsequence of length at most $U$ with the maximum sum and (ii)
given a sequence of real numbers of length $n$ and a lower bound
$L$, find a consecutive subsequence of length at least $L$ with
the maximum average. We present an $O(n)$-time algorithm for the
first problem and an $O(n \log L)$-time algorithm for the second.
The algorithms have potential applications in several areas of
biomolecular sequence analysis including locating GC-rich regions
in a genomic DNA sequence, post-processing sequence alignments,
annotating multiple sequence alignments, and computing
length-constrained ungapped local alignment. Our preliminary tests
on both simulated and real data demonstrate that the algorithms
are very efficient and able to locate useful (such as GC-rich)
regions.
appeared in the Journal of Computer and System Sciences, JCSS.
Click Here for the Slides