On-line Viterbi Algorithm and Its Relationship to Random Walks

Rastislav \v{S}r\'amek
Bro\v{n}a Brejov\'a
Tom\'a\v{s} Vina\v{r}

Tom\'a\v{s} Vina\v{r}, updated 2010-01-25

https://arxiv.org/abs/0704.0062
In this paper, we introduce the on-line Viterbi algorithm for decoding hidden Markov models (HMMs) in much smaller than linear space. Our analysis of two-state HMMs suggests that the expected maximum memory used to decode a sequence of length $n$ with an $m$-state HMM can be as low as $\Theta(m\log n)$, without a significant slow-down compared to the classical Viterbi algorithm. The classical Viterbi algorithm requires $O(mn)$ space, which is impractical for the analysis of long DNA sequences (such as complete human genome chromosomes) and for continuous data streams. We also experimentally demonstrate the performance of the on-line Viterbi algorithm on a simple HMM for gene finding on both simulated and real DNA sequences.
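
To illustrate how decoding can use less than linear space, the following is a minimal Python sketch, not the authors' implementation. The abstract does not describe the mechanism, so the sketch rests on a stated assumption: a prefix of the output is final as soon as the best paths ending in every current state agree on it, and it can then be emitted and forgotten. The function name `online_viterbi`, the log-probability tables, and the example model are all hypothetical, and the survivor paths are copied as plain lists for clarity rather than stored in a compact structure.

```python
import math


def online_viterbi(observations, log_init, log_trans, log_emit):
    """Yield Viterbi states as soon as they are final (illustrative sketch).

    log_init[i], log_trans[i][j] and log_emit[i][x] are log-probabilities.
    Only the undecided suffix of each survivor path is kept in memory.
    """
    m = len(log_init)
    score = None   # score[j]: best log-probability of a path ending in state j
    paths = None   # paths[j]: not-yet-emitted part of that best path
    for x in observations:
        if score is None:
            score = [log_init[j] + log_emit[j][x] for j in range(m)]
            paths = [[j] for j in range(m)]
        else:
            new_score, new_paths = [], []
            for j in range(m):
                # Best predecessor of state j (standard Viterbi recurrence).
                best = max(range(m), key=lambda i: score[i] + log_trans[i][j])
                new_score.append(score[best] + log_trans[best][j] + log_emit[j][x])
                new_paths.append(paths[best] + [j])
            score, paths = new_score, new_paths
        # Every survivor path has the same length; their common prefix
        # cannot change any more, so emit it and drop it from memory.
        k = 0
        while k < len(paths[0]) and all(p[k] == paths[0][k] for p in paths):
            k += 1
        if k:
            yield from paths[0][:k]
            paths = [p[k:] for p in paths]
    if score is not None:
        # End of input: flush the remainder of the overall best path.
        best_last = max(range(m), key=lambda j: score[j])
        yield from paths[best_last]


# Hypothetical two-state model over symbols {0, 1}, for illustration only.
log_init = [math.log(p) for p in (0.55, 0.45)]
log_trans = [[math.log(p) for p in row] for row in ((0.71, 0.29), (0.38, 0.62))]
log_emit = [[math.log(p) for p in row] for row in ((0.88, 0.12), (0.17, 0.83))]
example_obs = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0]
decoded = list(online_viterbi(example_obs, log_init, log_trans, log_emit))
```

Because the function is a generator, a caller can consume decoded states while the input is still arriving, which is the setting of continuous data streams mentioned above. Memory use depends on how long the survivor paths stay in disagreement, and this sketch makes no claim about matching the $\Theta(m\log n)$ bound.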

journal: Algorithms in Bioinformatics: 7th International Workshop (WABI), volume 4645 of Lecture Notes in Computer Science, pp. 240-251, Philadelphia, PA, USA, September 2007. Springer.

category: cs.DS
