Conference topic

Finding Longest Increasing and Common Subsequences in Streaming Data

We present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data-streaming model. To decide if the LIS of a given stream of elements drawn from an alphabet Σ has length at least k, we discuss a one-pass algorithm using O(k log |Σ|) space, with update time either O(log k) or O(log log |Σ|); for |Σ| = O(1), we can achieve O(log k) space and constant-time updates. We also prove a lower bound of Ω(k) on the space requirement for this problem for general alphabets Σ, even when the input stream is a permutation of Σ. For finding the actual LIS, we give a log(1 + 1/ε)-pass algorithm using O(k^(1+ε) log |Σ|) space, for any ε > 0. For LCS, there is a trivial Θ(1)-approximate O(log n)-space streaming algorithm when |Σ| = O(1). For general alphabets Σ, the problem is much harder. We prove several lower bounds on the LCS problem, of which the strongest is the following: it is necessary to use Ω(n/ρ^2) space to approximate the LCS of two n-element streams to within a factor of ρ, even if the streams are permutations of each other.
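The one-pass decision problem in the abstract (is the LIS length at least k?) can be illustrated with the standard "smallest tail per length" technique, often called patience sorting. The sketch below is a minimal illustration under that assumption and is not taken from the paper itself; the function name `lis_at_least_k` and the use of integer elements are hypothetical choices. It stores at most k elements (roughly O(k log |Σ|) bits) and spends O(log k) time per update on a binary search.

```python
from bisect import bisect_left
from typing import Iterable, List


def lis_at_least_k(stream: Iterable[int], k: int) -> bool:
    """Decide in one pass whether the stream has a strictly
    increasing subsequence of length at least k.

    Illustrative sketch only (standard technique, hypothetical name).
    """
    if k <= 0:
        return True
    # tails[i] is the smallest possible last element of a strictly
    # increasing subsequence of length i + 1 seen so far.
    # The list stays sorted and never holds more than k entries.
    tails: List[int] = []
    for x in stream:
        i = bisect_left(tails, x)  # O(log k) binary search
        if i == len(tails):
            tails.append(x)        # x extends the longest subsequence
            if len(tails) >= k:
                return True        # stop early: the answer is known
        else:
            tails[i] = x           # x is a smaller tail for length i + 1
    return False
```

For example, `lis_at_least_k([3, 1, 4, 1, 5, 9, 2, 6], 4)` holds because 1, 4, 5, 9 is increasing, while the same stream has no increasing subsequence of length 5. Stopping as soon as `tails` reaches k entries is what keeps the stored state bounded by k regardless of the stream length.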

David Liben-Nowell, Erik Vee, An Zhu

Department of Mathematics and Computer Science, Carleton College; IBM Almaden Research Center; Google, Inc.

International conference

The 11th Annual International Computing and Combinatorics Conference (COCOON 2005)

Kunming

English

263-272

2005-08-01 (date first made available online on the Wanfang platform; not the paper's publication date)