Abstract
Given a size-n input text T and its suffix array, a new method is proposed to compute the K-order longest common prefix (LCP) array for T, in terms of that the maximum LCP of two suffixes is truncated to be at most K. This method employs a fingerprint function to convert a comparison of two variable-length strings into a comparison of their fingerprints encoded as fixed-size integers. This method takes O (n log K) time and O (n) space on internal and external memory models. It is also scalable for a typical distributed model consisting of d computing nodes, where the time and space complexities are evenly divided onto each node as O (n log K/d) and O (n/d), respectively. For performance evaluation, an experimental study has been conducted on both external memory and distributed models. From our perspective, a cluster of computers in a local area network is commonly available in practice, but there is currently a lack of scalable LCP-array construction algorithm for such a distributed model. Our method provides a candidate solution to meet this demand. Copyright © 2017 Springer Nature Singapore Pte Ltd.
Original language | English |
---|---|
Title of host publication | Parallel architecture, algorithm and programming: 8th International Symposium, PAAP 2017, Haikou, China, June 17-18, 2017, Proceedings |
Editors | Guoliang CHEN, Hong SHEN , Mingrui CHEN |
Place of Publication | Singapore |
Publisher | Springer |
Pages | 579-593 |
ISBN (Electronic) | 9789811064425 |
ISBN (Print) | 9789811064418 |
DOIs | |
Publication status | Published - 2017 |
Citation
Wu, Y., Han, L. B., Chan, W. H., & Nong, G. (2017). Scalable K-order LCP array construction for massive data. In G. Chen, H. Shen, & M. Chen (Eds.), Parallel architecture, algorithm and programming: 8th International Symposium, PAAP 2017, Haikou, China, June 17-18, 2017, Proceedings (pp. 579-593). Singapore: Springer.Keywords
- Fingerprint function
- K-order LCP-array construction
- External memory and distributed models