Scalable K-order LCP array construction for massive data

Yi WU, Ling Bo HAN, Wai Hong CHAN, Ge NONG

Research output: Chapter in Book/Report/Conference proceedingChapters

Abstract

Given a size-n input text T and its suffix array, a new method is proposed to compute the K-order longest common prefix (LCP) array for T, in terms of that the maximum LCP of two suffixes is truncated to be at most K. This method employs a fingerprint function to convert a comparison of two variable-length strings into a comparison of their fingerprints encoded as fixed-size integers. This method takes O (n log K) time and O (n) space on internal and external memory models. It is also scalable for a typical distributed model consisting of d computing nodes, where the time and space complexities are evenly divided onto each node as O (n log K/d) and O (n/d), respectively. For performance evaluation, an experimental study has been conducted on both external memory and distributed models. From our perspective, a cluster of computers in a local area network is commonly available in practice, but there is currently a lack of scalable LCP-array construction algorithm for such a distributed model. Our method provides a candidate solution to meet this demand. Copyright © 2017 Springer Nature Singapore Pte Ltd.
Original languageEnglish
Title of host publicationParallel architecture, algorithm and programming: 8th International Symposium, PAAP 2017, Haikou, China, June 17-18, 2017, Proceedings
EditorsGuoliang CHEN, Hong SHEN , Mingrui CHEN
Place of PublicationSingapore
PublisherSpringer
Pages579-593
ISBN (Electronic)9789811064425
ISBN (Print)9789811064418
DOIs
Publication statusPublished - 2017

Citation

Wu, Y., Han, L. B., Chan, W. H., & Nong, G. (2017). Scalable K-order LCP array construction for massive data. In G. Chen, H. Shen, & M. Chen (Eds.), Parallel architecture, algorithm and programming: 8th International Symposium, PAAP 2017, Haikou, China, June 17-18, 2017, Proceedings (pp. 579-593). Singapore: Springer.

Keywords

  • Fingerprint function
  • K-order LCP-array construction
  • External memory and distributed models

Fingerprint

Dive into the research topics of 'Scalable K-order LCP array construction for massive data'. Together they form a unique fingerprint.