Linear suffix array construction by almost pure induced-sorting

Ge NONG, Sen ZHANG, Wai Hong CHAN

Research output: Chapter in Book/Report/Conference proceedingChapters

45 Citations (Scopus)

Abstract

We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm. The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm) used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm almost purely relying on induced sorting at both its crucial steps. The pure induced-sorting renders the algorithm an elegant design and in turn a surprisingly compact implementation which consists of less than 100 lines of C code. The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction. Copyright © 2009 IEEE.

Original languageEnglish
Title of host publicationProceedings: 2009 Data Compression Conference, DCC 2009
EditorsJames A. STORER, Michael W. MARCELLIN
Place of PublicationLos Alamitos
PublisherIEEE Computer Society
Pages193-202
ISBN (Print)9780769535920
DOIs
Publication statusPublished - 2009

Citation

Nong, G., Zhang, S., & Chan, W. H. (2009). Linear suffix array construction by almost pure induced-sorting. In J. A. Storer & M. W. Marcellin (Eds.), Proceedings: 2009 Data Compression Conference, DCC 2009 (pp. 193-202). Los Alamitos: IEEE Computer Society.

Keywords

  • Algorithm design
  • Suffix array
  • Linear time
  • Divide-and-conquer

Fingerprint Dive into the research topics of 'Linear suffix array construction by almost pure induced-sorting'. Together they form a unique fingerprint.