We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm. The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm) used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm almost purely relying on induced sorting at both its crucial steps. The pure induced-sorting renders the algorithm an elegant design and in turn a surprisingly compact implementation which consists of less than 100 lines of C code. The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction. Copyright © 2009 IEEE.
|Title of host publication||Proceedings: 2009 Data Compression Conference, DCC 2009|
|Editors||James A. STORER, Michael W. MARCELLIN|
|Place of Publication||Los Alamitos|
|Publisher||IEEE Computer Society|
|Publication status||Published - 2009|
CitationNong, G., Zhang, S., & Chan, W. H. (2009). Linear suffix array construction by almost pure induced-sorting. In J. A. Storer & M. W. Marcellin (Eds.), Proceedings: 2009 Data Compression Conference, DCC 2009 (pp. 193-202). Los Alamitos: IEEE Computer Society.
- Algorithm design
- Suffix array
- Linear time