Fast induced sorting suffixes on a multicore machine

Bin LAO, Ge NONG, Wai Hong CHAN, Yi PAN

Research output: Contribution to journalArticle

1 Citation (Scopus)

Abstract

Sorting the suffixes of an input string is a fundamental task in many applications such as data compression, genome alignment, and full-text search. The induced sorting (IS) method has been successfully applied to design a number of state-of-the-art suffix sorting algorithms. In particular, the SAIS algorithm designed by the IS method is not only linear in time but also fast in practice. However, the parallelization of SAIS remains a challenge due to that the IS process in the algorithm is inherently sequential and the performance bottleneck of the whole algorithm. This article presents our attempt for designing a parallel variant of SAIS on a multicore machine which is considered as a shared memory parallel model, called pSAIS. By a combined use of multithreading and pipelining, the inducing process is accelerated by fully utilizing the machine's parallel computing power. An experimental study is conducted for performance evaluation of pSAIS with the other existing parallel suffix sorting algorithms, on a set of realistic inputs with varying sizes and alphabets. The experiment results show that our program for pSAIS has a high degree of parallelism and achieves the best average time and space performance among all the parallel algorithms in comparison. While pSAIS is designed for quickly building big suffix arrays in a multicore machine, our study may give some hints for extending the induced sorting method to GPU for constructing small suffix arrays even faster. Copyright © 2018 Springer Science+Business Media, LLC, part of Springer Nature.
Original languageEnglish
Pages (from-to)3468-3485
JournalJournal of Supercomputing
Volume74
Issue number7
Early online dateMay 2018
DOIs
Publication statusPublished - Jul 2018

Fingerprint

Suffix
Sorting
Suffix Array
Sorting algorithm
Pipelining
Multithreading
Data Compression
Time-average
Parallel Computing
Shared Memory
Parallelization
Parallel Algorithms
Parallelism
Performance Evaluation
Experimental Study
Genome
Alignment
Data compression
Strings
Parallel processing systems

Citation

Lao, B., Nong, G., Chan, W. H., & Pan, Y. (2018). Fast induced sorting suffixes on a multicore machine. The Journal of Supercomputing, 74(7), 3468-3485. doi: 10.1007/s11227-018-2395-5

Keywords

  • Suffix array
  • Induced sorting
  • Parallel algorithm
  • Multicore