Abstract
Sorting the suffixes of an input string is a fundamental task in many applications such as data compression, genome alignment, and full-text search. The induced sorting (IS) method has been successfully applied to design a number of state-of-the-art suffix sorting algorithms. In particular, the SAIS algorithm designed by the IS method is not only linear in time but also fast in practice. However, the parallelization of SAIS remains a challenge due to that the IS process in the algorithm is inherently sequential and the performance bottleneck of the whole algorithm. This article presents our attempt for designing a parallel variant of SAIS on a multicore machine which is considered as a shared memory parallel model, called pSAIS. By a combined use of multithreading and pipelining, the inducing process is accelerated by fully utilizing the machine's parallel computing power. An experimental study is conducted for performance evaluation of pSAIS with the other existing parallel suffix sorting algorithms, on a set of realistic inputs with varying sizes and alphabets. The experiment results show that our program for pSAIS has a high degree of parallelism and achieves the best average time and space performance among all the parallel algorithms in comparison. While pSAIS is designed for quickly building big suffix arrays in a multicore machine, our study may give some hints for extending the induced sorting method to GPU for constructing small suffix arrays even faster. Copyright © 2018 Springer Science+Business Media, LLC, part of Springer Nature.
Original language | English |
---|---|
Pages (from-to) | 3468-3485 |
Journal | Journal of Supercomputing |
Volume | 74 |
Issue number | 7 |
Early online date | May 2018 |
DOIs | |
Publication status | Published - Jul 2018 |
Citation
Lao, B., Nong, G., Chan, W. H., & Pan, Y. (2018). Fast induced sorting suffixes on a multicore machine. The Journal of Supercomputing, 74(7), 3468-3485. doi: 10.1007/s11227-018-2395-5Keywords
- Suffix array
- Induced sorting
- Parallel algorithm
- Multicore