Bit-parallelism encodes calculated values into bit arrays. Thus,
multiple values can be updated simultaneously by a single
operation. This technique can be used to develop efficient algorithms
for string comparison problems (e.g. the approximate pattern matching
problem, the longest common substring problem).
An important parameter which has direct impact on the
efficiency of bit-parallel algorithms is the machine word size w
(e.g. w = 32 or 64 bits for conventional CPUs). On modern high
performance computing architectures, such as Intel Xeon Phi
coprocessors or CUDA-enabled GPUs, it is possible to efficiently
simulate 512-bit or 1024-bit words which could contribute to the
implementations of bit-parallel algorithms. In addition, the massive
computation resource of these types of accelerator allows the
development of powerful tools to process large text.
This project aims to provide bit-parallel string comparison tools
on modern accelerators, as well as on conventional multi-core
CPUs. Fundamentally, these tools are specified for biological data
(i.e. DNA, RNA sequences), however, they could also be extended to
process other types of text data.
Our CUDA-based implementation on a GeForce Titan graphics card
runs up to 2.9 times faster than our implementation on a Xeon
Phi 5110P. The results are presented in a journal paper which
has been submitted to Journal of Parallel Computing.
XBitPar version 2.0 consists of 5 implementations of the
Wu-Manber algorithm, which allow performing approximate pattern
matching with the Levenshtein distance.
Both GPU implementations simulate 1024-bit words by CUDA warps.
- wmCudaTile: CUDA-based version with the tiled parallelization
- wmCudaNaive: CUDA-based version which uses the
global memory of the GPU as the main working memory.
Xeon Phi implementations
Multicore CPU implementations
- wmIntr: Xeon Phi multi-threaded version, which simulates 512-bit words with Xeon Phi
- wmAutoVec: Xeon Phi multi-threaded version, which
simulates 512-bit words with arrays of 16 unsigned integers.
These implementations are specified for DNA sequences (i.e : the
alphabet consists of 4 characters A, C, G, T).
- wmHost: CPU multi-threaded version, which simulates 512-bit
words with arrays of 16 unsigned integers.
Latest source code (v2.0)
- Tran, T, Schindel, S, Liu, Y, and Schmidt, B: "Bit-Parallel approximate pattern matching on the Xeon Phi coprocessor".
26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2014). pp. 81-88.
- Tran, T, Liu, Y, and Schmidt, B: "Bit-Parallel approximate pattern matching: Kepler GPU versus Xeon Phi".
Parallel Computing. 2015, under review.
- 15 August 2015: Second version of XBitPar (2.0) and Information Updated
- 16 January 2015: Information Updated
- 05 August 2014: First version of XBitPar (1.0)