Open Access Open Access  Restricted Access Subscription Access

A WORK DISTRIBUTION STRATEGY FOR GLOBAL SEQUENCE ALIGNMENT

Kailash Kalare, Jitendra Tembhurne

Abstract


The sequence alignment comprises to identify similarities and dissimilarities between two given sequences. In this paper, we propose a work distribution strategy for the implementation of DNA global sequence alignment. The main objective of this work is to minimize the execution time required for DNA global alignment of large biological sequences. The proposed approach dealt the issues with the memory optimizations and minimization of execution time. We considered the biological sequences of different size to fit into the global memory of the system. The proposed strategy is implemented in shared memory architecture using OpenMP programming for large biological sequences. Parallelization using OpenMP directive is relatively easy and execute the code fast. We experimented on the Dell Precision Tower 7910 with Intel Xeon processor with 32GB RAM and 28 CPU cores. The efficient use of global memory and cache memory optimization dominate the results of execution time. The results demonstrate the significantly high speed up using OpenMP as compared with other implementations.

Keywords


Global alignment; N-W algorithm; Global memory; Optimization.

Full Text:

PDF

References


M. Rosenberg, Sequence Alignment, Methods, Concepts and Strategies, University of California Press, 2011.

K. Sharma, Bioinformatics: Sequence Alignment and Markov Models, McGraw-Hill, 2008.

Blast Library, 2018, [Online]. Available: http://blast.ncbi.nlm.nih.gov/Blast.cgi.

FASTA Tool, 2018, [Online]. Available: http://www.ebi.ac.uk/Tools/fasta33/index.html.

C. Chen, B. Schmidt, “An adaptive grid implementation of DNA sequence alignment,” Journal of Future Generation Computer Systems, vol. 21, issue 7, pp. 988-1003, 2005.

M. Randic, J. Zupan, D. Vikic and D. Plavsic, “A novel unexpected use of a graphical representation of DNA: graphical alignment of DNA sequences,” Journal of Chemical Physics Letters, vol. 431, isuess (4-6), pp. 375-379, 2006.

L. Liu, C. Li, F. Bai, Q. Zhao and Y. Wang, “An optimization approach and its application to compare DNA sequences,” Journal of Molecular Structure, vol. 1082, pp. 49-55, 2015.

Y. Xin, B. Liu, B. Min, W. Li, R.C.C. Cheung, A.S. Fong and T.F. Chan, “Parallel architecture for DNA sequence inexact matching with Burrows-Wheeler transform,” Microelectronics Journal, vol. 44, issue 8, pp. 670-682, 2013.

F. Saeed, A.P. Rathke, J. Gwarnicki, T.B. Wolf, A. Khokhar, “A high performance multiple sequence alignment system for pyrosequencing reads from multiple reference genomes,” Journal of Parallel Distributed Computing, vol. 72, issue 1, pp. 83-93, 2012.

D.D. Shrimankar and S.R. Sathe, “Analysis of parallel algorithms on SMP node and cluster of workstations using parallel programming models with new tile-based method for large biological datasets,” Bioinformatics and Biology Insights, vol. 10, pp. 255-265, 2016.

C. Li, Z. Ji, F. Gu, “Efficient parallel design for BWT-based DNA sequences data multi-compression algorithm,” Proceedings of International Conference on Automatic Control and Artificial Intelligence, Xiamen, China, 3-5 March, 2012, pp. 967-970.

S. Memeti and S. Pllana, “A machine learning approach for accelerating DNA sequence analysis,” International Journal of High Performance Computing Applications, vol. 32, issue 3, pp. 363-379, 2016.

Q. Xue, J. Xie, J. Shu, H. Zhang, D. Dai, X. Wu, W. Zhang, “A parallel algorithm for DNA sequences alignment based on MPI,” Proceedings of International Conference on Information Science, Electronics and Electrical Engineering, Sapporo, Japan, 26-28 April, 2014, pp. 786-789.

L. Feuser and N. Moreano, “Parallel solutions to the k-difference primer problem,” Proceedings of the International Conference on Computational Science, ICCS’2018, Lecture Notes in Computer Science, Vol. 10860, Springer, Cham, 2018.

S.B. Needleman, C.D. Wunch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, vol. 48, pp. 443-453, 1970.

T.F. Smith, M.S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, vol. 147, pp. 195-197, 1981.

V. Bharadwaj, W.H. Min, “Handling biological sequence alignment on networked computing systems,” Journal of Parallel Distributed Computing, vol. 69, issue 10, pp. 854-865, 2009.

T. Almeida, N. Roma, “A parallel programming framework for multi-core DNA sequence alignment,” Proceedings of International Conference on Complex, Intelligent and Software Intensive Systems, Krakow, Poland, 15-18 February, 2010, pp. 907-912.

E. Ruccii, A. De Giusti, F. Chichizola, M. Naiouf, L. De Giustil, “DNA sequence alignment: hybrid parallel programming on a multicore cluster,” Proceedings of the International Conference on Computers, Digital Communications and Computing, Recent Advances in Computers, Communications, Applied Social Science and Mathematics, Barcelona, Spain, 2011, pp 183-190.

B. Chapman, G. Jost and R.V.D. Pas, Using OpenMP: Portable Shared Memory Parallel Programming, MIT Press, 2007.

A.A. Khan, L. Hassan, S. Ullah, “OpenMP based parallel and scalable genetic sequence alignment,” Journal of Engineering and Applied Science, vol. 34, issue 2, pp. 29-34, 2015.

P. Borovska and M. Lazarova, “Parallel models for sequence alignment on CPU and GPU,” Proceedings of International Conference on Computer Systems and Technologies – CompSysTech-2011, Vienna, Austria, June 16–17, 2011, pp. 210-215.

S.R. Sathe, D.D. Shrimankar, “Parallelization of DNA sequence alignment using OpenMP,” Proceedings of International Conference on Communication, Computing & Security, Rourkela, Odisha, India, February 12-14, 2011, pp. 200-203.

A. Chaibou1 and O. Sie, “Comparative study of the parallelization of the Smith-Waterman algorithm on OpenMP and Cuda C,” Journal of Computer and Communications, vol. 3, pp. 107-117, 2015.


Refbacks

  • There are currently no refbacks.