Abstract
This article presents massively parallel execution of the BLAST algorithm on supercomputers and HPC clusters using thousands of processors. Our work is based on the optimal splitting up the set of queries running with the non-modified NCBI-BLAST package for sequence alignment. The work distribution and search management have been implemented in Java using a PCJ (Parallel Computing in Java) library. The PCJ-BLAST package is responsible for reading sequence for comparison, splitting it up and start multiple NCBI-BLAST executables. We also investigated a problem of parallel I/O and thanks to PCJ library we deliver high throughput execution of BLAST. The presented results show that using Java and PCJ library we achieved very good performance and efficiency. In result, we have significantly reduced time required for sequence analysis. We have also proved that PCJ library can be used as an efficient tool for fast development of the scalable applications.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
http://www.ncbi.nlm.nih.gov/BLAST [Accessed: June 8, 2017].
- 2.
http://pcj.icm.edu.pl [Accessed: March 20, 2016].
References
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215(3), 403–410 (1990)
Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17), 3389–3402 (1997)
Braun, R.C., Pedretti, K.T., Casavant, T.L., Scheetz, T.E., Birkett, C.L., Roberts, C.A.: Parallelization of local BLAST service on workstation clusters. Future Gener. Comput. Syst. 17(6), 745–754 (2001)
Cofer, H.: SGI® High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE™ and SGI UV™ Systems. Np: Silicon Graphics International (2012)
Chi, E.H.H., Shoop, E., Carlis, J., Retzel, E., Riedl, J.: Efficiency of shared-memory multiprocessors for a genetic sequence similarity search algorithm. Technical report, University of Minnesota, CS Department, vol. TR97-05 (1997)
Darling, A., Carey, L., Feng, W.C.: The design, implementation, and evaluation of mpiBLAST. In: Proceedings of ClusterWorld Conference and Expo in Conjunction with the 4th International Conference on Linux Clusters: The HPC Revolution 2003, San Jose, CA, pp. 13–15 (2003)
Bjornson, R.D., Sherman, A.H., Weston, S.B., Willard, N., Wing, J.: TurboBLAST(r): a parallel implementation of BLAST built on the TurboHub. In: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 0183. IEEE (2002)
Mathog, D.R.: Parallel BLAST on split databases. Bioinformatics 19(14), 1865–1866 (2003)
Lin, H., Ma, X., Chandramohan, P., Geist, A., Samatova, N.: Efficient data access for parallel BLAST. In: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2005), Washington, DC, USA. IEEE Computer Society (2005)
Nowicki, M., Górski, Ł., Grabrczyk, P., Bała, P.: PCJ - Java library for high performance computing in PGAS model. In: International Conference on High Performance Computing and Simulation, HPCS 2014, pp. 202–209. IEEE (2014)
Numrich, R.W., Reid, J.: Co-array Fortran for parallel programming. ACM SIGPLAN Fortran Forum 17(2), 1–31 (1998). ACM
Carlson, W.W., Draper, J.M., Culler, D.E., Yelick, K., Brooks, E., Warren, K.: Introduction to UPC and language specification (Vol. 576). Technical report CCS-TR-99-157, IDA Center for Computing Sciences (1999)
Hilfinger, P., Bonachea, D., Datta, K., Gay, D., Graham, S., Liblit, B., Pike, G., Su, J., Yelick, K.: Titanium language reference manual. UC Berkeley Technical report, UCB/EECS-2005-15, Berkeley, California, USA (2005)
Acknowledgments
The authors would like to thank CHIST-ERA consortium for financial support under HPDCJ project (Polish part funded by NCN grant 2014/14/Z/ST6/00007) and NordForsk for the support within NIASC consortium. The performance tests have been performed using ICM University of Warsaw computational facilities.
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Nowicki, M., Bzhalava, D., Bała, P. (2017). Massively Parallel Sequence Alignment with BLAST Through Work Distribution Implemented Using PCJ Library. In: Ibrahim, S., Choo, KK., Yan, Z., Pedrycz, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2017. Lecture Notes in Computer Science(), vol 10393. Springer, Cham. https://doi.org/10.1007/978-3-319-65482-9_36
Download citation
DOI: https://doi.org/10.1007/978-3-319-65482-9_36
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65481-2
Online ISBN: 978-3-319-65482-9
eBook Packages: Computer ScienceComputer Science (R0)