Abstract
In general, compound identification through library searching is performed in the original mass spectral space using an established similarity measure. In this paper, the original mass spectral space is transformed into a binary space by random projection, and the Hamming distance between the binary vectors of the query and reference spectra is calculated. The Mass Spectral Library 2005 (NIST05) main library is used as the reference database, and the replicate library is used as the query data. The accuracy of compound identification increases with the number of binary digits. When the number is set to 2076 bits, random projection achieves better identification performance than the three corresponding similarity measures.
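The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the binary code is obtained by taking the sign of Gaussian random projections (sign random projection); the abstract does not state the exact projection scheme, so this choice, the helper names (`make_projection`, `to_binary_code`, `hamming`, `search`), and the toy spectra are all hypothetical and are not the authors' implementation.

```python
import random


def make_projection(n_bits, n_features, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed projection scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n_bits)]


def to_binary_code(spectrum, planes):
    """Encode an intensity vector as an int; each bit is the sign of one projection."""
    code = 0
    for plane in planes:
        dot = sum(w * x for w, x in zip(plane, spectrum))
        code = (code << 1) | (1 if dot >= 0.0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def search(query, library_codes, planes):
    """Return the library entry whose binary code is closest to the query's."""
    q = to_binary_code(query, planes)
    return min(library_codes, key=lambda name: hamming(q, library_codes[name]))


if __name__ == "__main__":
    # Toy spectra over 8 m/z bins; the real data would come from NIST05.
    reference = {
        "compound_A": [100, 0, 50, 0, 0, 10, 0, 0],
        "compound_B": [0, 80, 0, 0, 100, 0, 20, 0],
    }
    planes = make_projection(n_bits=256, n_features=8, seed=1)
    codes = {name: to_binary_code(s, planes) for name, s in reference.items()}
    noisy_query = [95, 2, 55, 0, 1, 12, 0, 0]  # perturbed copy of compound_A
    print(search(noisy_query, codes, planes))
```

In this sketch the reference codes are computed once and reused for every query, so each search reduces to XOR and bit counting. The number of bits is a free parameter here; the abstract reports results for 2076 bits.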

Acknowledgments
This work was supported by the National Natural Science Foundation of China under grant nos. 61271098 and 61032007, and by the Provincial Natural Science Research Program of Higher Education Institutions of Anhui Province under grant no. KJ2012A005.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Cao, LL., Zhang, ZS., Chen, P., Zhang, J. (2015). Compound Identification Using Random Projection for Gas Chromatography-Mass Spectrometry Data. In: Huang, DS., Han, K. (eds) Advanced Intelligent Computing Theories and Applications. ICIC 2015. Lecture Notes in Computer Science, vol 9227. Springer, Cham. https://doi.org/10.1007/978-3-319-22053-6_71
DOI: https://doi.org/10.1007/978-3-319-22053-6_71
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22052-9
Online ISBN: 978-3-319-22053-6
eBook Packages: Computer Science, Computer Science (R0)