Abstract
Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually because current algorithms perform unsatisfactorily. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing and speech recognition, but they have yet to be applied to patent classification. We propose DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent, with automatic feature extraction, achieved a classification precision of 83.98%, outperforming all existing algorithms that used the same information for training. It also outperformed the state-of-the-art patent classifier, which achieved a precision of 83.50% but relied on 4000 characters from the description section and extensive feature engineering, whereas DeepPatent used only the title and abstract. DeepPatent was further tested on USPTO-2M, a patent classification benchmark dataset that we contribute, containing 2,000,147 records in 637 categories at the subclass level, obtained by cleaning 2,679,443 raw US utility patent documents. On this dataset, DeepPatent achieved a precision of 73.88%.
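The abstract does not give DeepPatent's architecture in detail, so the following is only a minimal sketch of the general pattern it names: word vectors fed to convolution filters, followed by max-over-time pooling and a linear classifier. All sizes, names, and the randomly initialised (untrained) weights are assumptions for illustration and are not taken from the paper.

```python
import random

def embed(tokens, table, dim, rng):
    """Look up (or lazily create) a dense vector for each token."""
    vectors = []
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        vectors.append(table[tok])
    return vectors

def conv_max_pool(vectors, filters, width):
    """Slide each filter over windows of `width` consecutive word vectors,
    apply ReLU, and keep the maximum activation per filter."""
    features = []
    for filt in filters:  # each filter is a flat list of width * dim weights
        best = 0.0  # starting at 0 implements the ReLU floor
        for start in range(len(vectors) - width + 1):
            window = [x for vec in vectors[start:start + width] for x in vec]
            activation = sum(w * x for w, x in zip(filt, window))
            best = max(best, activation)
        features.append(best)
    return features

def classify(features, weights):
    """Score each class with a linear layer and return the top class index."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)

rng = random.Random(0)
DIM, WIDTH, N_FILTERS, N_CLASSES = 8, 3, 4, 5  # toy sizes, not from the paper
table = {}
filters = [[rng.uniform(-1, 1) for _ in range(WIDTH * DIM)]
           for _ in range(N_FILTERS)]
weights = [[rng.uniform(-1, 1) for _ in range(N_FILTERS)]
           for _ in range(N_CLASSES)]

# Hypothetical input standing in for a patent title and abstract.
text = "a deep learning algorithm for patent classification".split()
vectors = embed(text, table, DIM, rng)
features = conv_max_pool(vectors, filters, WIDTH)
predicted = classify(features, weights)
```

In a real system the embedding table and filter weights would be learned from labelled patents rather than drawn at random, and the output layer would cover the full set of subclass labels.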
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant Nos. 51475097, 91746116 and 51741101, the China Scholarship Council, and Science and Technology Foundation of Guizhou Province under Grant Nos. JZ[2014], Talents[2015]4011, and [2016]5013, and Collaborative Innovation[2015]02. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research (Grant No. 61640209).
Cite this article
Li, S., Hu, J., Cui, Y. et al. DeepPatent: patent classification with convolutional neural networks and word embedding. Scientometrics 117, 721–744 (2018). https://doi.org/10.1007/s11192-018-2905-5
DOI: https://doi.org/10.1007/s11192-018-2905-5