Support Vectors for Reinforcement Learning

Dietterich, Thomas G.; Wang, Xin

doi:10.1007/3-540-44795-4_51

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2167)

Included in the following conference series:

European Conference on Machine Learning

Abstract

Support vector machines introduced three important innovations to machine learning research: (a) the application of mathematical programming algorithms to solve optimization problems in machine learning, (b) the control of overfitting by maximizing the margin, and (c) the use of Mercer kernels to convert linear separators into non-linear decision boundaries in implicit spaces. Despite their attractiveness in classification and regression, support vector methods have not been applied to the problem of value function approximation in reinforcement learning. This paper presents three ways of combining linear programming with kernel methods to find value function approximations for reinforcement learning. One formulation is based on the standard approach to SVM regression; the second is based on the Bellman equation; and the third seeks only to ensure that good actions have an advantage over bad actions. All formulations attempt to minimize the norm of the weight vector while fitting the data, which corresponds to maximizing the margin in standard SVM classification. Experiments in a difficult, synthetic maze problem show that all three formulations give excellent performance. However, the third formulation is much more efficient to train and also converges more reliably. Unlike policy gradient and temporal difference methods, the kernel methods described here can easily adjust the complexity of the function approximator to fit the complexity of the value function.
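The abstract names the ingredients of the three formulations but does not state their constraints, so the following Python is only a rough sketch of the quantities such a linear program might constrain, not the authors' formulation. The function names, the Gaussian kernel, scalar states, the discount factor `gamma`, the unit margin, and the choice of the 1-norm (which keeps the objective linear) are all assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on scalar states; one common Mercer kernel (assumed here)."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def kernel_value(state, support_states, alphas, bias=0.0, sigma=1.0):
    """Kernel expansion V(s) = bias + sum_i alpha_i * K(s_i, s)."""
    return bias + sum(a * gaussian_kernel(s_i, state, sigma)
                      for s_i, a in zip(support_states, alphas))

def bellman_residual(state, reward, next_state, support_states, alphas, gamma=0.9):
    """V(s) - (r + gamma * V(s')) for one observed transition (hypothetical helper).

    A Bellman-style formulation would constrain this quantity, up to slack."""
    v = kernel_value(state, support_states, alphas)
    v_next = kernel_value(next_state, support_states, alphas)
    return v - (reward + gamma * v_next)

def advantage_slack(q_good, q_bad, margin=1.0):
    """Slack needed so that the good action beats the bad one by `margin`."""
    return max(0.0, margin - (q_good - q_bad))

def l1_norm(alphas):
    """1-norm of the weight vector; a linear objective once split into +/- parts."""
    return sum(abs(a) for a in alphas)
```

A solver would then search for the `alphas` that minimize the weight norm plus the total slack. With all weights at zero, for example, `bellman_residual(0.0, 1.0, 1.0, [0.0], [0.0])` equals minus the reward, so that constraint is violated by the full reward.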

Author information

Authors and Affiliations

Oregon State University, Corvallis, Oregon, USA
Thomas G. Dietterich & Xin Wang

Editor information

Editors and Affiliations

Department of Computer Science, Albert-Ludwigs University Freiburg, Georges Köhler-Allee, Geb. 079, 79110, Freiburg, Germany
Luc De Raedt
Department of Computer Science, University of Bristol, Merchant Ventures Bldg., Woodland Road, Bristol, BS8 1UB, UK
Peter Flach

About this paper

Cite this paper

Dietterich, T.G., Wang, X. (2001). Support Vectors for Reinforcement Learning. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. ECML 2001. Lecture Notes in Computer Science, vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_51

DOI: https://doi.org/10.1007/3-540-44795-4_51
Published: 30 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42536-6
Online ISBN: 978-3-540-44795-5
eBook Packages: Springer Book Archive
