Abstract
Within a group of cooperating agents, the decision making of an individual agent depends on the actions of the other agents. Much effort has gone into solving this problem under additional assumptions about the communication abilities of the agents. However, in some real-world applications communication is limited and these assumptions are rarely satisfied. A recently developed alternative is to employ a correlation device that correlates the agents’ behavior without any exchange of information during execution. In this paper, we apply a correlation device to large-scale and sparse-reward domains. As a basis we use the framework of infinite-horizon DEC-POMDPs, in which policies are represented as joint stochastic finite-state controllers. To solve a problem of this kind, a correlation device is first computed by solving Correlation Markov Decision Processes (Correlation-MDPs) and then used to improve the local controller of each agent. This method lets us trade off computational complexity against the quality of the approximation. In addition, we demonstrate that adversarial problems can be solved by encoding information about the opponents’ behavior in the correlation device. We have implemented the proposed method in our 2D simulated robot soccer team, and its performance in RoboCup-2006 was encouraging.
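As a rough illustration of how a correlation device can coordinate stochastic finite-state controllers without communication, the sketch below follows the common formulation in which every agent observes the same device signal c, picks an action from P(a | q, c), and moves to a new controller node from P(q' | q, c, a, o). The class names, the two-signal device, and the toy "left"/"right" actions are all assumptions made for this example; it does not reproduce the Correlation-MDP computation described in the abstract.

```python
import random


def sample(dist, rng):
    """Draw one key from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


class CorrelationDevice:
    """Finite-state signal source; its transitions ignore the environment."""

    def __init__(self, transitions, start):
        self.transitions = transitions  # c -> {c': probability}
        self.state = start

    def step(self, rng):
        self.state = sample(self.transitions[self.state], rng)
        return self.state


class LocalController:
    """Stochastic finite-state controller conditioned on the shared signal."""

    def __init__(self, action_probs, node_probs, start):
        self.action_probs = action_probs  # (q, c) -> {a: probability}
        self.node_probs = node_probs      # (q, c, a, o) -> {q': probability}
        self.node = start

    def act(self, signal, rng):
        return sample(self.action_probs[(self.node, signal)], rng)

    def update(self, signal, action, observation, rng):
        key = (self.node, signal, action, observation)
        self.node = sample(self.node_probs[key], rng)


def run(steps, seed=0):
    """Simulate two agents that coordinate only through the device signal."""
    rng = random.Random(seed)
    # Hypothetical device: the signal is redrawn uniformly at every step.
    device = CorrelationDevice(
        {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}, start=0
    )
    # Hypothetical one-node controllers: the signal alone selects the action.
    action_probs = {("q0", 0): {"left": 1.0}, ("q0", 1): {"right": 1.0}}
    node_probs = {
        ("q0", c, a, "none"): {"q0": 1.0}
        for c in (0, 1)
        for a in ("left", "right")
    }
    agents = [LocalController(action_probs, node_probs, "q0") for _ in range(2)]
    history = []
    for _ in range(steps):
        signal = device.step(rng)  # the same signal is visible to every agent
        actions = [agent.act(signal, rng) for agent in agents]
        for agent, action in zip(agents, actions):
            agent.update(signal, action, "none", rng)
        history.append((signal, tuple(actions)))
    return history
```

Calling `run(20)` returns a list of `(signal, (action_1, action_2))` pairs. In this toy setup both agents always choose the same action, even though the action itself is random from step to step and the agents never exchange a message.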
This work is supported by the NSFC 60275024 and the 973 programme 2003CB317000.
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Wu, F., Chen, X. (2008). Solving Large-Scale and Sparse-Reward DEC-POMDPs with Correlation-MDPs. In: Visser, U., Ribeiro, F., Ohashi, T., Dellaert, F. (eds) RoboCup 2007: Robot Soccer World Cup XI. RoboCup 2007. Lecture Notes in Computer Science, vol 5001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68847-1_18
DOI: https://doi.org/10.1007/978-3-540-68847-1_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68846-4
Online ISBN: 978-3-540-68847-1
eBook Packages: Computer Science, Computer Science (R0)