Multiagent reinforcement learning with spiking and non-spiking agents in the iterated prisoner's dilemma

This paper investigates Multiagent Reinforcement Learning (MARL) in a general-sum game where the payoff structure is such that the agents are required to exploit each other in a way that benefits all agents. The contradictory nature of these games makes their study in multiagent systems quite challenging. In particular, we investigate MARL with spiking and non-spiking agents in the Iterated Prisoner's Dilemma by exploring the conditions required to enhance its cooperative outcome. According to the results, the cooperative outcome is enhanced by: (i) a mixture of positive and negative payoff values and a high discount factor in the case of non-spiking agents and (ii) a longer eligibility trace time constant in the case of spiking agents. Moreover, it is shown that spiking and non-spiking agents behave similarly and can therefore be used equally well in any multiagent interaction setting. For training the spiking agents, a novel and necessary modification that enhances competition is applied to an existing learning rule based on stochastic synaptic transmission. © 2009 Springer Berlin Heidelberg.
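
To make the non-spiking setting concrete, the following is a minimal sketch of two tabular Q-learning agents playing the Iterated Prisoner's Dilemma. It is an illustration only and not the paper's implementation: the abstract does not name the learning algorithm, so Q-learning is swapped in as a common choice. The payoff values, the learning rate `alpha`, the exploration rate `epsilon`, the state encoding (the previous joint action) and the function names `choose` and `play` are all assumptions. The payoffs were picked only to mix positive and negative values while satisfying the standard dilemma conditions T > R > P > S and 2R > T + S, and `gamma` stands for the discount factor mentioned in the abstract.

```python
import random

# Hypothetical payoffs (not taken from the paper): a mix of positive and
# negative values with T > R > P > S and 2R > T + S.
T, R, P, S = 5.0, 4.0, -2.0, -3.0
C, D = 0, 1  # cooperate, defect

# Joint action -> (payoff to agent 1, payoff to agent 2).
PAYOFF = {(C, C): (R, R), (C, D): (S, T), (D, C): (T, S), (D, D): (P, P)}


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection; ties are broken towards cooperation."""
    if rng.random() < epsilon:
        return rng.choice((C, D))
    return C if q[state][C] >= q[state][D] else D


def play(rounds=10000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run two independent Q-learners; return both Q-tables and the
    fraction of rounds that ended in mutual cooperation."""
    rng = random.Random(seed)
    # Each agent's state is (own previous action, opponent's previous action).
    states = [(a, b) for a in (C, D) for b in (C, D)]
    q1 = {s: [0.0, 0.0] for s in states}
    q2 = {s: [0.0, 0.0] for s in states}
    s1 = s2 = (C, C)  # arbitrary initial state (assumption)
    mutual_coop = 0
    for _ in range(rounds):
        a1 = choose(q1, s1, epsilon, rng)
        a2 = choose(q2, s2, epsilon, rng)
        r1, r2 = PAYOFF[(a1, a2)]
        n1, n2 = (a1, a2), (a2, a1)  # next state from each agent's viewpoint
        # Standard one-step Q-learning update with discount factor gamma.
        q1[s1][a1] += alpha * (r1 + gamma * max(q1[n1]) - q1[s1][a1])
        q2[s2][a2] += alpha * (r2 + gamma * max(q2[n2]) - q2[s2][a2])
        s1, s2 = n1, n2
        mutual_coop += int(a1 == C and a2 == C)
    return q1, q2, mutual_coop / rounds
```

Calling `play(gamma=0.9)` and `play(gamma=0.1)` with otherwise identical arguments is one way to probe how the discount factor affects the mutual-cooperation rate in this toy setup; the sketch makes no claim about reproducing the paper's results.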
