Multiagent reinforcement learning with spiking and non-spiking agents in the iterated prisoner's dilemma
Date: 2009
ISSN: 0302-9743
Source: 19th International Conference on Artificial Neural Networks, ICANN 2009
Volume: 5768 LNCS
Issue: PART 1
Pages: 737-746
Abstract
This paper investigates Multiagent Reinforcement Learning (MARL) in a general-sum game whose payoff structure requires the agents to exploit each other in a way that benefits all agents. The contradictory nature of such games makes their study in multiagent systems quite challenging. In particular, we investigate MARL with spiking and non-spiking agents in the Iterated Prisoner's Dilemma by exploring the conditions required to enhance its cooperative outcome. According to the results, the cooperative outcome is enhanced by: (i) a mixture of positive and negative payoff values and a high discount factor in the case of non-spiking agents, and (ii) a longer eligibility trace time constant in the case of spiking agents. Moreover, it is shown that spiking and non-spiking agents behave similarly and can therefore be used equally well in any multiagent interaction setting. To train the spiking agents, a novel and necessary modification that enhances competition is applied to an existing learning rule based on stochastic synaptic transmission. © 2009 Springer Berlin Heidelberg.
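To make the non-spiking setting concrete, the following is a minimal sketch of two tabular Q-learning agents playing the Iterated Prisoner's Dilemma. It is not the paper's implementation: the payoff values, the state encoding (the previous joint action), the epsilon-greedy policy, and all names (`PAYOFF`, `run_ipd`, `cooperation_rate`) and parameter values are assumptions chosen only for illustration. The payoffs mix positive and negative values and the discount factor `gamma` is set high, mirroring the conditions named in the abstract.

```python
import random

# Hypothetical payoffs mixing positive and negative values (assumed, not from the paper).
# Key: (own_action, other_action); 0 = cooperate, 1 = defect.
# They satisfy the Prisoner's Dilemma ordering T > R > P > S and 2R > T + S.
PAYOFF = {(0, 0): 4, (0, 1): -3, (1, 0): 5, (1, 1): -2}


def run_ipd(rounds=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Play the iterated game with two independent Q-learners."""
    rng = random.Random(seed)
    # Each agent's state is the previous joint action seen from its own side.
    q = [{(s, a): 0.0 for s in PAYOFF for a in (0, 1)} for _ in range(2)]
    state = [(0, 0), (0, 0)]
    history = []
    for _ in range(rounds):
        # Epsilon-greedy action selection for both agents.
        actions = []
        for i in range(2):
            if rng.random() < epsilon:
                actions.append(rng.choice((0, 1)))
            else:
                q_coop = q[i][(state[i], 0)]
                q_defect = q[i][(state[i], 1)]
                actions.append(0 if q_coop >= q_defect else 1)
        # Standard Q-learning update for each agent.
        for i in range(2):
            own, other = actions[i], actions[1 - i]
            reward = PAYOFF[(own, other)]
            next_state = (own, other)
            best_next = max(q[i][(next_state, 0)], q[i][(next_state, 1)])
            td_error = reward + gamma * best_next - q[i][(state[i], own)]
            q[i][(state[i], own)] += alpha * td_error
            state[i] = next_state
        history.append(tuple(actions))
    return q, history


def cooperation_rate(history):
    """Fraction of rounds in which both agents cooperated."""
    return sum(1 for joint in history if joint == (0, 0)) / len(history)
```

Calling `run_ipd()` and passing the returned history to `cooperation_rate` gives a rough measure of how often mutual cooperation occurs. Varying `gamma` or the signs of the payoffs is one way to probe the kind of dependence the abstract describes, though this sketch makes no claim to reproduce the reported results.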