dc.contributor.author | Vassiliades, Vassilis | en |
dc.contributor.author | Cleanthous, A. | en |
dc.contributor.author | Christodoulou, Chris C. | en |
dc.creator | Vassiliades, Vassilis | en |
dc.creator | Cleanthous, A. | en |
dc.creator | Christodoulou, Chris C. | en |
dc.date.accessioned | 2019-11-13T10:42:57Z | |
dc.date.available | 2019-11-13T10:42:57Z | |
dc.date.issued | 2009 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | http://gnosis.library.ucy.ac.cy/handle/7/55135 | |
dc.description.abstract | This paper investigates Multiagent Reinforcement Learning (MARL) in a general-sum game where the payoffs' structure is such that the agents are required to exploit each other in a way that benefits all agents. The contradictory nature of these games makes their study in multiagent systems quite challenging. In particular, we investigate MARL with spiking and non-spiking agents in the Iterated Prisoner's Dilemma by exploring the conditions required to enhance its cooperative outcome. According to the results, this is enhanced by: (i) a mixture of positive and negative payoff values and a high discount factor in the case of non-spiking agents and (ii) a longer eligibility trace time constant in the case of spiking agents. Moreover, it is shown that spiking and non-spiking agents have similar behaviour and therefore they can equally well be used in any multiagent interaction setting. For training the spiking agents, a novel and necessary modification that enhances competition is introduced to an existing learning rule based on stochastic synaptic transmission. © 2009 Springer Berlin Heidelberg. | en |
dc.source | 19th International Conference on Artificial Neural Networks, ICANN 2009 | en |
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-70350582964&doi=10.1007%2f978-3-642-04274-4_76&partnerID=40&md5=5b93685289c75a4983eb6ec5afcf87a4 | |
dc.subject | Backpropagation | en |
dc.subject | Neural networks | en |
dc.subject | Time constants | en |
dc.subject | Reinforcement | en |
dc.subject | Reinforcement learning | en |
dc.subject | Multi-agent reinforcement learning | en |
dc.subject | Iterated prisoner's dilemma | en |
dc.subject | Discount factors | en |
dc.subject | Eligibility traces | en |
dc.subject | Multi-agent interaction | en |
dc.subject | Synaptic transmission | en |
dc.subject | Learning rules | en |
dc.title | Multiagent reinforcement learning with spiking and non-spiking agents in the iterated prisoner's dilemma | en |
dc.type | info:eu-repo/semantics/article | |
dc.identifier.doi | 10.1007/978-3-642-04274-4_76 | |
dc.description.volume | 5768 LNCS | en |
dc.description.issue | PART 1 | en |
dc.description.startingpage | 737 | |
dc.description.endingpage | 746 | |
dc.author.faculty | 002 Σχολή Θετικών και Εφαρμοσμένων Επιστημών / Faculty of Pure and Applied Sciences | |
dc.author.department | Τμήμα Πληροφορικής / Department of Computer Science | |
dc.type.uhtype | Article | en |
dc.description.notes | Conference code: 77563 | en |
dc.description.notes | Cited By: 4 | en |
dc.source.abbreviation | Lect. Notes Comput. Sci. | en |
dc.contributor.orcid | Christodoulou, Chris C. [0000-0001-9398-5256] | |
dc.gnosis.orcid | 0000-0001-9398-5256 | |