Offline Reinforcement Learning in World Of Tanks
Date: 2024-06
Publisher: Πανεπιστήμιο Κύπρου, Σχολή Θετικών και Εφαρμοσμένων Επιστημών / University of Cyprus, Faculty of Pure and Applied Sciences
Place of publication: Cyprus
Abstract
Offline reinforcement learning (RL) has emerged as a promising approach for training intelligent agents without requiring real-time interaction with an environment, addressing a key limitation of traditional RL. This capability is particularly valuable in domains where direct interaction is dangerous or impractical, such as healthcare, finance, and hazardous physical environments. By leveraging offline RL, it is possible to build autonomous agents that derive optimal policies from static datasets, facilitating automation across diverse decision-making domains.
This study explores the application of offline RL techniques to World of Tanks, an online multiplayer tank combat game. We evaluated several offline RL algorithms, including Conservative Q-Learning (CQL), Implicit Q-Learning (IQL), Decision Transformer (DT), and Deep Deterministic Policy Gradient (DDPG). Our results indicate that CQL and IQL achieved strong cumulative returns across a range of discount factors, demonstrating robustness and adaptability in the offline setting; higher discount factors in particular led to better cumulative returns for CQL and IQL. Effective handling of data distribution shift was crucial for algorithm robustness, with the conservative regularization in CQL and the modified architectures in IQL proving effective. In addition, the offline RL algorithms (IQL, CQL, and Decision Transformer) appear to perform better than the baselines (behavioral cloning and the policies that generated the dataset).
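To make the conservative regularization mentioned above concrete, the following is a minimal sketch of a CQL-style loss for a discrete action space in PyTorch. The names `q_net` and `alpha`, and the discrete-action assumption, are illustrative choices and are not taken from the thesis; the actual setup in the study may differ.

```python
import torch
import torch.nn.functional as F

def cql_loss(q_net, states, actions, rewards, next_states, dones,
             alpha=1.0, gamma=0.99):
    """Sketch of a Conservative Q-Learning objective: a standard TD error
    plus a penalty that pushes down Q-values of out-of-dataset actions
    relative to the Q-values of actions actually seen in the dataset.
    `alpha` (penalty weight) is a hypothetical hyperparameter."""
    # Standard Bellman (TD) target computed from logged transitions.
    with torch.no_grad():
        next_q = q_net(next_states).max(dim=1, keepdim=True).values
        target = rewards + gamma * (1.0 - dones) * next_q

    q_all = q_net(states)              # Q(s, .) for every discrete action
    q_data = q_all.gather(1, actions)  # Q(s, a) for the dataset actions
    td_loss = F.mse_loss(q_data, target)

    # Conservative penalty: log-sum-exp over all actions upper-bounds the
    # value of actions outside the data; subtracting the dataset action's
    # value keeps in-distribution actions relatively high.
    conservative = (torch.logsumexp(q_all, dim=1, keepdim=True) - q_data).mean()
    return td_loss + alpha * conservative
```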
The volume of training data significantly influenced the performance of the offline RL algorithms, with larger datasets improving learning effectiveness. However, evaluating offline RL policies remains challenging because there is no real-time interaction with the environment. We therefore employed evaluation methods such as model-based dynamics estimation and policy value estimation, despite their limitations in accurately predicting real-world performance. This study contributes to the methodology of offline RL research and suggests directions for future advancements.
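As an illustration of model-based policy value estimation, the sketch below rolls a learned policy out inside a learned dynamics model and averages the discounted returns over a set of start states. The `policy`, `dynamics_model`, and `reward_model` interfaces are hypothetical placeholders for illustration, not the thesis's actual implementation.

```python
import numpy as np

def estimate_policy_value(policy, dynamics_model, reward_model, start_states,
                          horizon=50, gamma=0.99):
    """Model-based off-policy evaluation sketch: simulate the evaluated
    policy inside learned dynamics/reward models and return the mean
    discounted return. All callables here are assumed interfaces."""
    returns = []
    for s in start_states:
        total, discount = 0.0, 1.0
        for _ in range(horizon):
            a = policy(s)                       # action chosen by the evaluated policy
            total += discount * reward_model(s, a)
            s = dynamics_model(s, a)            # predicted next state
            discount *= gamma
        returns.append(total)
    return float(np.mean(returns))
```

Estimates produced this way inherit any bias in the learned dynamics and reward models, which is the limitation noted above when using them as a stand-in for real-environment evaluation.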