Experience in Using the Transformer Network Architecture to Approximate Agent’s Policy in Reinforcement Learning

 
Abstract

This paper reviews the basics of deep reinforcement learning and the use of neural networks to approximate the agent’s policy. It compares a fully connected neural network with a transformer network as the policy approximator in a reinforcement learning algorithm.

General Information

Keywords: artificial intelligence, machine learning, deep reinforcement learning, Markov decision processes, transformer, optimization

Journal rubric: Data Analysis

Article type: scientific article

DOI: https://doi.org/10.17759/mda.2024140201

Received 03.06.2024

For citation: Novikov, N.P., Vinogradov, V.I. (2024). Experience in Using the Transformer Network Architecture to Approximate Agent’s Policy in Reinforcement Learning. Modelling and Data Analysis, 14(2), 7–22. (In Russ.). https://doi.org/10.17759/mda.2024140201

© Novikov N.P., Vinogradov V.I., 2024

License: CC BY-NC 4.0

References

  1. An outline of reinforcement learning // arXiv URL: https://arxiv.org/pdf/2201.09746.pdf (accessed: 02.01.2024).
  2. Proximal Policy Optimization Algorithms // arXiv URL: https://arxiv.org/pdf/1707.06347.pdf (accessed: 24.12.2023).
  3. Attention Is All You Need // arXiv URL: https://arxiv.org/abs/1706.03762 (accessed: 16.12.2023).
  4. Gymnasium // URL: https://gymnasium.farama.org/ (accessed: 10.12.2023).
  5. Stable Baselines Documentation // URL: https://buildmedia.readthedocs.org/media/pdf/stable-baselines/master/stable-baselines.pdf (accessed: 08.12.2023).
  6. High-Dimensional Continuous Control Using Generalized Advantage Estimation // arXiv URL: https://arxiv.org/pdf/1506.02438.pdf (accessed: 07.12.2023).

Information About the Authors

Nikita P. Novikov, master's student, Institute of Computer Science and Applied Mathematics, Moscow Aviation Institute (National Research University) (MAI), Moscow, Russian Federation, e-mail: rtyderson@gmail.com

Vladimir I. Vinogradov, Candidate of Science (Physics and Mathematics), Associate Professor, Department of Mathematical Cybernetics, Moscow Aviation Institute (National Research University), Moscow, Russian Federation, ORCID: https://orcid.org/0000-0003-3773-9653, e-mail: vvinogradov@inbox.ru
