DEVELOPMENT OF AN ADAPTIVE UAV CONTROL METHOD BASED ON DEEP REINFORCEMENT LEARNING AND DOMAIN RANDOMIZATION
DOI:
https://doi.org/10.36910/775.24153966.2025.84.39
Keywords:
UAV, reinforcement learning, PPO, adaptive control, domain randomization, flight dynamics
Abstract
The paper addresses the problem of designing an adaptive control system for an unmanned aerial vehicle (UAV) that operates under stochastic uncertainty. It proposes an approach based on the Proximal Policy Optimization (PPO) algorithm combined with domain randomization of the environment parameters. A neural network structure and a reward function are developed to balance navigation accuracy against energy efficiency. The resulting controller adapts to payload mass changes (up to 20%) and to external wind disturbances without manual retuning of coefficients. Comparative simulation results confirm the advantage of the proposed method over a classical PID controller: stabilization time is reduced by a factor of 1.5, and the maximum deviation under disturbances is reduced by 40%.
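The abstract does not give the randomization ranges or the form of the reward function, so the following is only a minimal sketch of how the two ideas might fit together. It assumes that mass is perturbed symmetrically by up to 20% around a nominal value, that wind is a constant vector drawn once per episode, and that the reward is a weighted sum of tracking error and squared control effort. All names (EpisodeParams, sample_episode_params, reward) and all numeric defaults are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    mass_kg: float   # total vehicle mass used for this episode
    wind_mps: tuple  # constant wind vector (x, y, z) in m/s


def sample_episode_params(rng, nominal_mass_kg=1.0, mass_spread=0.20,
                          max_wind_mps=5.0):
    """Draw one randomized parameter set (ranges are assumptions)."""
    # Scale the nominal mass by a factor in [1 - spread, 1 + spread].
    mass = nominal_mass_kg * (1.0 + rng.uniform(-mass_spread, mass_spread))
    # Draw each wind component independently from a symmetric range.
    wind = tuple(rng.uniform(-max_wind_mps, max_wind_mps) for _ in range(3))
    return EpisodeParams(mass_kg=mass, wind_mps=wind)


def reward(position_error_m, thrust_commands, w_track=1.0, w_energy=0.01):
    """Penalize tracking error and control effort (weights are illustrative)."""
    energy = sum(u * u for u in thrust_commands)
    return -(w_track * position_error_m + w_energy * energy)


# Resample the parameters at the start of every training episode so the
# policy never sees one fixed set of dynamics.
rng = random.Random(0)
params = [sample_episode_params(rng) for _ in range(1000)]
```

In a training loop, one such parameter set would be drawn at every environment reset, before the PPO rollout for that episode is collected. The ratio of w_track to w_energy is the knob that trades navigation accuracy against energy use.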