Comparison of Traditional Optimal Control Methodologies to Reinforcement Learning Agents in Quadcopter Attitude Control Applications

Stephen Kleppinger

doi:https://doi.org/10.21985/n2-k4zw-5193

Attitude control in quadrotor Unmanned Aerial Vehicle (UAV) systems is traditionally managed by optimal control loops tuned to minimize performance error. While robust, these loops perform sub-optimally in dynamic and unpredictable environments, which has inspired new interest in more sophisticated approaches such as reinforcement learning (RL) that should be able to provide control across a wider range of use cases. Proximal Policy Optimization (PPO)-trained RL agents were compared against their Proportional-Integral-Derivative (PID) controlled counterparts in order to determine why this statistically driven approach could outperform tuned controllers in noisy flight conditions. It is shown that PID controllers perform significantly better in stable conditions, but RL agents were able to restore UAV stability more quickly in certain harsh environments. To investigate these differences, a simulated environment was built to train the RL agents and to assess whether RL-based attitude control is a viable solution for future quadcopter attitude control.
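For readers unfamiliar with the PID baseline named in the abstract, the following is a minimal sketch of a single-axis PID attitude loop. It is not the controller or simulator used in this work: the rigid-body roll model, the gains, the inertia value, the time step, and the names `PID` and `simulate_roll` are all illustrative assumptions.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook PID controller (gains here are illustrative, not from the work)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        # Accumulate the integral term.
        self.integral += error * dt
        # Skip the derivative on the first sample to avoid a startup kick.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_roll(pid: PID, angle0: float = 0.3, target: float = 0.0,
                  steps: int = 2000, dt: float = 0.005,
                  inertia: float = 0.01, noise_std: float = 0.0,
                  seed: int = 0) -> float:
    """Simulate one roll axis as a pure rotational inertia (assumed model).

    noise_std adds a random disturbance torque at each step, as a crude
    stand-in for noisy flight conditions. Returns the final roll angle (rad).
    """
    rng = random.Random(seed)
    angle, rate = angle0, 0.0
    for _ in range(steps):
        torque = pid.update(target - angle, dt)
        torque += rng.gauss(0.0, noise_std)
        # Semi-implicit Euler integration of the rotational dynamics.
        rate += (torque / inertia) * dt
        angle += rate * dt
    return angle


if __name__ == "__main__":
    calm = simulate_roll(PID(kp=0.5, ki=0.05, kd=0.1))
    noisy = simulate_roll(PID(kp=0.5, ki=0.05, kd=0.1), noise_std=0.01)
    print(f"final roll angle, calm:  {calm:.5f} rad")
    print(f"final roll angle, noisy: {noisy:.5f} rad")
```

An RL agent would replace `pid.update` with a learned policy that maps the observed attitude state to a torque command; the plant loop around it would stay the same under these assumptions.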
