Abstract:
This bachelor's thesis investigates the robustness of variational quantum circuits (VQCs) in reinforcement learning (RL) compared to classical neural networks under observation noise. Observation noise describes the uncertainty that arises when the states perceived by an RL agent deviate from the actual states of the environment, for example due to sensor noise, environmental influences or targeted adversarial attacks. A deterministic REINFORCE algorithm is used, which always selects the action with the highest predicted probability instead of the usual stochastic sampling. This methodological decision enables a targeted analysis of the direct influence of observation noise on the agent’s policy, independent of random exploration effects. Robustness is investigated using the deterministic variant of the well-known reinforcement learning environment Frozen-Lake, which is extended by an observation noise model with a self-designed hot zone logic. Within these hot zones, the agent receives deliberately incorrect observations orthogonal to its original direction of movement. A classical neural network in the form of a multi-layer perceptron (MLP) is compared with a VQC. Although the MLP often converges faster, it exhibits volatile and non-monotonic performance as the noise level increases. In contrast, the VQC demonstrates superior stability with predictable performance degradation, especially at higher noise levels. The results suggest that the structural properties of VQCs may enable better generalisation and robustness against structured observation noise.
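The difference between the deterministic action selection described above and the usual stochastic sampling can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `select_action`, its parameters and the example probabilities are assumptions chosen purely for illustration, and the policy output is assumed to be a plain list of action probabilities.

```python
import random


def select_action(probs, deterministic=True, rng=None):
    """Pick an action index from a list of action probabilities.

    Hypothetical helper for illustration only, not the thesis code.
    """
    actions = range(len(probs))
    if deterministic:
        # Greedy choice: the action with the highest predicted probability.
        # Ties are resolved in favour of the lowest action index.
        return max(actions, key=lambda a: probs[a])
    # Usual REINFORCE behaviour: sample an action according to the policy.
    rng = rng or random
    return rng.choices(actions, weights=probs, k=1)[0]


# Assumed example: probabilities for four actions.
policy_output = [0.1, 0.6, 0.2, 0.1]
greedy_action = select_action(policy_output)
sampled_action = select_action(policy_output, deterministic=False,
                               rng=random.Random(0))
```

With the greedy variant, the same observation always yields the same action, so any change in behaviour under noise can be attributed to the perturbed observation and not to sampling randomness.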
Author:
Justin Dominik Marinus Klein
Advisors:
Julian Hager, Michael Kölle, Thomas Gabor, Claudia Linnhoff-Popien
Student Thesis | Published July 2025 | Copyright © QAR-Lab
Direct inquiries about this work to the advisors.