Evaluating Metaheuristic Optimization Methods for Quantum Reinforcement Learning

Abstract:

Quantum Reinforcement Learning promises advantages over classical Reinforcement Learning, such as a more compact representation of the state space through quantum states. Theoretical studies also suggest that it can converge faster than classical approaches in certain scenarios. However, further research is needed to validate these benefits in practical applications. The technology also faces challenges such as a flat solution landscape with missing or very small gradients, which makes traditional gradient-based optimization methods inefficient and motivates the examination of gradient-free algorithms as an alternative. The present work integrates metaheuristic optimization algorithms such as Particle Swarm Optimization, Ant Colony Optimization, Tabu Search, Simulated Annealing, and Harmony Search into Quantum Reinforcement Learning. These algorithms offer flexibility and efficiency in parameter optimization because they rely on specialized, adaptive search strategies rather than on gradient information. The approaches are evaluated in two Reinforcement Learning environments and compared against random action selection. In the MiniGrid environment, all algorithms achieve acceptable or even very good results, with Simulated Annealing and Particle Swarm Optimization performing best. In the Cart Pole environment, Simulated Annealing and Particle Swarm Optimization achieve optimal results, while Ant Colony Optimization, Tabu Search, and Harmony Search perform only slightly better than random action selection. These results demonstrate the potential of metaheuristic optimization methods such as Particle Swarm Optimization and Simulated Annealing for efficient learning in Quantum Reinforcement Learning systems, but they also highlight the need to select and adapt the algorithm carefully for the specific problem.
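
To illustrate what gradient-free parameter optimization means here, the following is a minimal sketch of Simulated Annealing over a real-valued parameter vector. It is not the thesis implementation: the function names, the hyperparameters (`start_temp`, `cooling`, `step_size`), and in particular `toy_return`, a hypothetical stand-in for the average episode return of a parameterized quantum circuit, are assumptions made for illustration only.

```python
import math
import random


def simulated_annealing(objective, initial_params, steps=2000,
                        start_temp=1.0, cooling=0.995, step_size=0.1, seed=0):
    """Maximize `objective` over a parameter vector without using gradients."""
    rng = random.Random(seed)
    current = list(initial_params)
    current_score = objective(current)
    best, best_score = list(current), current_score
    temp = start_temp
    for _ in range(steps):
        # Propose a neighbor by perturbing one randomly chosen parameter.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.gauss(0.0, step_size)
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature is lowered.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        # Geometric cooling schedule (an assumed choice), bounded away from zero.
        temp = max(temp * cooling, 1e-8)
    return best, best_score


def toy_return(params):
    """Hypothetical objective; in practice this would be the measured
    average return of the agent for the given circuit parameters."""
    return sum(math.cos(p) for p in params) / len(params)


if __name__ == "__main__":
    params, score = simulated_annealing(toy_return, [2.5, -2.0, 3.0])
    print(params, score)
```

The search only ever evaluates the objective, so it can still make progress where gradients are missing or close to zero. Occasionally accepting worse candidates is what allows it to leave flat or locally optimal regions.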

Author:

Daniel Seidl

Advisors:

Michael Kölle, Maximilian Zorn, Claudia Linnhoff-Popien


Student Thesis | Published May 2024 | Copyright © QAR-Lab
Direct inquiries about this work to the advisors.



QAR-Lab – Quantum Applications and Research Laboratory
Ludwig-Maximilians-Universität München
Oettingenstraße 67
80538 Munich
Phone: +49 89 2180-9153
E-mail: qar-lab@mobile.ifi.lmu.de
