Abstract:
Recent advances in large language models (LLMs) have established transformer architectures as the dominant paradigm in natural language processing (NLP). While these models achieve state-of-the-art performance, their exponential growth in parameter count and computational demands raises concerns regarding scalability, energy consumption, and environmental impact. Simultaneously, quantum machine learning (QML) has emerged as a promising field that explores whether quantum computation can offer more efficient learning mechanisms, particularly through variational quantum circuits (VQCs), which have shown competitive performance with fewer parameters. This thesis investigates whether a quantum transformer model can be designed to structurally mirror the classical transformer while remaining feasible for execution on Noisy Intermediate-Scale Quantum (NISQ) hardware. To this end, we propose a modular, NISQ-compatible quantum transformer architecture that reproduces key classical components (embedding, multi-head attention, and the encoder-decoder structure) using VQCs. Each component is implemented with shallow, strongly entangling circuits designed to minimize circuit depth and parameter count. The model is evaluated on synthetic language modeling tasks, comparing quantum and classical variants under matched conditions, including identical token vocabularies and equivalent parameter budgets. Results show that the quantum model is capable of learning simple formal languages, converging rapidly with fewer parameters and, in some configurations, achieving perfect reconstruction of deterministic token sequences. However, its performance degrades on more complex tasks requiring generalization, where classical models remain superior. These findings demonstrate the feasibility of the proposed quantum transformer architecture on near-term hardware and situate the model as a proof of concept for the architectural potential of encoder-decoder quantum transformer models in NLP.
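For illustration only, the sketch below shows the kind of shallow, strongly entangling variational circuit the abstract refers to, written with PennyLane. It is not the thesis implementation: the qubit count, layer count, angle-based embedding, and Pauli-Z readout are assumptions chosen to keep the example minimal.

```python
# Hypothetical sketch of a shallow, strongly entangling VQC block
# (assumed parameters; not the thesis code).
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4   # assumed width of one quantum sub-block
n_layers = 2   # "shallow": only a few entangling layers

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def vqc_block(inputs, weights):
    # Encode a classical embedding vector as single-qubit rotations.
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    # Trainable, strongly entangling ansatz with low circuit depth.
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # One expectation value per qubit serves as the block's output vector.
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

# Parameter count stays small: n_layers * n_qubits * 3 rotation angles.
shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
weights = np.random.uniform(0, 2 * np.pi, size=shape, requires_grad=True)
print(vqc_block(np.random.uniform(0, np.pi, n_qubits), weights))
```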
Author:
Julian Hager
Advisors:
Michael Kölle, Gerhard Stenzel, Thomas Gabor, Claudia Linnhoff-Popien
Student Thesis | Published June 2025 | Copyright © QAR-Lab
Direct inquiries regarding this work to the advisors.