Abstract
This paper presents a novel framework that combines a deep Q-network (DQN) with transfer learning for millimeter-wave (mmWave) beam selection on a software-defined radio (SDR) testbed. We implement a three-thread software architecture that integrates GNU Radio, ZeroMQ, and Python-based APIs to control beam steering. The testbed performs beamforming on sub-microsecond timescales and closes a control loop between the SDR software and the underlying mmWave phased array. We design a DQN architecture that uses received signal strength (RSS) measurements to perform angle-of-arrival (AoA) detection without phase detection or a multi-element antenna. The DQN agent is built on a three-layer neural network and is rewarded based on RSS improvement. We also design a transfer learning framework that reloads and averages pre-trained DQN weights from five distinct environmental scenarios. Our results demonstrate that the agent converges more quickly and achieves lower AoA detection error when it starts from the prior knowledge supplied by transfer learning. The results also show that grouping the training scenarios by line-of-sight (LoS) versus non-LoS conditions significantly improves the efficacy of transfer learning for AoA detection.
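To make the weight-averaging idea concrete, the following minimal sketch shows one way pre-trained weights from several scenarios could be combined, together with a reward defined as the change in RSS. It is an illustration under stated assumptions, not the implementation used in this work: the function names `average_weights` and `rss_reward`, the per-layer list-of-arrays representation, and the use of a plain difference in dBm as the reward are all hypothetical.

```python
import numpy as np


def average_weights(weight_sets):
    """Element-wise average of per-layer weights from several trained agents.

    weight_sets: list with one entry per scenario; each entry is a list of
    NumPy arrays (one per layer tensor). Hypothetical representation.
    """
    if not weight_sets:
        raise ValueError("need at least one set of weights")
    n_tensors = len(weight_sets[0])
    for ws in weight_sets:
        if len(ws) != n_tensors:
            raise ValueError("all scenarios must share the same architecture")
    # Average tensor i across all scenarios; shapes must match per tensor.
    return [
        np.mean([ws[i] for ws in weight_sets], axis=0)
        for i in range(n_tensors)
    ]


def rss_reward(prev_rss_dbm, new_rss_dbm):
    """Assumed reward: positive when the newly selected beam raises the RSS."""
    return new_rss_dbm - prev_rss_dbm
```

The averaged tensors would then be loaded into a fresh agent as its initial weights before training in a new scenario. Restricting `weight_sets` to scenarios of the same type (LoS only, or non-LoS only) corresponds to the categorized variant described above.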