Bandit Algorithms for Deep Brain Stimulation
Abstract
Deep Brain Stimulation (DBS) is an effective treatment for Parkinson's disease, but conventional fixed-parameter stimulation can reduce battery life and cause side effects while failing to adapt to changing neural dynamics. Recent reinforcement learning approaches improve adaptability, yet most rely on deep neural networks that require offline training and are too computationally expensive for implantable hardware. This paper presents a resource-conscious adaptive DBS framework based on a Time- and Threshold-Triggered Pruned Multi-Armed Bandit (T3P MAB) algorithm. The proposed method jointly tunes stimulation frequency and amplitude, requires no prior training, and remains transparent enough to support clinician-guided adjustment. Using a computational basal ganglia-thalamic model, we show that T3P converges faster than competing MAB methods and outperforms deep-RL baselines in suppressing pathological beta-band activity while reducing stimulation power. We also implemented T3P on different microcontrollers and report detailed energy measurements, showing convergence in under two minutes and suitability for resource-constrained implantable systems. These results support lightweight bandit-based control as a practical path toward personalized, energy-efficient DBS.
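To make the idea of jointly tuning frequency and amplitude with a bandit concrete, the following is a minimal sketch of a standard UCB1 bandit over a small grid of (frequency, amplitude) arms. It is not the T3P algorithm: the abstract does not specify the time triggers, threshold triggers, or pruning rule, so none of them are reproduced here. The arm grid, the `toy_reward` function, and all numeric constants are hypothetical placeholders standing in for beta-band suppression traded off against stimulation power.

```python
import math
import random

# Hypothetical arm grid: (frequency in Hz, amplitude in mA).
# These values are illustrative assumptions, not taken from the paper.
ARMS = [(f, a) for f in (60, 100, 130, 160) for a in (1.0, 2.0, 3.0)]


def toy_reward(freq, amp, rng):
    """Made-up reward: a stand-in for beta suppression minus a power cost."""
    suppression = 1.0 - abs(freq - 130) / 130.0  # peaks at 130 Hz (assumption)
    suppression *= min(amp / 2.0, 1.0)           # saturates at 2 mA (assumption)
    power_cost = 0.1 * amp * amp * freq / 160.0  # grows with amplitude^2 and frequency
    return suppression - power_cost + rng.gauss(0.0, 0.05)


def ucb1(arms, reward_fn, steps, seed=0):
    """Plain UCB1: play each arm once, then pick the highest upper confidence bound."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, steps + 1):
        if t <= len(arms):
            i = t - 1  # initial round: try every arm once
        else:
            i = max(
                range(len(arms)),
                key=lambda k: means[k] + math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        r = reward_fn(*arms[i], rng)
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]  # incremental mean update
    best = max(range(len(arms)), key=lambda k: means[k])
    return arms[best], counts, means


if __name__ == "__main__":
    best_arm, counts, _ = ucb1(ARMS, toy_reward, steps=3000)
    print("selected (Hz, mA):", best_arm)
    print("pulls per arm:", dict(zip(ARMS, counts)))
```

The per-arm state is just a pull count and a running mean, which hints at why bandit controllers of this kind can fit on a microcontroller and need no offline training, and why the learned values stay readable enough for a clinician to inspect.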