Impact Analysis of Speech Representation Learning Models for Acoustic Side-Channel Attack

Abstract

Acoustic side-channel attacks (ASCA) on keyboards have gained increasing attention, yet the impact of speech representation learning models in ASCA remains unexplored. Addressing this, we introduce KEYAC, a dataset designed to analyze representation generalization for ASCA under both standard and VoIP codec settings. On KEYAC, we evaluate six representation learning models under zero-shot and partial fine-tuning settings using fully connected and convolutional networks. Results show that while partial fine-tuning improves performance, models struggle to generalize across VoIP codecs. We hypothesize that this limitation stems from inadequate modeling of nonlinear feature interactions in conventional fine-tuning architectures. To address this, we employ Kolmogorov-Arnold Networks (KAN) for fine-tuning. Empirical results show that KAN-based fine-tuning consistently outperforms the baselines and establishes a new state-of-the-art on KEYAC.
