Chopping and distilling variational autoencoders for real-time anomaly detection in high energy physics
Abstract
Anomaly detection (AD) using artificial intelligence (AI) and machine learning (ML) has recently emerged as an exciting alternative to conventional search strategies in high energy physics. The integration of these techniques into trigger systems is even more recent, but represents a crucial step in expanding the coverage of LHC triggers. In this paper, we directly compare, and also combine, two compression techniques for variational autoencoder (VAE) AD trigger algorithms: "chopping," which uses only latent-space derived variables and therefore requires only half of the VAE, and "distillation," which applies knowledge distillation (KD) to distill the VAE into a student architecture. We demonstrate the feasibility of deploying such techniques on an FPGA within the resource and latency constraints of an LHC trigger environment. We further find that a combination of the two leads to the smallest models that maintain, and in some cases improve, performance with respect to the original VAE architecture.
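The two compression ideas can be illustrated with a minimal NumPy sketch. It is a toy under stated assumptions, not the architecture or training procedure of the paper: the encoder weights are random stand-ins for a trained VAE, the latent-space score is taken to be the KL divergence to a standard normal prior (one common choice of latent-derived variable), and the student is a simple least-squares fit on quadratic features. All names (`encode`, `chopped_score`, `fit_student`) and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: 8 input features, 3 latent dimensions.
N_FEATURES, N_LATENT = 8, 3

# Stand-in "trained" encoder weights (random here, purely illustrative).
W_mu = rng.normal(size=(N_FEATURES, N_LATENT))
W_logvar = rng.normal(scale=0.1, size=(N_FEATURES, N_LATENT))


def encode(x):
    """Encoder half of the VAE: map inputs to latent mean and log-variance."""
    return x @ W_mu, x @ W_logvar


def chopped_score(x):
    """Chopping: score events with a latent-space variable only.

    Assumed score: KL divergence between N(mu, sigma^2) and N(0, 1).
    Only the encoder is evaluated, so the decoder is not needed at inference.
    """
    mu, logvar = encode(x)
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=1)


def quadratic_features(x):
    """Features for the toy student: x, x^2, and a bias column."""
    return np.hstack([x, x**2, np.ones((len(x), 1))])


def fit_student(x, teacher_scores):
    """Distillation: fit a smaller model to reproduce the teacher's score.

    The student here is a least-squares regression (an assumption made
    for brevity; a real student would typically be a small neural network).
    """
    coef, *_ = np.linalg.lstsq(quadratic_features(x), teacher_scores, rcond=None)
    return lambda x_new: quadratic_features(x_new) @ coef


# Combine both: distill the chopped (encoder-only) score into the student.
x_train = rng.normal(size=(2000, N_FEATURES))
teacher = chopped_score(x_train)
student = fit_student(x_train, teacher)
student_scores = student(x_train)
```

In this sketch the teacher used for distillation is already the chopped score, which mirrors the combination of the two techniques; distilling a reconstruction-based score from the full VAE would only change how `teacher` is computed.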