TARQ: Tail-Aware Reconstruction Quantization for Rare-Word Robust Automatic Speech Recognition
Abstract
Data-aware post-training quantization (PTQ) minimizes a per-token reconstruction loss on a small calibration corpus, implicitly weighting positions by their empirical frequency. For Automatic Speech Recognition (ASR), this misaligns with tail-sensitive risk: names, numerals, and domain-specific words receive proportionally little calibration mass. We propose Tail-Aware Reconstruction Quantization (TARQ), a label-free PTQ framework that shifts calibration toward the lexical tail via a closed-form per-Linear-layer rule that equalizes common/tail mass, paired with a metric-consistent residual correction. TARQ requires no entity labels, no curated calibration set, no validation decoding, and no additional training. Across eight ASR backbones and six datasets at W4G128, TARQ improves mean rare-Word Error Rate (rare-WER) without an aggregate-WER regression, achieves the lowest cross-corpus rare-WER swing among compared methods, and transfers to entity-rich benchmarks (ProfASR, ContextASR-Speech-En) without entity supervision.
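To make the idea of equalizing common/tail calibration mass concrete, the following is a minimal illustrative sketch and not the rule used by TARQ, whose exact form is not given in the abstract. It assumes each calibration position already carries a boolean tail flag (how that flag is derived without labels is left open here), and the names `equalizing_weights` and `weighted_recon_loss` are hypothetical. Tail positions are up-weighted by the ratio of common to tail counts, so both groups contribute equal total weight to a weighted reconstruction loss.

```python
def equalizing_weights(is_tail):
    """Per-position weights giving common and tail positions equal total mass.

    Assumption: `is_tail` is a list of booleans, one per calibration position.
    Common positions keep weight 1.0; tail positions get n_common / n_tail.
    """
    n_tail = sum(1 for t in is_tail if t)
    n_common = len(is_tail) - n_tail
    if n_tail == 0 or n_common == 0:
        # Nothing to balance: fall back to uniform (frequency) weighting.
        return [1.0] * len(is_tail)
    alpha = n_common / n_tail
    return [alpha if t else 1.0 for t in is_tail]


def weighted_recon_loss(sq_errors, weights):
    """Weighted mean of per-position squared reconstruction errors."""
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, sq_errors)) / total


# Toy calibration set: six common positions, two tail positions.
is_tail = [False] * 6 + [True] * 2
# Hypothetical per-position squared errors of a quantized layer's output.
sq_errors = [0.1] * 6 + [0.5] * 2

uniform = weighted_recon_loss(sq_errors, [1.0] * len(is_tail))
balanced = weighted_recon_loss(sq_errors, equalizing_weights(is_tail))
print(uniform, balanced)
```

In this toy setting the balanced loss is larger than the uniform one because the tail positions, which carry the larger errors, now account for half of the total weight instead of a quarter. A quantizer minimizing the balanced objective would therefore be pushed harder to preserve tail behavior.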