ECLM: Entity Level Language Model for Spoken Language Understanding with Chain of Intent

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities in language generation and general task performance. However, their application to spoken language understanding (SLU) remains challenging, particularly for token-level tasks, where the autoregressive nature of LLMs often leads to misalignment issues. They also struggle to capture nuanced interrelations in semantic-level tasks through direct fine-tuning alone. To address these challenges, we propose the Entity-level Language Model (ECLM) framework, which reformulates slot-filling as an entity recognition task and introduces a novel concept, Chain of Intent, to enable step-by-step multi-intent recognition. Experimental results show that ECLM significantly outperforms strong baselines such as Uni-MIS, achieving gains of 3.7\% on MixATIS and 3.1\% on MixSNIPS. Compared to standard supervised fine-tuning of LLMs, ECLM further achieves improvements of 8.5\% and 21.2\% on these datasets, respectively. Our code is available at https://github.com/SJY8460/ECLM.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…