Logits-Constrained Framework with RoBERTa for Ancient Chinese NER
Abstract
This paper presents a Logits-Constrained (LC) framework for Ancient Chinese Named Entity Recognition (NER), evaluated on the EvaHan 2025 benchmark. Our two-stage model integrates GujiRoBERTa for contextual encoding and a differentiable decoding mechanism to enforce valid BMES label transitions. Experiments demonstrate that LC improves performance over traditional CRF and BiLSTM-based approaches, especially in high-label or large-data settings. We also propose a model selection criterion balancing label complexity and dataset size, providing practical guidance for real-world Ancient Chinese NLP tasks.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.