IBERT: Idiom Cloze-style reading comprehension with Attention

Abstract

Idioms are special fixed phrases usually derived from stories. They are commonly used in casual conversation and literary writing, and their meanings are usually highly non-compositional. The idiom cloze task is a challenging problem in Natural Language Processing (NLP) research. Previous approaches to this task are built on sequence-to-sequence (Seq2Seq) models and achieve reasonably good performance on existing datasets. However, they fall short in understanding the highly non-compositional meaning of idiomatic expressions, and they do not consider the local and global context at the same time. In this paper, we propose a BERT-based embedding Seq2Seq model that encodes idiomatic expressions and considers them in both global and local context. Our model uses XLNet as the encoder and RoBERTa for choosing the most probable idiom for a given context. Experiments on the EPIE Static Corpus dataset show that our model performs better than existing state-of-the-art models.
