Contrastive Unsupervised Word Alignment with Non-Local Features

Abstract

Word alignment is an important natural language processing task that indicates the correspondence between natural languages. Recently, unsupervised learning of log-linear models for word alignment has received considerable attention as it combines the merits of generative and discriminative approaches. However, a major challenge still remains: it is intractable to calculate the expectations of non-local features that are critical for capturing the divergence between natural languages. We propose a contrastive approach that aims to differentiate observed training examples from noises. It not only introduces prior knowledge to guide unsupervised learning but also cancels out partition functions. Based on the observation that the probability mass of log-linear models for word alignment is usually highly concentrated, we propose to use top-n alignments to approximate the expectations with respect to posterior distributions. This allows for efficient and accurate calculation of expectations of non-local features. Experiments show that our approach achieves significant improvements over state-of-the-art unsupervised word alignment methods.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…