Learning Frames from Text with an Unsupervised Latent Variable Model

Abstract

We develop a probabilistic latent-variable model to discover semantic frames---types of events and their participants---from corpora. We present a Dirichlet-multinomial model in which frames are latent categories that explain the linking of verb-subject-object triples, given document-level sparsity. We analyze what the model learns, and compare it to FrameNet, noting it learns some novel and interesting frames. This document also contains a discussion of inference issues, including concentration parameter learning; and a small-scale error analysis of syntactic parsing accuracy.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…