Semi-Supervised Translation with MMD Networks

Abstract

This work aims to improve semi-supervised learning in a neural network architecture by introducing a hybrid supervised and unsupervised cost function. The unsupervised component is trained using a differentiable estimator of the Maximum Mean Discrepancy (MMD) distance between the network output and the target dataset. We introduce the notion of an n-channel network and several methods to improve performance of these nets based on supervised pre-initialization, and multi-scale kernels. This work investigates the effectiveness of these methods on language translation where very few quality translations are known a priori. We also present a thorough investigation of the hyper-parameter space of this method on both synthetic data.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…