A Better Variant of Self-Critical Sequence Training

Abstract

In this work, we present a simple yet better variant of Self-Critical Sequence Training (SCST). We make a simple change to the choice of baseline function in the REINFORCE algorithm. Compared to the greedy decoding baseline, the new baseline yields better performance at no extra cost.
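
To make the role of the baseline concrete, the following is a minimal sketch of how a baseline enters a REINFORCE-style loss. The abstract does not specify the new baseline, so only `greedy_baseline_advantages` reflects what is stated here (the greedy decoding baseline). `leave_one_out_advantages` is an illustrative assumption of what an alternative baseline could look like, not a statement of the method. All function names are hypothetical.

```python
from typing import List


def greedy_baseline_advantages(sample_rewards: List[float],
                               greedy_reward: float) -> List[float]:
    """Greedy decoding baseline: subtract the reward of the
    greedy-decoded sequence from each sampled sequence's reward."""
    return [r - greedy_reward for r in sample_rewards]


def leave_one_out_advantages(sample_rewards: List[float]) -> List[float]:
    """ASSUMPTION (not stated in the abstract): an alternative baseline
    in which each sample is compared against the mean reward of the
    other samples drawn for the same input."""
    k = len(sample_rewards)
    if k < 2:
        raise ValueError("need at least two samples per input")
    total = sum(sample_rewards)
    return [r - (total - r) / (k - 1) for r in sample_rewards]


def reinforce_loss(log_probs: List[float], advantages: List[float]) -> float:
    """Surrogate loss whose gradient with respect to the log-probabilities
    is the REINFORCE estimate: -mean(advantage * log_prob)."""
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have equal length")
    return -sum(a * lp for a, lp in zip(advantages, log_probs)) / len(log_probs)
```

In both cases the baseline is subtracted from the reward only to reduce the variance of the gradient estimate. The greedy variant requires one additional decoding pass per input, whereas a baseline computed from the sampled rewards themselves reuses values that are already available.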
