Evaluating the Impact of Loss Function Variation in Deep Learning for Classification

Abstract

The loss function is arguably among the most important hyperparameters for a neural network. Many loss functions have been designed to date, making a correct choice nontrivial. However, elaborate justifications regarding the choice of the loss function are not made in related work. This is, as we see it, an indication of a dogmatic mindset in the deep learning community which lacks empirical foundation. In this work, we consider deep neural networks in a supervised classification setting and analyze the impact the choice of loss function has onto the training result. While certain loss functions perform suboptimally, our work empirically shows that under-represented losses such as the KL Divergence can outperform the State-of-the-Art choices significantly, highlighting the need to include the loss function as a tuned hyperparameter rather than a fixed choice.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…