Label Smoothing & Deep Learning: Google Brain explains why it works and when to use it (SOTA tips)
Müller, Kornblith, and Hinton from Google Brain released a new paper titled “When does label smoothing help?” that dives deep into how label smoothing affects the final activation layer of deep neural networks. They built a new visualization method to clarify the internal effects of label smoothing and provide new insight into how it works. While label smoothing is widely used, this paper explains why and how it affects neural networks, and offers valuable guidance on when, and when not, to use it.
This article is a summary of the paper’s insights to help you quickly leverage the findings for your own deep learning work. The full paper is recommended for deeper analysis.
What is Label Smoothing?: Label smoothing is a loss function modification that has been shown to be very effective for training deep learning networks. Label smoothing improves accuracy in image classification, translation, and even speech recognition. Our team, for example, used it in breaking a number of FastAI leaderboard records:
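The core idea can be sketched in a few lines: instead of training against a hard one-hot target, the target is mixed with a uniform distribution over the K classes, so the correct class gets 1 − α + α/K and every other class gets α/K. This is a minimal illustration (the function name and α = 0.1 default are our own, not from the paper):

```python
import numpy as np

def smooth_labels(labels, num_classes, alpha=0.1):
    """Convert integer class labels into smoothed target distributions.

    Each hard one-hot target is mixed with a uniform distribution:
        y_smooth = (1 - alpha) * y_onehot + alpha / num_classes
    """
    one_hot = np.eye(num_classes)[labels]          # shape: (batch, num_classes)
    return (1.0 - alpha) * one_hot + alpha / num_classes

# With 4 classes and alpha=0.1, the true class gets 0.925 and the rest 0.025 each.
targets = smooth_labels(np.array([2]), num_classes=4, alpha=0.1)
```

The smoothed targets are then used in place of the one-hot labels inside the usual cross-entropy loss; most frameworks now expose this directly as a loss parameter.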