Meet DiffGrad: New Deep Learning Optimizer that solves Adam’s ‘overshoot’ issue

Less Wright
5 min read · Dec 26, 2019
Example of short-term gradient changes on the way to the global optimum (center). Image from the paper.

DiffGrad, a new optimizer introduced in the paper “diffGrad: An Optimization Method for CNNs” by Dubey et al., builds on the proven Adam optimizer by adding an adaptive ‘friction clamp’: it monitors the short-term change in gradients and uses it to automatically lock in optimal parameter values that Adam can skip over.

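Concretely, the paper keeps Adam’s moment estimates but scales the step by a friction coefficient ξ_t = sigmoid(|g_(t−1) − g_t|): when the local gradient change is small (near a promising minimum), ξ_t drops toward 0.5 and damps the step. Below is a minimal PyTorch-style sketch of that rule for a single parameter tensor; the function name, argument layout, and hyperparameter defaults are illustrative, not the paper’s reference code.

```python
import torch

def diffgrad_step(param, grad, prev_grad, m, v, step,
                  lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One diffGrad-style update for a single tensor (illustrative sketch).

    Adam's moment estimates are kept, but the step is scaled by a
    'friction' coefficient xi = sigmoid(|g_{t-1} - g_t|): small local
    gradient change -> xi near 0.5 -> the step is damped, so the
    optimizer is less likely to overshoot a good minimum.
    """
    # Standard Adam first and second moment estimates
    m.mul_(beta1).add_(grad, alpha=1 - beta1)
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)

    # Bias correction, as in Adam
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)

    # Friction clamp from the short-term gradient change
    diff = (prev_grad - grad).abs()
    xi = torch.sigmoid(diff)  # in (0.5, 1): low change -> more friction

    # Parameter update: the Adam step scaled by the friction coefficient
    param.addcdiv_(xi * m_hat, v_hat.sqrt().add_(eps), value=-lr)

    return grad.clone()  # stored as prev_grad for the next step
```

In a training loop you would keep `m`, `v`, and `prev_grad` buffers per parameter (initialized to zeros) and call this once per step after `backward()`, exactly where you would otherwise call `optimizer.step()`.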