Meet AdaMod: a new deep learning optimizer with memory

Less Wright
5 min read · Jan 5, 2020

AdaMod is a new deep learning optimizer that builds on Adam but adds an automatic warmup heuristic and long-term learning rate buffering. From initial testing, AdaMod is a top-5 optimizer that readily matches or exceeds vanilla Adam, while being much less sensitive to the learning rate hyperparameter, producing a smoother training curve, and requiring no separate warmup phase.
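To make that "memory" concrete, here is a minimal NumPy sketch of the core update rule, based on the AdaMod paper (Ding et al., 2019). The function name, state layout, and default hyperparameters here are illustrative assumptions, not the interface of any particular library:

```python
import numpy as np

def adamod_step(param, grad, state, lr=1e-3, betas=(0.9, 0.999),
                beta3=0.999, eps=1e-8):
    """One AdaMod update for a single parameter array (illustrative sketch)."""
    state['step'] += 1
    t = state['step']

    # Standard Adam first and second moment estimates, with bias correction.
    state['m'] = betas[0] * state['m'] + (1 - betas[0]) * grad
    state['v'] = betas[1] * state['v'] + (1 - betas[1]) * grad**2
    m_hat = state['m'] / (1 - betas[0]**t)
    v_hat = state['v'] / (1 - betas[1]**t)

    # Adam's per-coordinate step size.
    eta = lr / (np.sqrt(v_hat) + eps)

    # AdaMod's memory: an exponential moving average of past step sizes,
    # controlled by beta3. Clipping eta against it caps sudden spikes in
    # the adaptive learning rate, which is what provides the built-in
    # warmup effect and the long-term buffering described above.
    state['s'] = beta3 * state['s'] + (1 - beta3) * eta
    eta = np.minimum(eta, state['s'])

    return param - eta * m_hat
```

In this sketch, state would start as zeros ({'step': 0, 'm': 0.0, 'v': 0.0, 's': 0.0}) per parameter; beta3 sets how long the step-size memory is, with values around 0.999 to 0.9999 being typical.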

AdaMod converges to the same point even with learning rates that differ by up to two orders of magnitude, whereas SGDM and Adam end up at different results.

