
State of the Art Object Detection — use these top 3 data augmentations and Google Brain’s optimal policy for training your architecture

4 min read · Jun 27, 2019

Google Brain just released a new paper today that answers many deep learning practitioners’ questions about “what works best” for training object detection models.

Excerpt from the Google paper showing the consistent improvement from their augmentation policy

While common practice is simply to reuse the augmentation techniques used for image classification (flipping, etc.), they find that specialized augmentations, applied as part of a ‘learned’ augmentation policy for object detection, are superior. With these ‘top 3’ detection-specific data augmentation techniques featuring heavily in their full augmentation policy, they achieve state-of-the-art accuracy with RetinaNet on the COCO dataset, and further show that the policy works well on a number of other architectures and datasets. In other words, if you want to optimize your object detection models, you’ll want to be sure to use these data augmentations, and likely their full augmentation policy!
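To see why detection needs more care than classification even for a simple flip, here is a minimal sketch (my own illustration, not code from the paper): when the image is mirrored, every bounding box has to be mirrored with it. The function name `hflip_with_boxes`, the nested-list image, and the `(x_min, y_min, x_max, y_max)` pixel box format with an exclusive `x_max` are all assumptions made for this example.

```python
def hflip_with_boxes(image, boxes):
    """Horizontally flip an image and its bounding boxes together.

    image: list of rows, each row a list of pixel values (assumed format).
    boxes: list of (x_min, y_min, x_max, y_max) in pixel coordinates,
           with x_max exclusive (an assumption for this sketch).
    """
    width = len(image[0])

    # Mirror every row of pixels left-to-right.
    flipped_image = [row[::-1] for row in image]

    # Mirror the x coordinates; the old right edge becomes the new left edge.
    flipped_boxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in boxes
    ]
    return flipped_image, flipped_boxes


# A 2x4 image with an "object" in the leftmost column.
image = [[1, 0, 0, 0],
         [1, 0, 0, 0]]
boxes = [(0, 0, 1, 2)]

flipped_image, flipped_boxes = hflip_with_boxes(image, boxes)
print(flipped_image)  # [[0, 0, 0, 1], [0, 0, 0, 1]]
print(flipped_boxes)  # [(3, 0, 4, 2)]
```

A classification pipeline can stop after flipping the pixels; a detection pipeline that forgets the second step ends up training on boxes that no longer cover the object.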

In addition, continuing the latest and greatest research theme of “data augmentation, not explicit regularization” (see my previous article here), they also show that data augmentation with these three types inherently improves the L2 norm of the weights, without explicit regularization (notice a theme here? :) ).

“learned augmentation policy is superior to state-of-the-art architecture regularization methods for object detection, even when considering strong baselines.”

The quote above is from their paper, which is a great read (https://arxiv.org/abs/1906.11172v1), but let’s dig in for a fast overview and summary of some new best practices for training a state-of-the-art object detection model!

Their findings include:

1 — Using techniques from image classification helps object detection models, but net gains are limited.

2 — By setting up a reinforcement learning platform and testing a variety of data augmentations, they find an ‘optimal’ augmentation policy that uses a series of paired augmentations. The top 3 augmentations used in their policies are:


Written by Less Wright
