Progressive Sprinkles: A new data augmentation for CNNs (and how it helps achieve a new 98+% accuracy on the NIH Malaria dataset)
While trying to beat the state of the art for the NIH Malaria dataset (97%), I faced a dilemma: most of the standard ‘heavy’ data augmentation methods (CutOut, RICAP, and CutMix) wouldn’t work. The reason is that the visual clues that a cell is infected appear only in one small, unpredictable spot within the cell, while the rest of the cell looks perfectly normal.
Thus, if you randomly clip a ‘healthy’ half with CutMix or RICAP, or block out the infected area with a large black block via CutOut, you’d be showing the CNN the image portion of an otherwise clean cell and teaching it that the cell was in fact infected. That, of course, would not result in an intelligent classifier.
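To make the problem concrete, here is a minimal NumPy sketch of a CutOut-style mask applied to a toy image with one small marked region standing in for the infected spot. The 64x64 image size, the 6x6 marker, the 32-pixel block, and the helper names `cutout` and `fraction_fully_erased` are all assumptions made for illustration; they are not taken from the NIH dataset or from the original CutOut implementation.

```python
import numpy as np


def cutout(image, size, rng):
    """Zero out one square block of side `size` centred on a random pixel.

    A simplified CutOut-style mask; the block is clipped at the image border.
    """
    h, w = image.shape[:2]
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()  # leave the caller's image untouched
    out[y0:y1, x0:x1] = 0
    return out


def fraction_fully_erased(trials=1000, seed=0):
    """Estimate how often the block wipes out the entire marked region."""
    rng = np.random.default_rng(seed)
    image = np.ones((64, 64), dtype=np.float32)  # toy "healthy" cell
    image[20:26, 40:46] = 2.0  # hypothetical 6x6 "infected" spot
    erased = 0
    for _ in range(trials):
        augmented = cutout(image, size=32, rng=rng)
        if not (augmented == 2.0).any():
            erased += 1  # label still says "infected", evidence is gone
    return erased / trials


print(fraction_fully_erased())
```

Every trial counted in `erased` is an image that still carries the ‘infected’ label but no longer contains any visible sign of infection, which is exactly the mislabeled training signal described above.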
Thus, in thinking about this, I realized that if, instead of blocking out a large random block à la CutOut, which could result in no usable information being available to learn from (i.e. it…