Hi Robert,
Thanks for the comment. To use Mish, you need to change the code in the ResNet to point to Mish instead of ReLU (i.e., act_fn = Mish vs. act_fn = nn.ReLU).
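For reference, Mish is defined as x · tanh(softplus(x)). Here is a minimal sketch of the idea; the Mish module in the comments is the standard formulation, not necessarily the exact code from the repo, and act_fn is the variable name from the snippet above:

```python
import math

# Mish: mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x).
def mish(x: float) -> float:
    return x * math.tanh(math.log1p(math.exp(x)))

# In PyTorch, the swap in the ResNet would look roughly like:
#
#   import torch
#   import torch.nn as nn
#   import torch.nn.functional as F
#
#   class Mish(nn.Module):
#       def forward(self, x):
#           return x * torch.tanh(F.softplus(x))
#
#   act_fn = Mish()        # instead of: act_fn = nn.ReLU()

print(mish(0.0))   # 0.0 -- Mish passes through the origin
print(mish(10.0))  # ~10.0 -- near-identity for large positive inputs
```

Unlike ReLU, Mish is smooth and lets small negative values through, which is part of why weights trained under ReLU do not transfer cleanly.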
For a pre-trained network, however, I would not recommend making that change. Those weights have already been trained under ReLU, and changing the activation afterward would likely produce poor results, since the weights would respond differently under the new activation.
We are investigating setting up an ImageNet pre-trained MXResNet and if we have enough resources will make that available. At that point you could just plug that into your learner and go.
Otherwise, for now we recommend using Mish only when training from scratch, as hot-swapping it into a ReLU-trained network would likely not perform well.
Best regards,
Less