Apr 11, 2020
Thanks for taking the time to read the article.
Yes, exploding/vanishing gradients might not be a problem most people have to deal with directly in practical settings, especially since many modern deep CNNs are designed to tackle the issue head-on; one example that comes to mind is GoogLeNet, whose auxiliary classifiers were added partly to combat vanishing gradients.
I read your article on training only the batch normalization layers, and you are right, I did like it; I would even like to try it out on different datasets, as you suggested.
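For anyone else curious about that idea, here is a minimal sketch of what "training only the BatchNorm layers" could look like in PyTorch; the model choice (resnet18), learning rate, and optimizer are placeholders of mine, not details from the original article.

```python
# Minimal sketch (assumed setup, not from the original article):
# freeze everything in a pretrained CNN except its BatchNorm layers.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True)  # any CNN containing BatchNorm layers

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Re-enable gradients only for the BatchNorm affine parameters (gamma, beta).
for module in model.modules():
    if isinstance(module, nn.BatchNorm2d):
        for param in module.parameters():
            param.requires_grad = True

# Optimize only the parameters that still require gradients.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)  # placeholder hyperparameters
```

The training loop itself stays unchanged; only the BatchNorm scale and shift terms receive updates, which is what makes the experiment cheap to repeat across different datasets.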