
“Most neural networks pick one filter size at a time. But why not do all of them at once?” Inception said.

Give me a photo, and I can zoom in on tiny details, look at medium-sized patterns, and scan the bigger scene for larger shapes.

“But how is that possible?” AlexNet asked.

Instead of choosing one type of convolution (like 3×3 or 5×5), I run multiple convolutions in parallel, plus pooling, and then concatenate the results.
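That parallel-branches-then-concatenate idea can be sketched in a few lines of PyTorch. This is a minimal, hypothetical block with made-up channel counts, not GoogLeNet's actual configuration (the real module also puts 1×1 reductions before the larger convolutions):

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """A toy Inception-style block: parallel 1x1, 3x3, 5x5 convs plus pooling."""

    def __init__(self, in_ch, c1, c3, c5, cp):
        super().__init__()
        # Padding keeps every branch's spatial size equal so outputs can be concatenated.
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, c3, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, c5, kernel_size=5, padding=2)
        self.bp = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, cp, kernel_size=1),
        )

    def forward(self, x):
        # Run all branches on the same input, then stack feature maps channel-wise.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

x = torch.randn(1, 16, 32, 32)          # one 16-channel 32x32 feature map
y = InceptionBlock(16, 8, 8, 8, 8)(x)
print(y.shape)                          # torch.Size([1, 32, 32, 32])
```

The output has 8 + 8 + 8 + 8 = 32 channels while the 32×32 spatial size is preserved, so the network never had to commit to a single filter size.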

“So, which version of Inception are you?” the octopus asked, to which Inception replied, “I am the first Inception-based model, GoogLeNet.”

I’m the eldest brother of the Inception family. Inception v2 added batch normalization. Inception v3 used factorized convolutions.

Inception v4 went deeper and wider, while Inception-ResNet combines Inception with residuals.

“But increasing the number of layers means becoming more computationally expensive, right?” a sponge asked.

Inception replied, “Not really, if you use a sparsely connected architecture instead of a fully connected network. That’s also the biggest change in Inception compared to the networks that came before it.”
🧪 Later Versions

| Version | Cool Upgrades |
|---|---|
| Inception v2 | Added batch normalization 🧼 |
| Inception v3 | Used factorized convolutions 🧠 |
| Inception v4 | Went deeper and wider 🏗️ |
| Inception-ResNet | Combined with residuals 🔗 |