Comparing deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals a hierarchical correspondence
Construction of the object deep neural network model (object DNN) used in the study: The DNN architecture comprised 8 layers. Each of layers 1-5 contained a combination of convolution, max-pooling and normalization stages, whereas the last three layers were fully connected. The DNN takes pixel values as input and propagates information feed-forward through the layers, activating model neurons successively at each layer.
Figure: Object DNN architecture
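To make the feed-forward flow concrete, here is a minimal toy sketch in Python with NumPy. It is not the trained object DNN. The input size, filter counts, kernel sizes, random weights, the choice to apply all three stages in every one of layers 1-5, and the simple cross-channel normalization are all assumptions made for illustration. Only the overall layout (five convolutional layers followed by three fully connected layers, ending in 683 category outputs) is taken from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    """Zero-padded 2-D convolution of x (C, H, W) with filters w (K, C, kh, kw), then ReLU."""
    k, c, kh, kw = w.shape
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((k, x.shape[1], x.shape[2]))
    for i in range(x.shape[1]):
        for j in range(x.shape[2]):
            # Contract each filter with the local image patch.
            out[:, i, j] = np.tensordot(w, xp[:, i:i + kh, j:j + kw], axes=3)
    return np.maximum(out, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max-pooling over size x size windows."""
    c, h, w = x.shape
    h2, w2 = h // size, w // size
    x = x[:, :h2 * size, :w2 * size]
    return x.reshape(c, h2, size, w2, size).max(axis=(2, 4))

def normalize(x, eps=1e-6):
    """Stand-in normalization (assumption): divide by the RMS across channels."""
    return x / (eps + np.sqrt(np.mean(x ** 2, axis=0, keepdims=True)))

def forward(image, conv_weights, fc_weights):
    """Propagate an image through all layers and return the activations of each layer."""
    activations = []
    x = image
    for w in conv_weights:                    # layers 1-5
        x = normalize(max_pool(conv2d(x, w)))
        activations.append(x)
    x = x.ravel()
    for idx, w in enumerate(fc_weights):      # layers 6-8
        x = w @ x
        if idx < len(fc_weights) - 1:
            x = np.maximum(x, 0.0)            # ReLU on all but the output layer
        activations.append(x)
    return activations

# Hypothetical toy dimensions; the real network is far larger.
channels = [3, 4, 4, 8, 8, 8]
conv_weights = [0.1 * rng.normal(size=(channels[i + 1], channels[i], 3, 3)) for i in range(5)]
fc_sizes = [8, 16, 16, 683]                   # 683 output categories, as in the text
fc_weights = [0.1 * rng.normal(size=(fc_sizes[i + 1], fc_sizes[i])) for i in range(3)]

image = rng.random((3, 32, 32))               # toy RGB input of pixel values
activations = forward(image, conv_weights, fc_weights)
for n, a in enumerate(activations, start=1):
    print(f"layer {n}: activation shape {a.shape}")
```

The function returns the activations of every layer rather than only the final output, since the layer-wise activations are what one would inspect when comparing model representations with brain data.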
The object DNN was trained to perform object categorization on everyday object categories (683 categories, with ~1300 images in each category from the ImageNet database) using backpropagation. You can download it here.
To compare representations between the deep neural network and human brains, we used a set of 118 images of natural objects on real-world backgrounds. To avoid circular inference, these 118 images were not used for training the object DNN. With 94% correct performance in a top-five categorization task on this 118-image set, the network performed at a level comparable to humans.
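For readers unfamiliar with the metric, top-five accuracy counts an image as correct when its true category is among the five categories the network scores highest. The sketch below illustrates the computation in plain Python. The function name `top_k_accuracy` and the toy scores are hypothetical and are not data from the study.

```python
def top_k_accuracy(scores, true_labels, k=5):
    """Fraction of images whose true label is among the k highest-scoring categories."""
    hits = 0
    for image_scores, label in zip(scores, true_labels):
        # Category indices sorted from highest to lowest score.
        ranked = sorted(range(len(image_scores)),
                        key=lambda c: image_scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)

# Toy example: 3 images, 6 categories (made-up scores).
scores = [
    [0.05, 0.40, 0.20, 0.15, 0.12, 0.08],  # true label 1 is ranked first: hit
    [0.30, 0.25, 0.20, 0.15, 0.09, 0.01],  # true label 5 is ranked sixth: miss
    [0.12, 0.08, 0.30, 0.05, 0.25, 0.20],  # true label 0 is ranked fourth: hit
]
true_labels = [1, 5, 0]

accuracy = top_k_accuracy(scores, true_labels, k=5)
print(f"top-5 accuracy: {accuracy:.3f}")
```

With the real network, `scores` would hold one vector of 683 category scores per image, and `true_labels` the correct category index for each of the 118 images.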
- A visualization of the stimulus set for the study is available here.
- A visualization of connectivity and receptive field sensitivity for every neuron in layers 1-5 is available here.
- Depiction of the receptive field sensitivity of all neurons is available here.
- Voting on each of the 118 images is available here.
- The object DNN model file and the list of categories are available here.