How does Conv2d work with different input and filter dimensions?



I wonder how TensorFlow's conv2d works with different input and filter dimensions.
For example, the input shape of a Conv2d layer is [1, 13, 13, 10] and the filter shape is
[20, 3, 3, 10] (that is, 20 filters of size 3×3, no padding).

In this situation, how do the filters work?
As far as I understand, each of the 20 filters computes a dot product over the 10 input channels.

(The first filter computes a dot product across all 10 input channels, the next filter does the same, and so on.)
So the output shape would be [1, 11, 11, 20].

Am I right?


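Assuming your input is [b, h, w, c] and you have N kernels of spatial size 3×3, your logic is correct. Each kernel spans all input channels, so each one really has shape 3×3×10; the kernel count N is what becomes the number of output channels. (Note that tf.nn.conv2d expects the filter tensor laid out as [filter_height, filter_width, in_channels, out_channels], so your filter would be passed as [3, 3, 10, 20] rather than [20, 3, 3, 10].)
Each filter computes a dot product with each input channel and sums the results over the channels, so each kernel produces one single output channel. That gives 20 channels of 11×11 (you lose 2 in both width and height because there is no padding), i.e. an output shape of [1, 11, 11, 20].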
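To make the shape arithmetic concrete, here is a minimal NumPy sketch of a stride-1, no-padding convolution. It is not TensorFlow's implementation; the helper name conv2d_valid and the random test data are made up for illustration, and the filter layout [filter_height, filter_width, in_channels, out_channels] is assumed to match what tf.nn.conv2d expects.

```python
import numpy as np

def conv2d_valid(x, filters):
    """Naive 2-D convolution (cross-correlation), stride 1, no padding.

    x:       [batch, height, width, in_channels]
    filters: [filter_height, filter_width, in_channels, out_channels]
    (hypothetical helper for illustration, not a TensorFlow function)
    """
    b, h, w, c_in = x.shape
    fh, fw, fc, c_out = filters.shape
    assert fc == c_in, "filter depth must match the input channel count"

    # Without padding, each spatial dimension shrinks by (filter size - 1).
    out_h, out_w = h - fh + 1, w - fw + 1
    out = np.zeros((b, out_h, out_w, c_out))

    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i:i + fh, j:j + fw, :]  # [b, fh, fw, c_in]
            # Dot product over height, width and channels, for all filters at once.
            out[:, i, j, :] = np.tensordot(patch, filters, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 13, 13, 10))       # input from the question
filters = rng.standard_normal((3, 3, 10, 20))  # 20 filters, each 3x3x10
y = conv2d_valid(x, filters)
print(y.shape)  # (1, 11, 11, 20)
```

Each output value is one sum over a 3×3×10 patch, which is why the 10 input channels disappear and the 20 filters become the 20 output channels.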

Answered By – Lucas Ramos

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
