Is the sigmoid function only applicable after a Dense() layer?
I am building a network in Keras that is similar to SE-Net (https://github.com/titu1994/keras-squeeze-excite-network/blob/master/se.py),
but differs from it in some respects.
Suppose I want to build a layer sequence like this:
    import keras

    inputs = keras.Input((None, None, 3))
    x1 = keras.layers.Conv2D(filters=32, kernel_size=(3, 3))(inputs)
    # keepdims=True keeps a (batch, 1, 1, 32) tensor so Conv2D can follow
    x_gp = keras.layers.GlobalAveragePooling2D(keepdims=True)(x1)
    x2 = keras.layers.Conv2D(filters=32, kernel_size=(1, 1))(x_gp)
    x3 = keras.layers.Conv2D(filters=8, kernel_size=(1, 1))(x2)
    x2_ = keras.layers.Conv2D(filters=32, kernel_size=(1, 1))(x3)
    x_se = keras.layers.Activation("sigmoid")(x2_)
I want to know whether applying sigmoid to produce x_se like this is possible. Please tell me if I am doing something wrong.
You can certainly experiment with sigmoid as an activation for CNN layers too, but there are reasons why sigmoid is rarely used with them:
1. The sigmoid function is monotonic, but its derivative is not: it peaks at 0.25 at x = 0 and shrinks toward 0 as |x| grows, so gradients can vanish and training can get stuck.
2. The sigmoid's output range is (0, 1), so its outputs are always positive and never zero-centered.
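To make the two points above concrete, here is a minimal sketch in plain Python (no Keras needed) that prints the sigmoid and its derivative at a few inputs. The helper names `sigmoid` and `sigmoid_grad` are made up for this illustration and are not part of any library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + exp(-x)). Output lies strictly in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid, written as s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# The output grows steadily with x, while the gradient rises to its
# maximum at x = 0 and then falls off again on both sides.
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.5f}  grad={sigmoid_grad(x):.5f}")
```

The gradient is largest at x = 0 and becomes very small once the input moves far from zero, which is the saturation behaviour that can slow or stall training when many sigmoid layers are stacked.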
If you are experimenting with sigmoid in CNN layers, I would suggest using it for only a few layers.
You can also give swish a try.
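For reference, swish is usually defined as x * sigmoid(x). The sketch below implements it in plain Python so the difference from sigmoid is visible; the function names are illustrative only. Recent Keras versions should also provide a built-in swish activation, but check the documentation for your version.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def swish(x: float) -> float:
    """Swish activation: the input scaled by its own sigmoid."""
    return x * sigmoid(x)


# Unlike sigmoid, swish is not bounded above and can take small
# negative values, so it does not squash large positive inputs.
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.5f}  swish={swish(x):.5f}")
```

For large positive inputs swish behaves almost like the identity, so its gradient does not vanish there the way the sigmoid's does.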
Answered By – keertika jain