Why apply dropout only once, after the 20th fully connected layer, instead of after every fully connected layer, the way BatchNormalization is usually inserted?
Layer (type)                  Output Shape    Param #
=================================================================
flatten (Flatten)             (None, 784)     0
dense (Dense)                 (None, 100)     78500
dense_1 (Dense)               (None, 100)     10100
dense_2 (Dense)               (None, 100)     10100
dense_3 (Dense)               (None, 100)     10100
dense_4 (Dense)               (None, 100)     10100
dense_5 (Dense)               (None, 100)     10100
dense_6 (Dense)               (None, 100)     10100
dense_7 (Dense)               (None, 100)     10100
dense_8 (Dense)               (None, 100)     10100
dense_9 (Dense)               (None, 100)     10100
dense_10 (Dense)              (None, 100)     10100
dense_11 (Dense)              (None, 100)     10100
dense_12 (Dense)              (None, 100)     10100
dense_13 (Dense)              (None, 100)     10100
dense_14 (Dense)              (None, 100)     10100
dense_15 (Dense)              (None, 100)     10100
dense_16 (Dense)              (None, 100)     10100
dense_17 (Dense)              (None, 100)     10100
dense_18 (Dense)              (None, 100)     10100
dense_19 (Dense)              (None, 100)     10100
alpha_dropout (AlphaDropout)  (None, 100)     0
dense_20 (Dense)              (None, 10)      1010
=================================================================
Total params: 271,410
Trainable params: 271,410
Non-trainable params: 0
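For reference, the architecture in the summary above can be rebuilt with a minimal Keras sketch like the following (this is an assumed reconstruction, not code from the original post; the dropout `rate` and the `lecun_normal` initializer are assumptions, though `AlphaDropout` is the dropout variant designed to pair with SELU activations because it preserves their self-normalizing mean and variance):

```python
import tensorflow as tf
from tensorflow import keras

# 20 hidden Dense layers of 100 units, one AlphaDropout before the output.
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))  # -> (None, 784)
for _ in range(20):
    model.add(keras.layers.Dense(100, activation="selu",
                                 kernel_initializer="lecun_normal"))
model.add(keras.layers.AlphaDropout(rate=0.1))  # rate=0.1 is an assumed value
model.add(keras.layers.Dense(10, activation="softmax"))

# Parameter count matches the summary:
# 784*100+100 + 19*(100*100+100) + 100*10+10 = 271,410
assert model.count_params() == 271_410
```

Calling `model.summary()` on this sketch reproduces the layer listing above.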