Hello @Choozi, I haven’t been following this topic closely, nor am I an expert in it, but the claim above seems untrue. The same thing happens in a traditional TF model: if you don’t specify an activation directly in a `Dense` layer and instead follow it with a separate layer of ReLU (or similar) neurons, then the connection from the `Dense` layer to the ReLU layer is a one-to-one identity connection, i.e. it carries no weights. In fact, the weights sit on the input connections to the `Dense` layer.
Let’s wait for other experts to resolve your doubt.