
On Backpropagation

Backpropagation

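My previous post on backpropagation is here: 机器学习–神经网络 (Machine Learning: Neural Networks). Coming back to review it now, I can already barely follow it (((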

So today I want to see whether I can describe the essence of the back propagation algorithm in a more accessible way.

Getting to the point

Let's start with the simplest possible neural network. It looks like this:

[Figure: the network, a chain of four nodes with one neuron per layer]

Our cost function can then be written directly in this form:

$$
C(w_1, b_1, w_2, b_2, w_3, b_3)
$$

Next, call the four nodes $a^{(L-3)}, a^{(L-2)}, a^{(L-1)}, a^{(L)}$, and let the value we want be $y$. Then clearly:

$$
C(w_1, b_1, w_2, b_2, w_3, b_3) = \frac{1}{2}(a^{(L)} - y)^2
$$

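Do you remember how $a^{(L)}$ is computed? That's right: the activation of layer $L$ is a linear combination of the previous layer's activations, plus the bias $b^{(L)}$, passed through ReLU or sigmoid (so $w^{(L)}, b^{(L)}$ play the role of $w_3, b_3$ above):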

$$
\begin{aligned}
& z^{(L)} = w^{(L)}a^{(L - 1)} + b^{(L)} \\
& a^{(L)} = \sigma(z^{(L)})
\end{aligned}
$$
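To make the forward computation concrete, here is a minimal Python sketch of the four-node chain. It assumes $\sigma$ is the logistic sigmoid (the post allows ReLU as well), and the parameter values and the function names `forward` and `cost` are made up purely for illustration.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, assumed here as the activation sigma."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(a_in, weights, biases):
    """Run the one-neuron-per-layer chain forward.

    Returns the pre-activations z and the activations a,
    where activations[0] is the input node.
    """
    zs, activations = [], [a_in]
    for w, b in zip(weights, biases):
        z = w * activations[-1] + b      # z = w * a_prev + b
        zs.append(z)
        activations.append(sigmoid(z))   # a = sigma(z)
    return zs, activations

def cost(a_out, y):
    """Squared-error cost C = 1/2 (a_out - y)^2."""
    return 0.5 * (a_out - y) ** 2

# Hypothetical values for (w_1, w_2, w_3) and (b_1, b_2, b_3).
weights = [0.5, -1.2, 0.8]
biases = [0.1, 0.3, -0.4]
zs, activations = forward(1.0, weights, biases)
print("a^(L) =", activations[-1], " C =", cost(activations[-1], y=1.0))
```

The lists `zs` and `activations` are kept because the backward pass needs every $z$ and every $a$ from the forward pass.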

First layer

Our goal now is to understand how much a change in $w^{(L)}$ affects $C(\dots)$. Mathematically, we want $\frac{\partial C}{\partial w^{(L)}}$, so we need to relate $C$ to $w^{(L)}$:

$$
\begin{aligned}
& z^{(L)} = w^{(L)}a^{(L - 1)} + b^{(L)} \\
& a^{(L)} = \sigma(z^{(L)}) \\
& C = \frac{1}{2}(a^{(L)} - y)^2
\end{aligned}
$$

As we can see, these three equations are exactly what connects $w^{(L)}$ to $C(\dots)$, so let's go one step at a time:

  1. First, determine how much a change in $w^{(L)}$ affects $z^{(L)}$ (that is, $\frac{\partial z^{(L)}}{\partial w^{(L)}}$).
  2. Then, determine how much a change in $z^{(L)}$ affects $a^{(L)}$ (that is, $\frac{\partial a^{(L)}}{\partial z^{(L)}}$).
  3. Finally, determine how much a change in $a^{(L)}$ affects $C$ (that is, $\frac{\partial C}{\partial a^{(L)}}$).

Taking these three steps gives us how much a change in $w^{(L)}$ affects $C(\dots)$. Mathematically it looks like this (it is just the chain rule):

$$
\frac{\partial C}{\partial w^{(L)}} = \frac{\partial z^{(L)}}{\partial w^{(L)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial C}{\partial a^{(L)}}
$$

Expanding each factor directly gives:

$$
\begin{aligned}
\frac{\partial C}{\partial w^{(L)}} = & \frac{\partial z^{(L)}}{\partial w^{(L)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial C}{\partial a^{(L)}} \\ \\
= & a^{(L - 1)} \cdot \sigma'(z^{(L)}) \cdot (a^{(L)} - y)
\end{aligned}
$$

Similarly, we can work out $\frac{\partial C}{\partial b^{(L)}}$:

$$
\begin{aligned}
\frac{\partial C}{\partial b^{(L)}} = & \frac{\partial z^{(L)}}{\partial b^{(L)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial C}{\partial a^{(L)}} \\ \\
= & 1 \cdot \sigma'(z^{(L)}) \cdot (a^{(L)} - y)
\end{aligned}
$$
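One way to gain confidence in these two formulas is to compare them with a numerical derivative. The sketch below does that for the last layer only. It assumes a sigmoid activation, and the values of `w`, `b`, `a_prev`, `y` and the helper names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def last_layer_cost(w, b, a_prev, y):
    """C as a function of the last layer's weight and bias only."""
    return 0.5 * (sigmoid(w * a_prev + b) - y) ** 2

def last_layer_grads(w, b, a_prev, y):
    """Chain-rule gradients dC/dw and dC/db for the last layer."""
    z = w * a_prev + b
    a = sigmoid(z)
    shared = sigmoid_prime(z) * (a - y)   # (da/dz) * (dC/da)
    return a_prev * shared, 1.0 * shared  # dz/dw = a_prev, dz/db = 1

# Hypothetical values, chosen only for illustration.
w, b, a_prev, y = 0.8, -0.4, 0.35, 1.0
dC_dw, dC_db = last_layer_grads(w, b, a_prev, y)

# Central finite differences as an independent check.
eps = 1e-6
dC_dw_numeric = (last_layer_cost(w + eps, b, a_prev, y)
                 - last_layer_cost(w - eps, b, a_prev, y)) / (2 * eps)
dC_db_numeric = (last_layer_cost(w, b + eps, a_prev, y)
                 - last_layer_cost(w, b - eps, a_prev, y)) / (2 * eps)
print(dC_dw, dC_dw_numeric)
print(dC_db, dC_db_numeric)
```

The analytic and numerical values should agree to many decimal places; a large gap would point to a mistake in one of the three chain-rule factors.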

Second layer

The layer before it is handled in exactly the same way. We have:

$$
\begin{aligned}
& z^{(L - 1)} = w^{(L - 1)}a^{(L - 2)} + b^{(L - 1)} \\
& a^{(L - 1)} = \sigma(z^{(L - 1)}) \\
& z^{(L)} = w^{(L)}a^{(L - 1)} + b^{(L)} \\
& a^{(L)} = \sigma(z^{(L)})
\end{aligned}
$$

From this we get:

$$
\begin{aligned}
\frac{\partial C}{\partial w^{(L - 1)}} = & \frac{\partial z^{(L - 1)}}{\partial w^{(L - 1)}} \cdot \frac{\partial a^{(L - 1)}}{\partial z^{(L - 1)}} \cdot \frac{\partial C}{\partial a^{(L - 1)}} \\ \\
= & \frac{\partial z^{(L - 1)}}{\partial w^{(L - 1)}} \cdot \frac{\partial a^{(L - 1)}}{\partial z^{(L - 1)}} \cdot \left[\frac{\partial z^{(L)}}{\partial a^{(L - 1)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial C}{\partial a^{(L)}}\right] \\ \\
= & a^{(L - 2)} \cdot \sigma'(z^{(L - 1)}) \cdot \left[w^{(L)} \cdot \sigma'(z^{(L)}) \cdot (a^{(L)} - y)\right]
\end{aligned}
$$

$$
\begin{aligned}
\frac{\partial C}{\partial b^{(L - 1)}} = & \frac{\partial z^{(L - 1)}}{\partial b^{(L - 1)}} \cdot \frac{\partial a^{(L - 1)}}{\partial z^{(L - 1)}} \cdot \frac{\partial C}{\partial a^{(L - 1)}} \\ \\
= & \frac{\partial z^{(L - 1)}}{\partial b^{(L - 1)}} \cdot \frac{\partial a^{(L - 1)}}{\partial z^{(L - 1)}} \cdot \left[\frac{\partial z^{(L)}}{\partial a^{(L - 1)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial C}{\partial a^{(L)}}\right] \\ \\
= & 1 \cdot \sigma'(z^{(L - 1)}) \cdot \left[w^{(L)} \cdot \sigma'(z^{(L)}) \cdot (a^{(L)} - y)\right]
\end{aligned}
$$
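The same kind of numerical check works one layer further back. The following sketch covers only layers $L-1$ and $L$, again assuming a sigmoid activation. The names `w_prev`, `b_prev`, `w_last`, `b_last` (standing for $w^{(L-1)}, b^{(L-1)}, w^{(L)}, b^{(L)}$) and all numeric values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def two_layer_cost(w_prev, b_prev, w_last, b_last, a_in, y):
    """C for the last two layers, with a_in standing for a^(L-2)."""
    a_prev = sigmoid(w_prev * a_in + b_prev)   # a^(L-1)
    a_last = sigmoid(w_last * a_prev + b_last) # a^(L)
    return 0.5 * (a_last - y) ** 2

def second_layer_grads(w_prev, b_prev, w_last, b_last, a_in, y):
    """dC/dw^(L-1) and dC/db^(L-1) from the expanded chain rule."""
    z_prev = w_prev * a_in + b_prev
    a_prev = sigmoid(z_prev)
    z_last = w_last * a_prev + b_last
    a_last = sigmoid(z_last)
    # The bracketed factor: dC/da^(L-1).
    bracket = w_last * sigmoid_prime(z_last) * (a_last - y)
    shared = sigmoid_prime(z_prev) * bracket
    return a_in * shared, 1.0 * shared

# Hypothetical values, chosen only for illustration.
params = dict(w_prev=-1.2, b_prev=0.3, w_last=0.8, b_last=-0.4, a_in=0.6, y=1.0)
dC_dw_prev, dC_db_prev = second_layer_grads(**params)

def numeric(name, eps=1e-6):
    """Central finite difference of the cost with respect to one parameter."""
    plus = dict(params); plus[name] += eps
    minus = dict(params); minus[name] -= eps
    return (two_layer_cost(**plus) - two_layer_cost(**minus)) / (2 * eps)

print(dC_dw_prev, numeric("w_prev"))
print(dC_db_prev, numeric("b_prev"))
```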

How do they compare?

We have now worked this out for two layers. Can we spot a pattern in the expressions we have? Here are the four of them again, side by side:

$$
\begin{aligned}
&\frac{\partial C}{\partial w^{(L)}} = a^{(L - 1)} \cdot \sigma'(z^{(L)}) \cdot \frac{\partial C}{\partial a^{(L)}}, \qquad \frac{\partial C}{\partial b^{(L)}} = 1 \cdot \sigma'(z^{(L)}) \cdot \frac{\partial C}{\partial a^{(L)}} \\
&\frac{\partial C}{\partial w^{(L - 1)}} = a^{(L - 2)} \cdot \sigma'(z^{(L - 1)}) \cdot \frac{\partial C}{\partial a^{(L - 1)}}, \qquad \frac{\partial C}{\partial b^{(L - 1)}} = 1 \cdot \sigma'(z^{(L - 1)}) \cdot \frac{\partial C}{\partial a^{(L - 1)}}
\end{aligned}
$$

Each of these expressions ends with a $\frac{\partial C}{\partial a}$ term, and that term can be computed recursively. So let $\delta^{(n)} = \frac{\partial C}{\partial a^{(n)}}$; the four expressions above then become:

$$
\begin{aligned}
&\frac{\partial C}{\partial w^{(L)}} = a^{(L - 1)} \cdot \sigma'(z^{(L)}) \cdot \delta^{(L)}, \qquad \frac{\partial C}{\partial b^{(L)}} = 1 \cdot \sigma'(z^{(L)}) \cdot \delta^{(L)} \\
&\frac{\partial C}{\partial w^{(L - 1)}} = a^{(L - 2)} \cdot \sigma'(z^{(L - 1)}) \cdot \delta^{(L - 1)}, \qquad \frac{\partial C}{\partial b^{(L - 1)}} = 1 \cdot \sigma'(z^{(L - 1)}) \cdot \delta^{(L - 1)}
\end{aligned}
$$

Here the recurrence for $\delta$ can be written as:

$$
\delta^{(L - 1)} = w^{(L)} \cdot \sigma'(z^{(L)}) \cdot \delta^{(L)}
$$

This recurrence is also easy to prove:

$$
\begin{aligned}
\delta^{(L - 1)} = & \frac{\partial C}{\partial a^{(L - 1)}} \\ \\
= & \frac{\partial C}{\partial a^{(L)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial z^{(L)}}{\partial a^{(L - 1)}} \\ \\
= & \delta^{(L)} \cdot \sigma'(z^{(L)}) \cdot w^{(L)}
\end{aligned}
$$
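The recurrence can also be checked numerically by treating $a^{(L-1)}$ as a free variable and differentiating $C$ with respect to it. This is only a sketch under the assumption of a sigmoid activation; the values and the helper name `cost_from_prev_activation` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def cost_from_prev_activation(a_prev, w_last, b_last, y):
    """C viewed as a function of a^(L-1), with the last layer fixed."""
    return 0.5 * (sigmoid(w_last * a_prev + b_last) - y) ** 2

# Hypothetical values, chosen only for illustration.
a_prev, w_last, b_last, y = 0.35, 0.8, -0.4, 1.0

z_last = w_last * a_prev + b_last
a_last = sigmoid(z_last)

delta_last = a_last - y                                   # delta^(L) = dC/da^(L)
delta_prev = w_last * sigmoid_prime(z_last) * delta_last  # the recurrence

# Independent check: central finite difference in a^(L-1).
eps = 1e-6
delta_prev_numeric = (cost_from_prev_activation(a_prev + eps, w_last, b_last, y)
                      - cost_from_prev_activation(a_prev - eps, w_last, b_last, y)) / (2 * eps)
print(delta_prev, delta_prev_numeric)
```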

Third layer

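Finally, let's verify that the expressions for the layer two steps back, $L - 2$, also satisfy the recurrence we just found. As before, we have: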

$$
\begin{aligned}
& z^{(L - 2)} = w^{(L - 2)}a^{(L - 3)} + b^{(L - 2)} \\
& a^{(L - 2)} = \sigma(z^{(L - 2)}) \\
& z^{(L - 1)} = w^{(L - 1)}a^{(L - 2)} + b^{(L - 1)} \\
& a^{(L - 1)} = \sigma(z^{(L - 1)})
\end{aligned}
$$

So:

$$
\begin{aligned}
\frac{\partial C}{\partial w^{(L - 2)}} = & \frac{\partial z^{(L - 2)}}{\partial w^{(L - 2)}} \cdot \frac{\partial a^{(L - 2)}}{\partial z^{(L - 2)}} \cdot \frac{\partial C}{\partial a^{(L - 2)}} \\ \\
= & \frac{\partial z^{(L - 2)}}{\partial w^{(L - 2)}} \cdot \frac{\partial a^{(L - 2)}}{\partial z^{(L - 2)}} \cdot \delta^{(L - 2)} \\ \\
= & a^{(L - 3)} \cdot \sigma'(z^{(L - 2)}) \cdot \delta^{(L - 2)}
\end{aligned}
$$

$$
\begin{aligned}
\frac{\partial C}{\partial b^{(L - 2)}} = & \frac{\partial z^{(L - 2)}}{\partial b^{(L - 2)}} \cdot \frac{\partial a^{(L - 2)}}{\partial z^{(L - 2)}} \cdot \frac{\partial C}{\partial a^{(L - 2)}} \\ \\
= & \frac{\partial z^{(L - 2)}}{\partial b^{(L - 2)}} \cdot \frac{\partial a^{(L - 2)}}{\partial z^{(L - 2)}} \cdot \delta^{(L - 2)} \\ \\
= & 1 \cdot \sigma'(z^{(L - 2)}) \cdot \delta^{(L - 2)}
\end{aligned}
$$

where:

$$
\begin{aligned}
\delta^{(L - 2)} = & \frac{\partial C}{\partial a^{(L - 2)}} \\ \\
= & \frac{\partial C}{\partial a^{(L - 1)}} \cdot \frac{\partial a^{(L - 1)}}{\partial z^{(L - 1)}} \cdot \frac{\partial z^{(L - 1)}}{\partial a^{(L - 2)}} \\ \\
= & \delta^{(L - 1)} \cdot \sigma'(z^{(L - 1)}) \cdot w^{(L - 1)}
\end{aligned}
$$

Let's now collect the three sets of relations we have obtained:

$$
\begin{aligned}
&\frac{\partial C}{\partial w^{(L)}} = a^{(L - 1)} \cdot \sigma'(z^{(L)}) \cdot \delta^{(L)}, \qquad \frac{\partial C}{\partial b^{(L)}} = 1 \cdot \sigma'(z^{(L)}) \cdot \delta^{(L)} \\
&\frac{\partial C}{\partial w^{(L - 1)}} = a^{(L - 2)} \cdot \sigma'(z^{(L - 1)}) \cdot \delta^{(L - 1)}, \qquad \frac{\partial C}{\partial b^{(L - 1)}} = 1 \cdot \sigma'(z^{(L - 1)}) \cdot \delta^{(L - 1)} \\
&\frac{\partial C}{\partial w^{(L - 2)}} = a^{(L - 3)} \cdot \sigma'(z^{(L - 2)}) \cdot \delta^{(L - 2)}, \qquad \frac{\partial C}{\partial b^{(L - 2)}} = 1 \cdot \sigma'(z^{(L - 2)}) \cdot \delta^{(L - 2)}
\end{aligned}
$$

$$
\begin{aligned}
& \delta^{(L)} = (a^{(L)} - y) \\
& \delta^{(L - 1)} = w^{(L)} \cdot \sigma'(z^{(L)}) \cdot \delta^{(L)} \\
& \delta^{(L - 2)} = w^{(L - 1)} \cdot \sigma'(z^{(L - 1)}) \cdot \delta^{(L - 1)}
\end{aligned}
$$
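Put together, the three gradient pairs and the $\delta$ recurrence amount to one backward loop over the layers. Below is a minimal sketch of that loop for the one-neuron-per-layer chain, assuming a sigmoid activation. The function names `forward` and `backprop` and the parameter values are illustrative, and the finite-difference comparison at the end is only a sanity check.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(a_in, weights, biases):
    """Forward pass; activations[0] is the input node."""
    zs, activations = [], [a_in]
    for w, b in zip(weights, biases):
        zs.append(w * activations[-1] + b)
        activations.append(sigmoid(zs[-1]))
    return zs, activations

def backprop(a_in, y, weights, biases):
    """Return the lists dC/dw and dC/db, one entry per layer."""
    zs, activations = forward(a_in, weights, biases)
    grad_w = [0.0] * len(weights)
    grad_b = [0.0] * len(biases)
    delta = activations[-1] - y                   # delta^(L) = a^(L) - y
    for k in reversed(range(len(weights))):
        sp = sigmoid_prime(zs[k])
        grad_w[k] = activations[k] * sp * delta   # a_prev * sigma'(z) * delta
        grad_b[k] = 1.0 * sp * delta              # 1 * sigma'(z) * delta
        delta = weights[k] * sp * delta           # move delta one layer back
    return grad_w, grad_b

def cost_of(weights, biases, a_in, y):
    return 0.5 * (forward(a_in, weights, biases)[1][-1] - y) ** 2

# Hypothetical parameters and training pair.
weights, biases = [0.5, -1.2, 0.8], [0.1, 0.3, -0.4]
a_in, y = 1.0, 0.9
grad_w, grad_b = backprop(a_in, y, weights, biases)

# Compare every analytic gradient with a central finite difference.
eps, max_err = 1e-6, 0.0
for k in range(len(weights)):
    for params, grads in ((weights, grad_w), (biases, grad_b)):
        original = params[k]
        params[k] = original + eps
        c_plus = cost_of(weights, biases, a_in, y)
        params[k] = original - eps
        c_minus = cost_of(weights, biases, a_in, y)
        params[k] = original                      # restore the parameter
        max_err = max(max_err, abs((c_plus - c_minus) / (2 * eps) - grads[k]))
print("largest gradient mismatch:", max_err)
```

Note that `delta` is updated only after the gradients of the current layer are stored, which mirrors the order of the equations: each layer uses its own $\delta$, then hands $w \cdot \sigma'(z) \cdot \delta$ to the layer before it.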

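So we have found a recurrence that lets us compute, layer by layer, the partial derivatives of $C$ with respect to every $w$ and $b$. With those partial derivatives in hand, we can run SGD.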
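As a rough illustration of that last step, the sketch below feeds the backpropagated gradients into a plain gradient-descent update. It assumes a sigmoid activation and a single hypothetical training pair `(x, y)`, so each "SGD" step here is just an ordinary gradient step; the learning rate, step count, and starting parameters are arbitrary choices rather than anything from the post.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop(a_in, y, weights, biases):
    """Forward pass plus backward pass; returns (cost, dC/dw, dC/db)."""
    zs, activations = [], [a_in]
    for w, b in zip(weights, biases):
        zs.append(w * activations[-1] + b)
        activations.append(sigmoid(zs[-1]))
    cost = 0.5 * (activations[-1] - y) ** 2
    grad_w, grad_b = [0.0] * len(weights), [0.0] * len(biases)
    delta = activations[-1] - y                   # delta^(L)
    for k in reversed(range(len(weights))):
        a = activations[k + 1]
        sp = a * (1.0 - a)                        # sigma'(z) for the sigmoid
        grad_w[k] = activations[k] * sp * delta
        grad_b[k] = sp * delta
        delta = weights[k] * sp * delta           # delta for the previous layer
    return cost, grad_w, grad_b

# Hypothetical starting point, training pair, and hyperparameters.
weights, biases = [0.5, -1.2, 0.8], [0.1, 0.3, -0.4]
x, y = 1.0, 0.9
learning_rate, steps = 0.5, 200

initial_cost, _, _ = backprop(x, y, weights, biases)
for _ in range(steps):
    _, grad_w, grad_b = backprop(x, y, weights, biases)
    # Step every parameter against its gradient.
    weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    biases = [b - learning_rate * g for b, g in zip(biases, grad_b)]
final_cost, _, _ = backprop(x, y, weights, biases)
print("cost before:", initial_cost, " cost after:", final_cost)
```

With many training examples, one would draw a random example or mini-batch at each step and average the gradients, which is what makes the descent stochastic.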

