

wzjs 2025/8/1 12:31:53


Contents

1. Built-in loss functions
2. Custom loss functions by subclassing nn.Module
3. Custom loss functions by subclassing autograd.Function
4. Experiment: MSE implemented three ways

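Besides its built-in loss functions, PyTorch lets you define your own. This article uses mean squared error (MSE) as the running example. MSE is the mean of the squared differences between the predictions x = (x_1, x_2, ..., x_n) and the targets y = (y_1, y_2, ..., y_n):

\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2

The gradient of the MSE loss with respect to the input vector x is:

\dfrac{\partial \text{MSE}}{\partial x} = \dfrac{2}{n}(x - y)

That is, for each component:

\dfrac{\partial \text{MSE}}{\partial x_i} = \dfrac{2}{n}(x_i - y_i)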
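The gradient formula above can be checked numerically against PyTorch's automatic differentiation. The following is a minimal sketch; the three-element vectors are illustrative values chosen for this check and do not come from the article.

```python
import torch

# Illustrative values, chosen only for this check
x = torch.tensor([1.0, 2.0, 4.0], requires_grad=True)
y = torch.tensor([0.0, 2.0, 2.0])

n = x.numel()
mse = ((x - y) ** 2).sum() / n  # (1 + 0 + 4) / 3
mse.backward()                  # autograd stores dMSE/dx in x.grad

# Closed-form gradient from the formula: 2/n * (x - y)
manual_grad = 2.0 / n * (x.detach() - y)

print(mse.item())
print(x.grad)
print(manual_grad)
```

If the formula is right, the last two printed tensors agree.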

1. Built-in loss functions

PyTorch provides a mean squared error loss in the torch.nn module:

import torch.nn as nn

mse_loss = nn.MSELoss()
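As a short usage sketch, the built-in loss is called like a function on a prediction and a target. The tensors below are made-up values for illustration. The reduction argument of nn.MSELoss controls whether the squared errors are averaged (the default), summed, or returned element by element.

```python
import torch
import torch.nn as nn

# Made-up example values
pred = torch.tensor([1.0, 2.0, 4.0])
target = torch.tensor([0.0, 2.0, 2.0])

loss_mean = nn.MSELoss()(pred, target)                  # default: reduction="mean"
loss_sum = nn.MSELoss(reduction="sum")(pred, target)    # sum of squared errors
loss_none = nn.MSELoss(reduction="none")(pred, target)  # per-element squared errors

print(loss_mean, loss_sum, loss_none)
```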

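2. Custom loss functions by subclassing nn.Module

You only need to implement forward(); there is no backward pass to write by hand, because the autograd engine derives it from the operations used in forward(). Once instantiated, the loss object is called directly to compute the loss value.
A custom MSE loss implemented by subclassing nn.Module looks like this:

import torch.nn as nn


class MSELossV1(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        squared_diff = (input - target) ** 2
        n = squared_diff.numel()
        return squared_diff.sum() / n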
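To illustrate "instantiate and call", here is a minimal sketch that plugs a Module-based loss into an ordinary training loop. The class is repeated so the block runs on its own. The toy data (y = 2x + 1), the nn.Linear model, the learning rate, and the step count are all assumptions made for this example.

```python
import torch
import torch.nn as nn


class MSELossV1(nn.Module):
    def forward(self, input, target):
        squared_diff = (input - target) ** 2
        return squared_diff.sum() / squared_diff.numel()


torch.manual_seed(0)

# Toy regression data (assumed for illustration): y = 2x + 1
inputs = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
targets = 2 * inputs + 1

model = nn.Linear(1, 1)
criterion = MSELossV1()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

initial_loss = criterion(model(inputs), targets).item()
for _ in range(50):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()   # gradients come from autograd; no hand-written backward
    optimizer.step()
final_loss = criterion(model(inputs), targets).item()

print(initial_loss, final_loss)
```

The custom loss is used exactly where nn.MSELoss() would be, which is the main appeal of this approach.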

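3. Custom loss functions by subclassing autograd.Function

torch.autograd.Function is the base class for defining custom differentiable operations in PyTorch. It lets you write both the forward pass (forward) and the backward pass (backward) yourself, which is useful for non-standard operations, custom activation functions, or replacing an existing PyTorch operation in special cases.
Because you implement both forward and backward, you have to derive the gradient formula for the backward pass by hand.
ctx is a context object used to pass data from forward to backward. Its most commonly used members are:

ctx.save_for_backward(*tensors): saves tensors for use in the backward pass
ctx.saved_tensors: retrieves the saved tensors

forward returns the result of the computation, and backward returns the gradient with respect to each input of forward.
A custom Function is invoked through its apply method, for example MSELossV2.apply(input, target), rather than by instantiating the class. A custom MSE loss implemented by subclassing autograd.Function looks like this:

import torch
from torch.autograd import Function


class MSELossV2(Function):
    @staticmethod
    def forward(ctx, input, target):
        squared_diff = (input - target) ** 2
        n = squared_diff.numel()
        # Save the tensors that backward needs to compute the gradient
        ctx.save_for_backward(input, target)
        return squared_diff.sum() / n

    @staticmethod
    def backward(ctx, grad_output):
        input, target = ctx.saved_tensors
        n = input.numel()
        # dMSE/dx = 2/n * (x - y), scaled by the upstream gradient
        grad_input = 2 / n * (input - target) * grad_output
        # One return value per forward input; target gets no gradient
        return grad_input, None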
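Since the backward pass is written by hand, it is worth checking it against numerical differentiation. One way to do that is torch.autograd.gradcheck, which compares the analytical gradient from backward with finite differences. The sketch below repeats the class so it runs on its own. The random double-precision inputs and the tolerances are assumptions chosen for this check.

```python
import torch
from torch.autograd import Function, gradcheck


class MSELossV2(Function):
    @staticmethod
    def forward(ctx, input, target):
        squared_diff = (input - target) ** 2
        ctx.save_for_backward(input, target)
        return squared_diff.sum() / squared_diff.numel()

    @staticmethod
    def backward(ctx, grad_output):
        input, target = ctx.saved_tensors
        grad_input = 2 / input.numel() * (input - target) * grad_output
        return grad_input, None


torch.manual_seed(0)
# gradcheck expects double precision for reliable finite differences
x = torch.randn(3, 3, dtype=torch.double, requires_grad=True)
y = torch.randn(3, 3, dtype=torch.double)

# Returns True on success and raises an error if the gradients disagree
ok = gradcheck(MSELossV2.apply, (x, y), eps=1e-6, atol=1e-4)
print(ok)
```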

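In torch.autograd.Function, the values returned by backward must match the inputs of forward (excluding ctx) in number and order. For example, if forward takes input and target, backward must return two values (for example grad_input, None).
Each input needs a corresponding return value:

If the input is a tensor that requires a gradient (requires_grad=True), return its gradient
If the input is a non-tensor value (such as an integer) or a tensor that does not require a gradient, return None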
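The one-return-value-per-input rule can be illustrated with an operation that takes a non-tensor argument. ScaledSquare below is a hypothetical operation invented for this sketch and is not part of the article's code. It computes scale * input ** 2, where scale is a plain Python number, so backward returns None in that position. ctx.needs_input_grad is used to skip the computation when the input does not require a gradient.

```python
import torch
from torch.autograd import Function


class ScaledSquare(Function):
    """Hypothetical op for illustration: scale * input ** 2."""

    @staticmethod
    def forward(ctx, input, scale):
        ctx.save_for_backward(input)
        ctx.scale = scale  # non-tensor values can be stored as ctx attributes
        return scale * input ** 2

    @staticmethod
    def backward(ctx, grad_output):
        (input,) = ctx.saved_tensors
        grad_input = None
        if ctx.needs_input_grad[0]:  # only compute if input requires grad
            grad_input = 2 * ctx.scale * input * grad_output
        # Two forward inputs -> two return values; scale is not a tensor
        return grad_input, None


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = ScaledSquare.apply(x, 3)
out.sum().backward()
print(x.grad)  # d/dx (3 * x^2) = 6x
```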

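The grad_output argument of backward is a tensor with the same shape as the operation's output. It holds the gradient of the final result with respect to that output (the upstream gradient). By the chain rule, backward multiplies it by the local derivative to obtain the gradient with respect to each input, so grad_output acts as a per-element weight on the input gradient.

The initial upstream gradient is the gradient argument of Tensor.backward(). If it is omitted (the default is None), the output must be a scalar, and PyTorch uses 1 as the initial gradient
If the output is a vector or a higher-dimensional tensor, gradient must be passed explicitly; otherwise PyTorch raises an error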
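The second rule above can be observed directly. In this minimal sketch, with values chosen for illustration, calling backward() without an argument on a non-scalar output raises a RuntimeError, and passing an explicit gradient of the same shape succeeds.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 2  # non-scalar output (shape [2])

try:
    y.backward()  # no gradient argument for a non-scalar output
    raised = False
except RuntimeError as err:
    raised = True
    print(err)

# Rebuild the output and pass an explicit gradient with the same shape
y = x * 2
y.backward(torch.ones_like(y))
print(x.grad)  # tensor([2., 2.])
```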

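The use of grad_output is summarized below:

Scenario	Role of `grad_output`	Example
Scalar output	Defaults to 1; does not need to be passed	`loss.backward()`
Vector output	Must be passed, with the same shape as the output	`y.backward(torch.ones_like(y))`
Multiple outputs	One gradient tensor per output	`torch.autograd.backward([y1, y2], grad_tensors=[v1, v2])`
Custom backward	Receives the upstream gradient, used to compute the input gradients	`backward(ctx, grad_output)`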
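The multiple-output row can be sketched with torch.autograd.backward, which accepts a list of outputs and a matching list of gradient tensors. The names y1, y2, v1, v2 and their values are placeholders for this example. The contributions from both outputs accumulate into x.grad.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y1 = x * 2   # dy1/dx = 2
y2 = x ** 2  # dy2/dx = 2x

# One upstream gradient per output, each shaped like its output
v1 = torch.tensor([1.0, 1.0])
v2 = torch.tensor([0.5, 0.5])

torch.autograd.backward([y1, y2], grad_tensors=[v1, v2])

# x.grad = v1 * 2 + v2 * 2x = [2, 2] + [1, 2]
print(x.grad)  # tensor([3., 4.])
```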

Example:

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x ** 2
y.backward()   # equivalent to y.backward(torch.ones_like(y))
print(x.grad)  # tensor([4.]) (dy/dx = 2x = 4)

x2 = torch.tensor([1.0, 2.0], requires_grad=True)
y = x2 * 2
grad_output = torch.tensor([1.0, 0.5])  # weights 1 and 0.5
y.backward(grad_output)
print(x2.grad)  # tensor([2., 1.]) (grad_output * dy/dx = [1.0, 0.5] * [2., 2.])

4. Experiment: MSE implemented three ways

The experiment code is as follows:

import torch
import torch.nn as nn
from torch.autograd import Function


class MSELossV1(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        squared_diff = (input - target) ** 2
        n = squared_diff.numel()
        return squared_diff.sum() / n


class MSELossV2(Function):
    @staticmethod
    def forward(ctx, input, target):
        squared_diff = (input - target) ** 2
        n = squared_diff.numel()
        ctx.save_for_backward(input, target)
        return squared_diff.sum() / n

    @staticmethod
    def backward(ctx, grad_output):
        input, target = ctx.saved_tensors
        n = input.numel()
        grad_input = 2 / n * (input - target) * grad_output
        return grad_input, None


if __name__ == "__main__":
    mse_loss = nn.MSELoss()
    mse_loss_v1 = MSELossV1()

    x = torch.tensor([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0],
                      [7.0, 8.0, 9.0]], requires_grad=True)
    # Independent copies, so each implementation accumulates its own gradient
    x2 = x.detach().clone().requires_grad_(True)
    x3 = x.detach().clone().requires_grad_(True)
    y = torch.tensor([[0.5, 2.5, 2.0],
                      [3.5, 5.5, 5.0],
                      [6.5, 8.5, 8.0]])

    loss = mse_loss(x, y)               # built-in
    loss2 = mse_loss_v1(x2, y)          # nn.Module subclass
    loss3 = MSELossV2.apply(x3, y)      # autograd.Function subclass
    print(f"loss: {loss}, loss2: {loss2}, loss3: {loss3}")

    loss.backward()
    loss2.backward()
    loss3.backward()
    print(f"x.grad: \n{x.grad}\n x2.grad: \n{x2.grad}\n x3.grad: \n{x3.grad}")
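As a sanity check on what the three implementations should agree on, the expected values can be worked out by hand from the MSE definition and its gradient. Every row of x - y in the experiment is (0.5, -0.5, 1.0) and n = 9. The derivation below is arithmetic on the experiment's inputs, not captured program output.

```latex
\sum_{i=1}^{9} (x_i - y_i)^2 = 3 \times (0.5^2 + (-0.5)^2 + 1.0^2) = 3 \times 1.5 = 4.5

\text{MSE} = \frac{4.5}{9} = 0.5

\frac{\partial \text{MSE}}{\partial x_i} = \frac{2}{9}(x_i - y_i)
\;\Rightarrow\;
\text{each row of the gradient} = \left(\tfrac{1}{9},\ -\tfrac{1}{9},\ \tfrac{2}{9}\right) \approx (0.1111,\ -0.1111,\ 0.2222)
```

All three losses should therefore print 0.5 (up to floating-point rounding), and all three gradients should be 3x3 tensors with the row above repeated.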