PyTorch nn.Linear(): Parameters Explained with Practical Examples (gpt)

2025/7/1 14:23:58

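When building neural networks in PyTorch, `nn.Linear()` is one of the most common and most basic modules. It implements a fully connected layer, which applies a linear (strictly speaking, affine) transformation to its input: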

$y = xA^T + b$

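This article covers the parameters of `nn.Linear()`, its attributes, and its default initialization, and walks through code examples that show how the layer actually works.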

📌 1. Basic Usage: Defining a Linear Layer

A linear layer is created in PyTorch with the following call:

nn.Linear(in_features, out_features, bias=True)

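in_features: size of each input sample (number of input features)
out_features: size of each output sample (number of output features)
bias: whether the layer learns an additive bias b; defaults to True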
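The effect of the `bias` argument is easy to check directly. The following is a minimal sketch; the layer sizes (4 inputs, 2 outputs) and the variable names are arbitrary choices for illustration.

```python
import torch.nn as nn

with_bias = nn.Linear(4, 2)                  # bias=True is the default
without_bias = nn.Linear(4, 2, bias=False)   # no bias term is created

print(with_bias.bias.shape)   # torch.Size([2])
print(without_bias.bias)      # None

# Count the learnable parameters of each layer
n_with = sum(p.numel() for p in with_bias.parameters())
n_without = sum(p.numel() for p in without_bias.parameters())
print(n_with, n_without)      # 10 8
```

With `bias=False` the layer computes only the matrix product, so its parameter count drops by `out_features`.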

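This layer maps an input x ∈ ℝ^{in_features} to an output y ∈ ℝ^{out_features}. Treating x as a row vector (or a batch of row vectors), and writing W for the matrix called A in the formula above:

$y = xW^T + b$

where:

the weight matrix W has shape (out_features, in_features)
the bias vector b has shape (out_features,)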
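To make the formula concrete, the sketch below recomputes the layer's output by hand from its `weight` and `bias` attributes and compares the result with the module's own forward pass. The sizes (4 samples, 3 inputs, 2 outputs) and variable names are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

layer = nn.Linear(in_features=3, out_features=2)
x = torch.randn(4, 3)  # 4 samples, 3 features each

y_module = layer(x)                          # forward pass through the module
y_manual = x @ layer.weight.T + layer.bias   # y = x W^T + b, written out

print(layer.weight.shape)   # torch.Size([2, 3])
print(layer.bias.shape)     # torch.Size([2])
print(torch.allclose(y_module, y_manual, atol=1e-6))
```

The transpose is needed because the weight is stored as (out_features, in_features), while each input row has in_features entries.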

🧪 2. Code Example Walkthrough

import torch
import torch.nn as nn

a_data = nn.Sequential()
a_data.fc1 = nn.Linear(28 * 28, 500)  # 784 input features, 500 output features
print(a_data.fc1)                     # layer summary
print(a_data.fc1.weight.shape)        # shape of the weight matrix

Output:

Linear(in_features=784, out_features=500, bias=True)
torch.Size([500, 784])

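Explanation:

The input is a 28x28 image flattened into a 784-dimensional vector
The layer outputs 500 features, so weight has shape [500, 784]
Each row of weight holds the 784 coefficients that combine the input into one of the 500 output features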
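The flattening step and the "one row per output feature" reading can both be checked with a short sketch. The batch of random tensors stands in for real images, and the use of `nn.Flatten`, the batch size of 16, and the index `j = 7` are illustrative assumptions rather than part of the example above.

```python
import torch
import torch.nn as nn

# A fake batch of 16 grayscale images: (batch, channels, height, width)
images = torch.randn(16, 1, 28, 28)

flatten = nn.Flatten()          # keeps dim 0, flattens the remaining dims
fc1 = nn.Linear(28 * 28, 500)

flat = flatten(images)
features = fc1(flat)
print(flat.shape)       # torch.Size([16, 784])
print(features.shape)   # torch.Size([16, 500])

# Row j of the weight matrix, plus bias[j], produces output feature j
j = 7
feature_j = flat @ fc1.weight[j] + fc1.bias[j]
print(torch.allclose(features[:, j], feature_j, atol=1e-4))
```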

⚙️ 3. How Weights and Biases Are Initialized

By default, PyTorch initializes the parameters of nn.Linear as follows:

✅ Weight `weight`

Shape: (out_features, in_features)
Initialization: uniform distribution $\mathcal{U}(-\sqrt{k}, \sqrt{k})$, where $k = \frac{1}{\text{in\_features}}$

✅ Bias `bias`

Shape: (out_features,)
Initialization: the same $\mathcal{U}(-\sqrt{k}, \sqrt{k})$, with the same k

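Because the range shrinks as in_features grows, the magnitude of each output does not grow with the number of inputs being summed, which helps keep activations and gradients in a reasonable range early in training. This reduces the risk of exploding or vanishing gradients but does not rule them out, especially in deep networks.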
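The stated range can be checked empirically on a freshly created layer. This is a sketch that assumes a recent PyTorch release with the default initialization described above; the small tolerance `tol` is an added assumption to absorb float32 rounding.

```python
import math
import torch
import torch.nn as nn

in_features, out_features = 784, 500
layer = nn.Linear(in_features, out_features)

bound = 1.0 / math.sqrt(in_features)   # sqrt(k) with k = 1 / in_features
tol = 1e-6                             # slack for float32 rounding
print(f"bound = {bound:.6f}")          # bound = 0.035714

with torch.no_grad():
    w_max = layer.weight.abs().max().item()   # largest |weight| entry
    b_max = layer.bias.abs().max().item()     # largest |bias| entry

# Both should lie inside [-bound, bound]
print(w_max <= bound + tol, b_max <= bound + tol)
```

If a different scheme is wanted, the functions in `torch.nn.init` (for example `nn.init.xavier_uniform_(layer.weight)`) can overwrite these defaults after the layer is constructed.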

💡 4. Practical Example: Batched Input and Output

import torch
import torch.nn as nn

m = nn.Linear(20, 30)
x = torch.randn(128, 20)   # a batch of 128 samples, 20 features each
output = m(x)
print(output.shape)        # torch.Size([128, 30])

Explanation:

The input tensor has shape [128, 20]
After the linear layer the output has shape [128, 30]: each sample is mapped independently to a 30-dimensional vector
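The layer only acts on the last dimension, so the same module also accepts an unbatched vector or an input with extra leading dimensions. The sketch below illustrates this; the shapes are arbitrary, and the "sequence" interpretation of the 3-D tensor is just one possible reading.

```python
import torch
import torch.nn as nn

m = nn.Linear(20, 30)

single = torch.randn(20)         # one unbatched sample
seq = torch.randn(128, 10, 20)   # e.g. 128 sequences of 10 steps each

print(m(single).shape)   # torch.Size([30])
print(m(seq).shape)      # torch.Size([128, 10, 30])
```

In every case the last dimension must equal in_features; all other dimensions pass through unchanged.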

📘 5. Summary

Item	Meaning
Input shape	`[batch_size, in_features]`
Output shape	`[batch_size, out_features]`
Weight shape	`[out_features, in_features]`
Bias shape	`[out_features]`
Initialization	$\mathcal{U}(-\sqrt{k}, \sqrt{k})$, with $k = \frac{1}{\text{in\_features}}$