Implementing Linear Regression with PyTorch: From Scratch to the High-Level API
This article implements a linear regression model in two ways: writing the code from scratch, and using PyTorch's high-level API. Comparing the two gives a deeper understanding of how linear regression works and of how much boilerplate the PyTorch framework removes.
1. Environment setup
First, import the required libraries:
import torch
import random
import numpy as np
from torch.utils import data
from torch import nn
from d2l import torch as d2l
2. Generating synthetic data
A custom data-generation function
def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise."""
    x = torch.normal(0, 1, (num_examples, len(w)))  # features drawn from a standard normal distribution
    y = torch.matmul(x, w) + b                      # labels: y = Xw + b
    y += torch.normal(0, 0.01, y.shape)             # add Gaussian noise with std 0.01
    return x, y.reshape((-1, 1))                    # return features and labels as a column vector

# True parameters
true_w = torch.tensor([2, -3, 4], dtype=torch.float32)
true_b = 4.2

# Generate 1000 samples
features, labels = synthetic_data(true_w, true_b, 1000)
print("Feature example:", features[0], "\nLabel example:", labels[0])
Output:
Feature example: tensor([-0.0977, 0.9691, 0.7497])
Label example: tensor([4.0930])
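As a quick sanity check (this snippet is an addition, not part of the original code), the noiseless prediction Xw + b for the first sample should match the printed label up to the 0.01 noise scale:

# Illustrative sanity check: recompute the noiseless label for the first sample
expected = torch.matmul(features[0], true_w) + true_b
print(expected)              # roughly 4.09 for the sample above
print(labels[0] - expected)  # difference on the order of the 0.01 noise std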
Generating data with the d2l library (high-level API)
features, labels = d2l.synthetic_data(true_w, true_b, 1000)
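A scatter plot makes the linear relationship visible. The following optional sketch assumes matplotlib is available (d2l re-exports it as d2l.plt); it is an addition, not part of the original post:

# Optional: scatter the second feature against the labels
d2l.set_figsize()
d2l.plt.scatter(features[:, 1].detach().numpy(), labels.detach().numpy(), 1)
d2l.plt.show()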
3. Data loaders
A custom data iterator
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # shuffle the indices so batches are drawn in random order
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(indices[i: min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]

# Test the data iterator
batch_size = 10
for X, y in data_iter(batch_size, features, labels):
    print("Feature batch:\n", X, "\nLabel batch:\n", y)
    break
Output:
Feature batch:
 tensor([[-1.0714, -0.1717,  1.7789],
        [-0.0269,  0.2470,  0.4188],
        [ 0.3600, -0.2094,  0.9657],
        [ 0.0178,  0.3378, -0.9160],
        [ 0.4890, -1.4583, -1.1834],
        [ 1.1314,  0.7549, -0.2300],
        [ 2.7993,  0.0479,  0.4679],
        [ 0.0216, -0.7768, -0.8906],
        [-1.0918,  0.4722, -0.0571],
        [-1.2701,  0.4411, -0.4607]])
Label batch:
 tensor([[ 9.6695],
        [ 5.0800],
        [ 9.4123],
        [-0.4340],
        [ 4.8405],
        [ 3.2809],
        [11.5307],
        [ 3.0129],
        [ 0.3531],
        [-1.5160]])
Using PyTorch's DataLoader (high-level API)
def load_array(data_arrays, batch_size, is_train=True):
    dataset = data.TensorDataset(*data_arrays)  # wrap the tensors in a Dataset
    return data.DataLoader(dataset, batch_size, shuffle=is_train)

# Use a new name here so the data_iter function above is not shadowed
train_iter = load_array((features, labels), batch_size)
print("High-level API batch example:", next(iter(train_iter)))
Output:
High-level API batch example: [tensor([[-0.3681,  1.0367, -1.4455],
        [ 0.9359, -1.0421,  0.9007],
        [ 0.3054,  1.1599, -0.2623],
        [-0.2199, -0.1331, -0.8884],
        [ 0.0704, -0.0540, -1.5971],
        [ 0.6973, -0.1256, -0.0169],
        [ 2.0356,  0.1585, -0.8231],
        [ 0.4966, -0.4041,  0.3239],
        [ 0.3591, -1.7367, -0.1096],
        [ 0.9977, -0.9013, -0.2911]]), tensor([[-5.4257],
        [12.7944],
        [ 0.2839],
        [ 0.5955],
        [-1.8793],
        [ 5.9008],
        [ 4.5037],
        [ 7.7093],
        [ 9.6797],
        [ 7.7381]])]
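A small usage note: unlike the generator-based data_iter, a DataLoader over a TensorDataset knows its own length, so the number of minibatches per epoch can be checked directly (an illustrative addition):

# With 1000 samples and batch_size=10, one epoch yields 100 minibatches
print(len(train_iter))  # 100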
4. Implementing linear regression from scratch
Initializing the model parameters
w = torch.normal(0, 0.01, size=(3, 1), requires_grad=True)  # small random weights, tracked by autograd
b = torch.zeros(1, requires_grad=True)                      # bias starts at zero
Defining the model and loss function
def linreg(X, w, b):
    """The linear regression model."""
    return torch.matmul(X, w) + b

def loss_fn(y_hat, y):
    """Squared loss, halved so the gradient carries no factor of 2."""
    return (y_hat - y.reshape(y_hat.shape))**2 / 2
The optimization algorithm (SGD)
def sgd(params, lr, batch_size):
    """Minibatch SGD: update parameters in place, outside autograd tracking."""
    with torch.no_grad():
        for param in params:
            # The loss is summed over the batch, so divide by batch_size
            # to step along the average gradient
            param -= lr * param.grad / batch_size
            param.grad.zero_()  # reset the gradient for the next batch
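To see the update rule in isolation, here is a minimal sketch on a hypothetical scalar parameter p (not part of the original code): the loss 2*p has gradient 2, so one step with lr=0.1 and batch_size=1 moves p from 1.0 to 1.0 - 0.1 * 2 = 0.8.

# Minimal sketch: one SGD step on a hypothetical scalar parameter
p = torch.tensor([1.0], requires_grad=True)
(2 * p).sum().backward()        # d(2p)/dp = 2
sgd([p], lr=0.1, batch_size=1)  # p <- p - 0.1 * 2 / 1
print(p)                        # tensor([0.8000], requires_grad=True)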
Training
lr = 0.01
num_epochs = 10

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss_fn(linreg(X, w, b), y).sum()  # summed loss over the minibatch
        l.backward()                           # compute gradients w.r.t. w and b
        sgd([w, b], lr, batch_size)            # update the parameters
    with torch.no_grad():
        train_l = loss_fn(linreg(features, w, b), labels)
        print(f"epoch:{epoch+1}, loss:{train_l.mean().item()}")
Output:
epoch:1, loss:3.123943328857422
epoch:2, loss:0.4247102737426758
epoch:3, loss:0.058328405022621155
...
epoch:10, loss:4.815837019123137e-05
Analyzing the parameter estimation error
print("w的估计误差:", true_w - w.reshape(true_w.shape))
print("b的估计误差:", true_b - b)
Output:
Estimation error of w: tensor([-0.0002, 0.0010, 0.0004], grad_fn=<SubBackward0>)
Estimation error of b: tensor([-6.7234e-05], grad_fn=<RsubBackward1>)
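The learned values themselves can be printed as well; .detach() returns the tensor without its autograd history, which keeps the output clean (an illustrative addition):

# Inspect the learned parameters directly
print("learned w:", w.detach().reshape(true_w.shape))  # should be close to [2, -3, 4]
print("learned b:", b.detach())                        # should be close to 4.2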
5. Implementation with PyTorch's high-level API
Defining the model and optimizer
net = nn.Sequential(nn.Linear(3, 1))  # input dimension 3, output dimension 1
net[0].weight.data.normal_(0, 0.01)   # initialize the weights
net[0].bias.data.fill_(0)             # initialize the bias
loss = nn.MSELoss()                   # mean squared error loss
trainer = torch.optim.SGD(net.parameters(), lr=0.03)
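An equivalent and arguably more idiomatic way to initialize is through torch.nn.init, whose trailing-underscore functions modify the parameters in place (an alternative sketch, not the original post's method):

# Equivalent initialization via torch.nn.init (in-place)
nn.init.normal_(net[0].weight, mean=0, std=0.01)
nn.init.zeros_(net[0].bias)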
Training
num_epochs = 3
for epoch in range(num_epochs):
    for X, y in train_iter:  # the DataLoader built by load_array
        l = loss(net(X), y)
        trainer.zero_grad()  # clear gradients before the backward pass
        l.backward()
        trainer.step()       # let the optimizer update the parameters
    l = loss(net(features), labels)
    print(f"epoch:{epoch+1}, loss:{l}")
Output:
epoch:1, loss:0.0005811635637655854
epoch:2, loss:9.686605335446075e-05
epoch:3, loss:9.718212822917849e-05
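The same error analysis as in the from-scratch version can be repeated here by reading the parameters out of the nn.Linear layer (an addition mirroring Section 4):

# Compare the learned parameters of nn.Linear against the ground truth
w = net[0].weight.data
print("Estimation error of w:", true_w - w.reshape(true_w.shape))
b = net[0].bias.data
print("Estimation error of b:", true_b - b)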
6. Summary
This article implemented a linear regression model in two ways:
- From scratch: manually defining the model, loss function, and optimization algorithm, which builds an understanding of the underlying principles.
- High-level API: using PyTorch's nn.Module and DataLoader to simplify the code and speed up development.
Both approaches recover the true parameters effectively, with final errors on the order of 1e-4. Readers can choose whichever implementation style fits their needs.