Chapter 5: Optimizing Neural Networks
Loss Functions
The smaller the loss, the smaller the gap between the predicted values and the true values, and the better the model performs.
Purpose
1. Measure the gap between the network's actual output and the target.
2. Provide a basis for updating the network's parameters via backpropagation (gradients).
Official documentation
L1Loss: https://docs.pytorch.org/docs/stable/generated/torch.nn.L1Loss.html#torch.nn.L1Loss
MSELoss (mean squared error): https://docs.pytorch.org/docs/stable/generated/torch.nn.MSELoss.html#torch.nn.MSELoss
Example
import torch
from torch.nn import L1Loss, MSELoss

input = torch.tensor([1, 2, 3], dtype=torch.float)
target = torch.tensor([1, 2, 5], dtype=torch.float)
input = torch.reshape(input, (1, 1, 1, 3))   # reshape to (N, C, H, W); the data is one row of three values
target = torch.reshape(target, (1, 1, 1, 3)) # same shape for the target
loss=L1Loss()
res=loss(input,target)
print(res)
loss_mse=MSELoss()
res_mse=loss_mse(input,target)
print(res_mse)
tensor(0.6667)
tensor(1.3333)
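To see point 2 under "Purpose" in action, here is a minimal sketch (reusing the same values as above, but with requires_grad enabled) of how a loss tensor drives backpropagation: calling backward() fills in the .grad of any tensor that requires gradients.

import torch
from torch.nn import MSELoss

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # pretend these are model outputs
target = torch.tensor([1.0, 2.0, 5.0])

loss_mse = MSELoss()(x, target)
loss_mse.backward()  # backpropagation: populates x.grad
print(x.grad)        # tensor([ 0.0000,  0.0000, -1.3333]), i.e. 2*(x - target)/3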
Optimizers
Purpose
An optimizer is an algorithm that adjusts a neural network's parameters (such as weights and biases) to minimize the value of the loss function.
Official documentation
https://docs.pytorch.org/docs/stable/optim.html
Examples
SGD
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
Adam
optimizer = optim.Adam([var1, var2], lr=0.0001)
lr: the learning rate
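The constructors above only create the optimizer; the usual update loop looks like the sketch below (the model, data, and criterion here are hypothetical stand-ins, not from the original notes):

import torch
from torch import nn, optim

model = nn.Linear(4, 1)                  # hypothetical tiny model
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

inputs = torch.randn(8, 4)               # hypothetical batch
targets = torch.randn(8, 1)

for step in range(5):
    optimizer.zero_grad()                # clear the previous step's gradients
    loss = criterion(model(inputs), targets)
    loss.backward()                      # compute new gradients
    optimizer.step()                     # update weights and biases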
Hands-on: building a CIFAR-10 model
Build the CIFAR-10 model according to the architecture diagram, linked here:
https://img-blog.csdnimg.cn/f217ce07c45f4c7c930b36f24e1b695d.png
The padding for each convolutional layer in the diagram can be computed from the Conv2d signature: (in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
The output-shape formula is here: https://docs.pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d
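For example, the first layer must map a 32×32 input to a 32×32 output with kernel_size=5 and stride=1; plugging these into the H_out formula from the Conv2d docs gives padding=2. A quick sanity check:

# H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
h_in, kernel_size, stride, dilation, padding = 32, 5, 1, 1, 2
h_out = (h_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
print(h_out)  # 32, so padding=2 preserves the 32x32 spatial size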
import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear

class Myx(nn.Module):
    def __init__(self):
        super(Myx, self).__init__()
        self.conv1 = Conv2d(3, 32, 5, padding=2)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(32, 32, 5, padding=2)
        self.maxpool2 = MaxPool2d(2)
        self.conv3 = Conv2d(32, 64, 5, padding=2)
        self.maxpool3 = MaxPool2d(2)
        self.flatten = Flatten()
        self.linear1 = Linear(1024, 64)
        self.linear2 = Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        x = self.maxpool3(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.linear2(x)
        return x
myx=Myx()
print(myx)
Myx(
  (conv1): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (maxpool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (maxpool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv3): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (maxpool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear1): Linear(in_features=1024, out_features=64, bias=True)
  (linear2): Linear(in_features=64, out_features=10, bias=True)
)
Checking that the network runs
import torch
input=torch.ones(64,3,32,32)
output=myx(input)
print(output.shape)
torch.Size([64, 10])
With an input of shape (64, 3, 32, 32), the output is (64, 10): each 3×32×32 image yields 10 class scores. (After three 2×2 poolings the 32×32 feature map shrinks to 4×4, so Flatten produces 64×4×4 = 1024 features, matching linear1's in_features.)
Wrapping the model with Sequential
Official documentation
https://docs.pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential
Example
from torch.nn import Sequential

class Myx(nn.Module):
    def __init__(self):
        super(Myx, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x
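A quick check, mirroring the earlier one, confirms the Sequential version behaves identically:

myx = Myx()
input = torch.ones(64, 3, 32, 32)
output = myx(input)
print(output.shape)  # torch.Size([64, 10]), same as the layer-by-layer version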
Complete model
import torch
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear
from torch.nn import Sequential
from torch.utils.data import DataLoader
from torch.nn import CrossEntropyLoss

class Myx(nn.Module):  # build the network
    def __init__(self):
        super(Myx, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x
dataset = torchvision.datasets.CIFAR10(root=r"D:\myx\learn_pytorch\.dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)  # download the dataset
dataloader = DataLoader(dataset, batch_size=1)  # load the dataset

myx = Myx()  # instantiate the network
loss = CrossEntropyLoss()  # loss function
optim = torch.optim.SGD(myx.parameters(), lr=0.01)  # optimizer; pass the model's parameters directly

for epoch in range(20):
    total_loss = 0.0
    for data in dataloader:
        imgs, labels = data
        output = myx(imgs)
        loss_cross = loss(output, labels)
        optim.zero_grad()      # reset gradients to zero; if they are not cleared, PyTorch adds this step's gradients to the previous ones
        loss_cross.backward()  # backpropagate the loss to compute gradients
        optim.step()           # let the optimizer adjust the parameters using the gradients
        total_loss += loss_cross
    print(total_loss)
tensor(18737.3340, grad_fn=<AddBackward0>)
tensor(16176.0625, grad_fn=<AddBackward0>)
tensor(15556.8574, grad_fn=<AddBackward0>)
...
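The printed totals carry a grad_fn because total_loss accumulates loss tensors that are still attached to the autograd graph. A common variant of the loop, sketched below, accumulates a plain Python float instead:

for epoch in range(20):
    total_loss = 0.0
    for data in dataloader:
        imgs, labels = data
        output = myx(imgs)
        loss_cross = loss(output, labels)
        optim.zero_grad()
        loss_cross.backward()
        optim.step()
        total_loss += loss_cross.item()  # .item() returns a detached Python float
    print(total_loss)  # now prints a plain float with no grad_fn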