【Deep Learning Basics】Study Notes on 小土堆's PyTorch Tutorial
I recently needed to get started with deep learning, so I chose 小土堆's PyTorch tutorial. Through it I learned the basics of PyTorch, followed along with the videos to build my own neural network, and verified it on a dataset. This post is a summary written after finishing the series, shared as-is; corrections are welcome.
Tutorial link: PyTorch深度学习快速入门教程(绝对通俗易懂!)【小土堆】_哔哩哔哩_bilibili
Overview
As I understand it, "training a model", as we often say in day-to-day projects, essentially means building a model, optimizing its parameters on a training set, and then verifying it on a validation set.
I therefore break the process down into the following steps, which also serve as the section headings below:
1. Loading the data
2. Defining the model
3. Training and validation
4. Complete code
1. Loading the data
The data-loading code is as follows:
```python
# Load the datasets
train_set = torchvision.datasets.CIFAR10(root='./dataset', train=True,
                                         download=True, transform=torchvision.transforms.ToTensor())
test_set = torchvision.datasets.CIFAR10(root='./dataset', train=False,
                                        download=True, transform=torchvision.transforms.ToTensor())

# DataLoaders
train_dataloader = DataLoader(train_set, batch_size=64)
test_dataloader = DataLoader(test_set, batch_size=64)
```
 
This code involves the following points:
1. torchvision's datasets module loads the dataset and saves it under the given path, while transforms.ToTensor does an initial conversion of each image into a tensor.
2. DataLoader groups the data into batches of 64, which makes it convenient to load batches during the training and validation loops.
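The batching behavior described above is easy to check. Here is a minimal sketch that uses a synthetic TensorDataset in place of CIFAR-10 (so nothing needs to be downloaded) to show that a DataLoader with batch_size=64 yields batches shaped (64, 3, 32, 32):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Synthetic stand-in for CIFAR-10: 256 RGB images of 32x32 with labels 0-9
images = torch.rand(256, 3, 32, 32)
labels = torch.randint(0, 10, (256,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=64)

# Each iteration yields one batch of (images, labels)
imgs, targets = next(iter(loader))
print(imgs.shape)     # torch.Size([64, 3, 32, 32])
print(targets.shape)  # torch.Size([64])
```

With 256 samples and a batch size of 64, the loader produces exactly 4 batches per epoch.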
2. Defining the model
The model-definition code is as follows:
```python
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(32, 32, kernel_size=5, stride=1, padding=2),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.MaxPool2d(kernel_size=2),
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 64),
            nn.Linear(64, 10),
        )

    def forward(self, x):
        x = self.model(x)
        return x
```
Model definitions follow this template: the network structure is defined in __init__, and the input-to-output transformation is defined in forward.
A few points to note about the __init__ part:
1. Conv2d: a convolutional layer; it extracts local features from the input, enriching what the network can observe about the data.
2. MaxPool2d: a pooling layer; it downsamples the feature maps, reducing the amount of data and speeding up training.
3. Linear: a fully connected layer; it maps the extracted features down to the final class scores.
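The in_features of the first Linear layer (64*4*4) comes from tracing shapes through the layers above: each 5x5 convolution with padding=2 preserves the spatial size, and each MaxPool2d halves it, so 32 → 16 → 8 → 4. A quick sketch that verifies this with a dummy input:

```python
import torch
from torch import nn

# Just the convolutional part of the network, to inspect the output shape
conv_part = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2),  # 32x32 -> 32x32
    nn.MaxPool2d(kernel_size=2),                           # 32x32 -> 16x16
    nn.Conv2d(32, 32, kernel_size=5, stride=1, padding=2),
    nn.MaxPool2d(kernel_size=2),                           # 16x16 -> 8x8
    nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
    nn.MaxPool2d(kernel_size=2),                           # 8x8 -> 4x4
)

x = torch.rand(1, 3, 32, 32)        # one dummy CIFAR-10-sized image
features = conv_part(x)
print(features.shape)               # torch.Size([1, 64, 4, 4])
print(features.flatten(1).shape[1]) # 1024, i.e. 64*4*4
```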
3. Training and validation
```python
for i in range(epoch):
    print("----- Epoch {} starts -----".format(i))
    net.train()
    for train_data in train_dataloader:
        imgs, targets = train_data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = net(imgs)
        loss = loss_fn(outputs, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_train_step += 1
        if total_train_step % 100 == 0:
            end_time = time.time()
            print("Finished {} training steps, loss: {}, elapsed: {}".format(
                total_train_step, loss.item(), end_time - start_time))
            writer.add_scalar('train_loss', loss.item(), total_train_step)

    # Evaluation on the test set
    total_test_loss = 0
    total_test_accuracy = 0
    net.eval()
    with torch.no_grad():
        for test_data in test_dataloader:
            imgs, targets = test_data
            imgs = imgs.to(device)
            targets = targets.to(device)
            outputs = net(imgs)
            loss = loss_fn(outputs, targets)
            total_test_loss += loss.item()
            accuracy = (outputs.argmax(1) == targets).sum()
            total_test_accuracy += accuracy.item()
    total_test_step += 1
    # print("Total loss on the test set: {}".format(total_test_loss))
    print("Accuracy on the test set: {}".format(total_test_accuracy / test_length))
    writer.add_scalar('test_loss', total_test_loss, total_test_step)
```
Training follows this flow: take a batch of data → feed it into the model to get outputs → compute the loss between the outputs and the targets → clear the gradients from the previous step → backpropagate → update the parameters.
During validation, gradients are no longer needed: the trained model simply processes the validation data, and the resulting outputs are used to compute the relevant metrics.
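The two flows above can be condensed into a minimal sketch. This uses a toy linear model on random data (not the CIFAR-10 network) purely to show the training steps and the no-grad evaluation side by side:

```python
import torch
from torch import nn, optim

model = nn.Linear(4, 2)                  # toy model standing in for the CNN
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)

imgs = torch.rand(8, 4)                  # "take a batch of data"
targets = torch.randint(0, 2, (8,))

# Training step: forward -> loss -> zero grad -> backward -> optimizer step
model.train()
outputs = model(imgs)
loss = loss_fn(outputs, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Validation: no gradients, just a forward pass and a metric
model.eval()
with torch.no_grad():
    outputs = model(imgs)
    accuracy = (outputs.argmax(1) == targets).sum().item() / len(targets)
print(loss.item(), accuracy)
```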
4. Complete code
```python
import time

import torch
import torchvision
from torch import optim, nn
from torch.nn import CrossEntropyLoss
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# from p23_model import *
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(32, 32, kernel_size=5, stride=1, padding=2),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.MaxPool2d(kernel_size=2),
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 64),
            nn.Linear(64, 10),
        )

    def forward(self, x):
        x = self.model(x)
        return x

# Training device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
start_time = time.time()

# Load the datasets
train_set = torchvision.datasets.CIFAR10(root='./dataset', train=True,
                                         download=True, transform=torchvision.transforms.ToTensor())
test_set = torchvision.datasets.CIFAR10(root='./dataset', train=False,
                                        download=True, transform=torchvision.transforms.ToTensor())

# Dataset sizes
train_length = len(train_set)
test_length = len(test_set)
print("Training set size: {}, test set size: {}".format(train_length, test_length))

# DataLoaders
train_dataloader = DataLoader(train_set, batch_size=64)
test_dataloader = DataLoader(test_set, batch_size=64)

# Create the model
net = Net()
net = net.to(device)

# Loss function
loss_fn = CrossEntropyLoss()
loss_fn = loss_fn.to(device)

# Optimizer
learning_rate = 1e-2
optimizer = optim.SGD(net.parameters(), lr=learning_rate)

# TensorBoard writer
writer = SummaryWriter("p23_board")

# Training
total_train_step = 0
total_test_step = 0
epoch = 10
for i in range(epoch):
    print("----- Epoch {} starts -----".format(i))
    net.train()
    for train_data in train_dataloader:
        imgs, targets = train_data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = net(imgs)
        loss = loss_fn(outputs, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_train_step += 1
        if total_train_step % 100 == 0:
            end_time = time.time()
            print("Finished {} training steps, loss: {}, elapsed: {}".format(
                total_train_step, loss.item(), end_time - start_time))
            writer.add_scalar('train_loss', loss.item(), total_train_step)

    # Evaluation on the test set
    total_test_loss = 0
    total_test_accuracy = 0
    net.eval()
    with torch.no_grad():
        for test_data in test_dataloader:
            imgs, targets = test_data
            imgs = imgs.to(device)
            targets = targets.to(device)
            outputs = net(imgs)
            loss = loss_fn(outputs, targets)
            total_test_loss += loss.item()
            accuracy = (outputs.argmax(1) == targets).sum()
            total_test_accuracy += accuracy.item()
    total_test_step += 1
    # print("Total loss on the test set: {}".format(total_test_loss))
    print("Accuracy on the test set: {}".format(total_test_accuracy / test_length))
    writer.add_scalar('test_loss', total_test_loss, total_test_step)

writer.close()
```
Having trained our own model, let's now grab an arbitrary image, run it through the model, and see what it outputs:
```python
import torch
from PIL import Image
from torchvision import transforms

from p23_model import *

# Load the image to classify
img_path = "ship.jpg"
img = Image.open(img_path)
img = img.convert('RGB')

transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
])
img = transform(img)
img = torch.reshape(img, (1, 3, 32, 32))

# Load the trained model
model = torch.load("gpu_model_29.pth", map_location=torch.device('cpu'))
model.eval()

# Inference
with torch.no_grad():
    output = model(img)
print(output.argmax(dim=1))
```
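output.argmax(dim=1) only gives a class index; to turn it into a readable label, you can index into the list of CIFAR-10 class names (the order below is the standard CIFAR-10 ordering). The logits here are made up for illustration; in the real script they come from the trained model:

```python
import torch

# Standard CIFAR-10 class order (indices 0-9)
classes = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

# Pretend logits for one image; in the real script this is `output` from the model
output = torch.tensor([[0.1, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 2.5, 0.3]])
idx = output.argmax(dim=1).item()
print(classes[idx])  # → ship
```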
 
