
A Beginner's Path to Mastery, Part 4: AI from First Steps to Proficiency, PyTorch Custom Datasets (Part 2)

Source: original, 2025/5/30 18:36:50

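What this article covers

In the previous articles we discussed how to get data, transform it, and prepare a custom dataset. This article goes deeper, using detailed code examples to show how PyTorch custom datasets handle data processing in a range of real-world situations.

More specifically, we will cover the following: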

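Topic	Contents
7 Model 0: TinyVGG without data augmentation	At this point our data is ready, so let's build a model capable of fitting it. We'll also create training and testing functions to train and evaluate the model.
8 Exploring loss curves	Loss curves are a great way to see how your model is training and improving over time. They are also a good way to tell whether your model is overfitting or underfitting.
9 Model 1: TinyVGG with data augmentation	So far we've tried a model without data augmentation; how about one with it?
10 Comparing model results	Let's compare the loss curves of the different models, see which performed better, and discuss some options for improving performance.
11 Making a prediction on a custom image	Our model was trained on a dataset of pizza, steak and sushi images. In this section we'll cover how to use the trained model to predict on an image outside the existing dataset.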

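7 Model 0: TinyVGG without data augmentation

Alright, we've seen how to turn data from images in folders into transformed tensors.

Now let's build a computer vision model and see whether it can classify images as pizza, steak or sushi.

To begin, we'll start with a simple transform that only resizes images to (64, 64) and converts them to tensors.

7.1 Creating transforms and loading data for Model 0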

# Create simple transform: resize to 64x64, then convert to a tensor
from torchvision import transforms

simple_transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
])
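To make the effect of this transform concrete, here is a minimal pure-Python sketch of the layout and scaling change that `transforms.ToTensor()` performs: an image stored as height x width x channels with integer values in [0, 255] becomes channels x height x width with floats in [0.0, 1.0]. The helper `to_chw_float` and the 2x2 sample image are hypothetical and only mimic the behavior; this is not torchvision's implementation, and resizing is left out.

```python
def to_chw_float(image_hwc):
    """Mimic ToTensor's layout and scaling on nested lists (illustration only)."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    # Reorder H x W x C -> C x H x W and scale 0..255 integers to 0.0..1.0 floats.
    return [
        [[image_hwc[h][w][c] / 255 for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# Hypothetical 2x2 RGB image: each pixel is [R, G, B] with values in 0..255.
tiny_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

tensor_like = to_chw_float(tiny_image)
shape = (len(tensor_like), len(tensor_like[0]), len(tensor_like[0][0]))
print(shape)           # (3, 2, 2): channels come first
print(tensor_like[0])  # red channel: [[1.0, 0.0], [0.0, 1.0]]
```

With `transforms.Resize((64, 64))` applied first, the same conversion yields one tensor of shape [3, 64, 64] per image.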

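Great, now that we have a simple transform, let's:

1. Load the data, first turning each of the training and test folders into a Dataset with torchvision.datasets.ImageFolder().
2. Then turn each Dataset into a DataLoader with torch.utils.data.DataLoader().

We'll set batch_size=32 and set num_workers to the number of CPUs available on the machine (this depends on the machine you are using).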

# 1. Load and transform data
from torchvision import datasets

train_data_simple = datasets.ImageFolder(root=train_dir, transform=simple_transform)
test_data_simple = datasets.ImageFolder(root=test_dir, transform=simple_transform)

# 2. Turn data into DataLoaders
import os
from torch.utils.data import DataLoader

# Setup batch size and number of workers
BATCH_SIZE = 32
NUM_WORKERS = os.cpu_count()
print(f"Creating DataLoader's with batch size {BATCH_SIZE} and {NUM_WORKERS} workers.")

# Create DataLoader's
train_dataloader_simple = DataLoader(train_data_simple,
                                     batch_size=BATCH_SIZE,
                                     shuffle=True,
                                     num_workers=NUM_WORKERS)

test_dataloader_simple = DataLoader(test_data_simple,
                                    batch_size=BATCH_SIZE,
                                    shuffle=False,
                                    num_workers=NUM_WORKERS)

print(train_dataloader_simple, test_dataloader_simple)

Output:

Creating DataLoader's with batch size 32 and 16 workers.
<torch.utils.data.dataloader.DataLoader object at 0x0000024974F734D0> <torch.utils.data.dataloader.DataLoader object at 0x0000024974F07A80>
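As a rough check on what these settings imply, the sketch below computes how many batches one pass over a dataset produces for a given batch size, and picks a worker count with a fallback, since `os.cpu_count()` can return `None` when the count cannot be determined. The helper names and the sample dataset size of 100 are hypothetical and are not taken from this article's data.

```python
import math
import os


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches a DataLoader-style iterator yields in one epoch."""
    if drop_last:
        return num_samples // batch_size         # incomplete final batch is discarded
    return math.ceil(num_samples / batch_size)   # incomplete final batch is kept


def pick_num_workers(default: int = 0) -> int:
    """Use every CPU reported by the OS, or fall back if the count is unknown."""
    count = os.cpu_count()
    return count if count is not None else default


# Hypothetical dataset of 100 images with the batch size used above.
print(batches_per_epoch(100, 32))                  # 4 (three full batches plus one of 4 images)
print(batches_per_epoch(100, 32, drop_last=True))  # 3
print(pick_num_workers())                          # machine dependent
```

More workers are not always faster: starting worker processes has a cost, so a smaller value may be worth trying on a small dataset.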

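Great, the DataLoaders are created. Now let's build the model.

7.2 Creating the TinyVGG model class

In the previous article, we used the TinyVGG model from the CNN Explainer website.

Let's recreate the same model, except this time with color images instead of grayscale (for RGB pixels, in_channels=3 instead of in_channels=1).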
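To see what switching from grayscale to RGB changes inside the first convolution, here is a small sketch that counts the learnable parameters of a 2D convolution with the standard formula out_channels x in_channels x kernel_size^2, plus one bias per output channel. The helper `conv2d_param_count` is hypothetical; the 10 output channels and the 3x3 kernel match the settings used for this section's model.

```python
def conv2d_param_count(in_channels: int, out_channels: int, kernel_size: int, bias: bool = True) -> int:
    """Learnable parameters of a square-kernel 2D convolution (groups=1)."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


grayscale = conv2d_param_count(in_channels=1, out_channels=10, kernel_size=3)
rgb = conv2d_param_count(in_channels=3, out_channels=10, kernel_size=3)
print(grayscale)  # 100
print(rgb)        # 280
```

Only the first layer grows, because every later layer takes the 10 feature maps as input regardless of how many channels the image had.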

import torch
from torch import nn


class TinyVGG(nn.Module):
    """
    Model architecture copying TinyVGG from:
    https://poloclub.github.io/cnn-explainer/
    """
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int) -> None:
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,
                      out_channels=hidden_units,
                      kernel_size=3,  # how big is the square that's going over the image?
                      stride=1,  # default
                      padding=1),  # options = "valid" (no padding) or "same" (output has same shape as input) or int for specific number
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,
                         stride=2)  # default stride value is same as kernel_size
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # Where did this in_features shape come from?
            # It's because each layer of our network compresses and changes the shape of our input data.
            nn.Linear(in_features=hidden_units*16*16,
                      out_features=output_shape)
        )

    def forward(self, x: torch.Tensor):
        x = self.conv_block_1(x)
        # print(x.shape)
        x = self.conv_block_2(x)
        # print(x.shape)
        x = self.classifier(x)
        # print(x.shape)
        return x
        # return self.classifier(self.conv_block_2(self.conv_block_1(x)))  # <- leverage the benefits of operator fusion


torch.manual_seed(42)
model_0 = TinyVGG(input_shape=3,  # number of color channels (3 for RGB)
                  hidden_units=10,
                  output_shape=len(train_data.classes)).to(device)
print(model_0)

Output:

TinyVGG(
  (conv_block_1): Sequential(
    (0): Conv2d(3, 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): Conv2d(10, 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU()
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv_block_2): Sequential(
    (0): Conv2d(10, 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): Conv2d(10, 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU()
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=2560, out_features=3, bias=True)
  )
)
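The `in_features=2560` in the printed model can be reproduced by hand. The sketch below traces the spatial size of a 64x64 input through both blocks using the usual output-size formula floor((n + 2 * padding - kernel) / stride) + 1. The helper `out_size` is hypothetical; the layer settings are copied from the model definition.

```python
def out_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial dimension for a conv or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


hidden_units = 10
size = 64  # images are resized to 64x64 by simple_transform

for block in (1, 2):
    size = out_size(size, kernel=3, stride=1, padding=1)  # first Conv2d keeps the size
    size = out_size(size, kernel=3, stride=1, padding=1)  # second Conv2d keeps the size
    size = out_size(size, kernel=2, stride=2)             # MaxPool2d halves the size
    print(f"after conv_block_{block}: {hidden_units} x {size} x {size}")

in_features = hidden_units * size * size
print(in_features)  # 2560
```

If the input resolution changes, `in_features` has to change with it, which is why the commented-out `print(x.shape)` calls in `forward` are a handy way to confirm the shape on a real batch.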