DAY 43 Training
- Cat/Dog Classification and Visualization with PyTorch: Deep Learning Practice from Scratch
- Data Preprocessing and Loading: Building the Model's Foundation
- Model Construction and Optimization: The Core Deep Learning Implementation
- Training and Evaluation: Improving Model Performance
- Visual Exploration: Understanding the Model's Decisions
Cat/Dog Classification and Visualization with PyTorch: Deep Learning Practice from Scratch
Image classification is a cornerstone of computer vision and is driving intelligent transformation across many industries. This article walks through building and training a binary cat/dog classifier with PyTorch, then uses Grad-CAM to visualize what the model bases its decisions on, offering a practical entry point into deep learning.
Data Preprocessing and Loading: Building the Model's Foundation
Good preprocessing is the starting point for a successful model. We define separate pipelines for the training and test sets:
from torchvision import transforms

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
The training set is augmented with random resized crops and horizontal flips for diversity, then converted to tensors and normalized with the ImageNet channel statistics; the test set is only resized and center-cropped. PyTorch's ImageFolder and DataLoader load the local cat/dog dataset concisely:
from torchvision import datasets
from torch.utils.data import DataLoader

data_dir = 'path/to/cat_dog_dataset'  # set this to your dataset root (not given in the original snippet)
image_datasets = {x: datasets.ImageFolder(data_dir + '/' + x, data_transforms[x])
                  for x in ['train', 'test']}
dataloaders = {x: DataLoader(image_datasets[x], batch_size=4, shuffle=True, num_workers=0)
               for x in ['train', 'test']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'test']}  # used later for epoch metrics
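ImageFolder derives class labels from subdirectory names, so the dataset root is assumed to look like the sketch below (the exact folder names are illustrative; only the train/test split and one folder per class are required):

```text
data_dir/
├── train/
│   ├── cat/   # training images of cats
│   └── dog/   # training images of dogs
└── test/
    ├── cat/
    └── dog/
```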
Model Construction and Optimization: The Core Deep Learning Implementation
Starting from the well-established ResNet18 pretrained model, we adapt it to the binary cat/dog task:
from torchvision import models
import torch.nn as nn

model = models.resnet18(pretrained=True)  # newer torchvision prefers weights=models.ResNet18_Weights.DEFAULT
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)  # binary cat/dog classification
This keeps the backbone's feature-extraction power and replaces only the final fully connected layer. We pair a cross-entropy loss with SGD plus momentum:

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
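As a quick sanity check on the replaced head: ResNet18's pooled feature vector has 512 dimensions, so num_ftrs is 512 and the new layer is tiny. The arithmetic below is just that count, not a live model:

```python
# Parameter count of the replacement head nn.Linear(512, 2)
in_features, num_classes = 512, 2          # ResNet18's fc input size, two classes
weight_params = in_features * num_classes  # 512 * 2 = 1024 weights
bias_params = num_classes                  # one bias per class
print(weight_params + bias_params)         # 1026 trainable parameters in the head
```

Because the rest of the network keeps its pretrained weights, training mostly fine-tunes this small head plus slight adjustments to the backbone.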
Training and Evaluation: Improving Model Performance
A combined training-and-evaluation function controls the whole loop:
import torch

def train_and_evaluate():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)
    num_epochs = 5
    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)
        for phase in ['train', 'test']:
            if phase == 'train':
                model.train()
            else:
                model.eval()
            running_loss = 0.0
            running_corrects = 0
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)
                optimizer.zero_grad()
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                # weight each batch's mean loss by its size so the epoch loss
                # is a true per-sample average (the last batch may be smaller)
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            print(f'{phase} Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')
    print('Training complete')
Over several epochs, the loop tracks loss and accuracy for both phases, gradually improving the model.
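The epoch loss printed above is a sample-weighted average of batch losses, which matters when the last batch is smaller than the rest. A minimal sketch with made-up numbers (both the batch losses and sizes are hypothetical):

```python
# Epoch loss = sum(batch_mean_loss * batch_size) / total_samples
batch_losses = [0.70, 0.65, 0.40]  # hypothetical mean loss per batch
batch_sizes = [4, 4, 2]            # last batch is smaller
weighted_sum = sum(l * n for l, n in zip(batch_losses, batch_sizes))
epoch_loss = weighted_sum / sum(batch_sizes)
print(round(epoch_loss, 2))  # 0.62, not the unweighted mean 0.583
```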
Visual Exploration: Understanding the Model's Decisions
Grad-CAM lets us see which image regions drive the model's predictions:
import matplotlib.pyplot as plt
from torchcam.methods import GradCAM
from torchcam.utils import overlay_mask

def visualize_results():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()
    # hook the last residual block; its activations drive the heat map
    cam_extractor = GradCAM(model, 'layer4')
    inputs, labels = next(iter(dataloaders['test']))
    input_tensor = inputs[0].unsqueeze(0).to(device)
    out = model(input_tensor)
    # extract the class-activation map for the predicted class
    activation_map = cam_extractor(out.squeeze(0).argmax().item(), out)
    # undo ImageNet normalization so the original image can be displayed
    inv_normalize = transforms.Normalize(
        mean=[-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225],
        std=[1 / 0.229, 1 / 0.224, 1 / 0.225])
    img = inv_normalize(input_tensor[0].cpu())
    img_pil = transforms.ToPILImage()(img)
    activation_map_pil = transforms.ToPILImage()(activation_map[0].squeeze(0).cpu())
    result = overlay_mask(img_pil, activation_map_pil, alpha=0.5)
    plt.imshow(result)
    plt.axis('off')
    plt.show()
The image is de-normalized back to its original colors and overlaid with the heat map, highlighting the regions that mattered most to the prediction.
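The de-normalization trick works because Normalize computes (x - mean) / std per channel, so applying Normalize again with mean' = -mean/std and std' = 1/std recovers the original value. A scalar check using one channel's ImageNet constants and an arbitrary pixel value:

```python
mean, std = 0.485, 0.229      # ImageNet red-channel statistics
x = 0.8                       # arbitrary original pixel value in [0, 1]
y = (x - mean) / std          # forward normalization
inv_mean, inv_std = -mean / std, 1 / std
restored = (y - inv_mean) / inv_std  # Normalize applied again with inverted parameters
print(round(restored, 6))     # 0.8
```

Algebraically, (y + mean/std) * std = y * std + mean, which exactly inverts the forward transform.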
@浙大疏锦行