当前位置：首页 > news >正文

深度学习入门-05

news 2025/10/26 16:58:23

基于小土堆学习
优化器是用于优化损失函数的；

常见的Transforms：

在这里插入图片描述
需要注意不同的数据格式；

class Compose:
    """Composes several transforms together. This transform does not support torchscript.
    Please, see the note below.

    Args:
        transforms (list of ``Transform`` objects): list of transforms to compose.

    Example:
        >>> transforms.Compose([
        >>>     transforms.CenterCrop(10),
        >>>     transforms.PILToTensor(),
        >>>     transforms.ConvertImageDtype(torch.float),
        >>> ])

    .. note::
        In order to script the transformations, please use ``torch.nn.Sequential`` as below.

        >>> transforms = torch.nn.Sequential(
        >>>     transforms.CenterCrop(10),
        >>>     transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
        >>> )
        >>> scripted_transforms = torch.jit.script(transforms)

        Make sure to use only scriptable transformations, i.e. that work with ``torch.Tensor``, does not require
        `lambda` functions or ``PIL.Image``.

    """

    def __init__(self, transforms):
        if not torch.jit.is_scripting() and not torch.jit.is_tracing():
            _log_api_usage_once(self)
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img

    def __repr__(self) -> str:
        format_string = self.__class__.__name__ + "("
        for t in self.transforms:
            format_string += "\n"
            format_string += f"    {t}"
        format_string += "\n)"
        return format_string

这段代码定义了一个名为Compose的类，它用于将多个图像变换（transforms）组合在一起。这个类不是为了支持torchscript而设计的。Compose类接收一个变换对象列表作为输入，并允许按顺序对这些变换进行组合和执行。

构造函数__init__接收一个变换列表transforms，并将这个列表存储在实例变量self.transforms中。如果当前不是在torchscript的脚本化或追踪模式下，构造函数还会调用_log_api_usage_once(self)函数来记录API的使用情况（尽管这个函数在代码段中没有给出定义）。

__call__方法允许Compose实例像函数一样被调用。它遍历self.transforms中的每个变换，依次将图像img传递给这些变换，并返回最终的变换结果。

__repr__方法定义了Compose实例的字符串表示形式。它返回一个格式化的字符串，其中包含了Compose类名和所有组合的变换的字符串表示。

示例用法展示了如何使用Compose来组合多个图像变换，比如中心裁剪、将PIL图像转换为张量、以及转换图像数据类型。同时，注释也指出了如何使用torch.nn.Sequential来替代Compose进行脚本化变换，以及在使用torch.jit.script时需要注意的事项。

Compose 类在图像处理库中（如 torchvision.transforms）非常常见，它用于将多个图像变换步骤组合成一个单一的可调用对象。以下是一个使用 Compose 类的示例，假设我们已经在使用 PyTorch 和 torchvision：

import torchvision.transforms as transforms  
from PIL import Image  
  
# 定义一系列图像变换操作  
transform_list = [  
    transforms.Resize((256, 256)),  # 将图像大小调整为 256x256  
    transforms.CenterCrop(224),     # 从图像中心裁剪出 224x224 的区域  
    transforms.ToTensor(),          # 将 PIL 图像转换为 PyTorch 张量  
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # 标准化图像  
                         std=[0.229, 0.224, 0.225])  
]  
  
# 使用 Compose 类将这些变换组合起来  
transform = transforms.Compose(transform_list)  
  
# 加载一张 PIL 图像  
img = Image.open('path_to_your_image.jpg')  
  
# 应用组合好的变换  
transformed_img = transform(img)  
  
# 此时 transformed_img 是一个 PyTorch 张量，可以用于模型推理等后续步骤

在这个示例中，我们首先导入 torchvision.transforms 模块，并定义了一个包含多个图像变换操作的列表 transform_list。然后，我们使用 transforms.Compose 将这些操作组合成一个单一的变换对象 transform。最后，我们加载一张 PIL 图像，并应用这个组合好的变换，得到可以用于后续步骤的 PyTorch 张量 transformed_img。

class Person:
    def __call__(self, name):
        print("__call__"+"Hello"+"  "+name)

    def hello(self,name):
        print("hello"+"  "+name)
person = Person()
person("ZHANGSAN")
person.hello("lisis")

对比person调用函数的情况
结果如下

C:\Anaconda3\envs\pytorch_test\python.exe H:\Python\Test\01test\CallTest.py 
__call__Hello  ZHANGSAN
hello  lisis

输入函数名称后，在pycharm中单击ctrl+P，就可以提示出，当前需要输入什么参数；

1、 ToTensor使用

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

write = SummaryWriter("logs")
img = Image.open("dataset/train/bees/92663402_37f379e57a.jpg")
print(img)

#ToTensor()使用
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)

write.add_image("Tensor_img",img_tensor)
write.close()

在local中输入：

 tensorboard --logdir logs --port=6007

结果为：

(pytorch_test) PS H:\Python\Test> tensorboard --logdir logs --port=6007
TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.17.1 at http://localhost:6007/ (Press CTRL+C to quit)

单击6007网址，弹出：
在这里插入图片描述

class Normalize(torch.nn.Module):归一化使用

描述的是一个用于对张量图像进行标准化处理的变换操作。具体来说，它会根据给定的每个通道的均值（mean）和标准差（std），对输入张量图像的每个通道进行标准化处理。标准化处理的公式是：

output[channel] = (input[channel] - mean[channel]) / std[channel]

其中，output[channel] 表示输出张量在某个通道上的值，input[channel] 表示输入张量在该通道上的值，mean[channel] 和 std[channel] 分别表示该通道的均值和标准差。

这个变换操作不支持 PIL 图像，只能用于 PyTorch 的张量（torch.*Tensor）。
参数说明：

mean：一个序列，包含每个通道的均值。
std：一个序列，包含每个通道的标准差。
inplace：一个可选的布尔值，用于指定是否在原地（in-place）进行这个操作。如果设置为 True，则会直接在输入张量上进行修改，不会创建新的输出张量；如果设置为 False（默认值），则会创建一个新的输出张量，不会修改输入张量。

class Normalize(torch.nn.Module):
    """Normalize a tensor image with mean and standard deviation.
    This transform does not support PIL Image.
    Given mean: ``(mean[1],...,mean[n])`` and std: ``(std[1],..,std[n])`` for ``n``
    channels, this transform will normalize each channel of the input
    ``torch.*Tensor`` i.e.,
    ``output[channel] = (input[channel] - mean[channel]) / std[channel]``

    .. note::
        This transform acts out of place, i.e., it does not mutate the input tensor.

    Args:
        mean (sequence): Sequence of means for each channel.
        std (sequence): Sequence of standard deviations for each channel.
        inplace(bool,optional): Bool to make this operation in-place.

    """

    def __init__(self, mean, std, inplace=False):
        super().__init__()
        _log_api_usage_once(self)
        self.mean = mean
        self.std = std
        self.inplace = inplace

    def forward(self, tensor: Tensor) -> Tensor:
        """
        Args:
            tensor (Tensor): Tensor image to be normalized.

        Returns:
            Tensor: Normalized Tensor image.
        """
        return F.normalize(tensor, self.mean, self.std, self.inplace)

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}(mean={self.mean}, std={self.std})"

这段代码定义了一个名为 Normalize 的类，它是 torch.nn.Module 的子类，用于对张量图像进行标准化处理。这个类不接受 PIL 图像作为输入，只处理 PyTorch 的张量（torch.*Tensor）。

在初始化函数 init 中，它接收三个参数：mean、std 和 inplace。mean 和 std 分别是每个通道的均值和标准差，它们都是序列类型。inplace 是一个可选的布尔值，用于指定是否在原地（in-place）进行标准化操作。

forward 函数是类的核心，它接收一个张量作为输入，并返回标准化后的张量。在这个函数中，它调用了 F.normalize 方法来执行标准化操作。F.normalize 方法是 PyTorch 提供的一个函数，用于对张量进行标准化。它接收输入张量、均值、标准差和 inplace 参数，并返回标准化后的张量。

repr 函数返回了类的字符串表示，包括类名和初始化时设置的均值和标准差。

需要注意的是，代码中的 _log_api_usage_once(self) 调用似乎是一个自定义的函数，用于记录 API 的使用情况，但在代码片段中没有给出这个函数的定义。在实际使用中，你可能需要替换或删除这个调用，或者提供 _log_api_usage_once 函数的定义。

另外，F.normalize 函数在 PyTorch 中实际上并不直接提供按照均值和标准差进行标准化的功能。通常，标准化是通过计算 (input - mean) / std 来实现的，而不是使用 F.normalize。因此，这里的 F.normalize 调用可能是一个错误或者是一个自定义的函数。在实际应用中，你应该使用正确的标准化方法来实现这个功能。

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

write = SummaryWriter("logs")
img = Image.open("dataset/train/bees/92663402_37f379e57a.jpg")
print(img)

#ToTensor()使用
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)



print("img_tensor[0][0][0]:",img_tensor[0][0][0])
#class Normalize(torch.nn.Module):使用
trans_norm = transforms.Normalize([0.5,2,6],[0.5,3,0.5])
#此处的均值和方差为随机输入的，实际下可以计算影像的每个波段的均值和方差
img_norm = trans_norm(img_tensor)
#因为是RGB三通道，因此需要输入三个均值和三个标准差
print("img_norm[0][0][0]:",img_norm[0][0][0])
write.add_image("Tensor_img",img_tensor)
write.add_image("Norm",img_norm)
write.close()

代码运行结果：

C:\Anaconda3\envs\pytorch_test\python.exe H:\Python\Test\P10_Usefultransforms.py 
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x374 at 0x20E95332EC0>
img_tensor[0][0][0]: tensor(0.1451)
img_norm[0][0][0]: tensor(-0.7098)

进程已结束,退出代码0

TensorBoard结果：
在这里插入图片描述

Resize使用

函数介绍见前篇博文

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

write = SummaryWriter("logs")
img = Image.open("dataset/train/bees/92663402_37f379e57a.jpg")
print(img)

#ToTensor()使用
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)



print("img_tensor[0][0][0]:",img_tensor[0][0][0])

#class Normalize(torch.nn.Module):使用
trans_norm = transforms.Normalize([0.5,2,6],[0.5,3,0.5])
#此处的均值和方差为随机输入的，实际下可以计算影像的每个波段的均值和方差
img_norm = trans_norm(img_tensor)
#因为是RGB三通道，因此需要输入三个均值和三个标准差
print("img_norm[0][0][0]:",img_norm[0][0][0])

#Resize
print(img.size)
trans_resize = transforms.Resize((512,512))
img_resize = trans_resize(img_tensor)




write.add_image("Tensor_img",img_tensor)
write.add_image("Norm",img_norm)
write.add_image("img_resize ",img_resize )
write.close()

输出结果为

C:\Anaconda3\envs\pytorch_test\python.exe H:\Python\Test\P10_Usefultransforms.py 
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x374 at 0x1EC84B9B1C0>
img_tensor[0][0][0]: tensor(0.1451)
img_norm[0][0][0]: tensor(-0.7098)
(500, 374)

进程已结束,退出代码0

通过该处理，将(500, 374)的图像变换为了（512,512）

在这里插入图片描述

Compose使用

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

write = SummaryWriter("logs")
img = Image.open("dataset/train/bees/92663402_37f379e57a.jpg")
print(img)

#ToTensor()使用
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)



print("img_tensor[0][0][0]:",img_tensor[0][0][0])

#class Normalize(torch.nn.Module):使用
trans_norm = transforms.Normalize([0.5,2,6],[0.5,3,0.5])
#此处的均值和方差为随机输入的，实际下可以计算影像的每个波段的均值和方差
img_norm = trans_norm(img_tensor)
#因为是RGB三通道，因此需要输入三个均值和三个标准差
print("img_norm[0][0][0]:",img_norm[0][0][0])

#Resize
print("img.size:",img.size)
trans_resize = transforms.Resize((1024,512))
img_resize = trans_resize(img_tensor)
print("img_resize:",img_resize)

#Compose
trans_resize_2 = transforms.Resize((1024,600))
trans_compose = transforms.Compose([trans_resize_2 ,trans_totensor,trans_norm])#后一个的输入必须要和前一个的输出匹配
img_compose = trans_compose(img)

write.add_image("Compose",img_compose,3)
write.add_image("Tensor_img",img_tensor)
write.add_image("Norm",img_norm)
write.add_image("img_resize ",img_resize ,1)
write.close()

输出结果为：

C:\Anaconda3\envs\pytorch_test\python.exe H:\Python\Test\P10_Usefultransforms.py 
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x374 at 0x2419D6B35E0>
img_tensor[0][0][0]: tensor(0.1451)
img_norm[0][0][0]: tensor(-0.7098)
img.size: (500, 374)
img_resize: tensor([[[0.1451, 0.1716, 0.1799,  ..., 0.1317, 0.1692, 0.1843],
         [0.1466, 0.1711, 0.1785,  ..., 0.1310, 0.1692, 0.1851],
         [0.1581, 0.1674, 0.1674,  ..., 0.1256, 0.1694, 0.1908],
         ...,
         [0.3036, 0.2894, 0.2652,  ..., 0.3344, 0.3145, 0.3170],
         [0.3021, 0.2838, 0.2608,  ..., 0.3359, 0.3172, 0.3141],
         [0.3020, 0.2830, 0.2602,  ..., 0.3361, 0.3175, 0.3137]],

        [[0.2431, 0.2431, 0.2394,  ..., 0.2143, 0.2548, 0.2510],
         [0.2443, 0.2441, 0.2395,  ..., 0.2132, 0.2543, 0.2523],
         [0.2529, 0.2513, 0.2399,  ..., 0.2049, 0.2505, 0.2623],
         ...,
         [0.3275, 0.3044, 0.2740,  ..., 0.3261, 0.2996, 0.2967],
         [0.3361, 0.3033, 0.2726,  ..., 0.3274, 0.2980, 0.2909],
         [0.3373, 0.3032, 0.2724,  ..., 0.3276, 0.2978, 0.2902]],

        [[0.4000, 0.3849, 0.3548,  ..., 0.0872, 0.1664, 0.2118],
         [0.4013, 0.3856, 0.3554,  ..., 0.0863, 0.1664, 0.2133],
         [0.4113, 0.3915, 0.3597,  ..., 0.0796, 0.1668, 0.2247],
         ...,
         [0.2265, 0.2236, 0.2051,  ..., 0.2404, 0.2115, 0.2055],
         [0.2308, 0.2238, 0.2051,  ..., 0.2445, 0.2115, 0.2041],
         [0.2314, 0.2238, 0.2051,  ..., 0.2450, 0.2115, 0.2039]]])

进程已结束,退出代码0

在这里插入图片描述

随机裁剪RandomCrop

这段代码描述的是一个图像处理操作，即在给定的图像上随机位置进行裁剪。这个操作适用于torch张量图像，并且支持不同的裁剪参数和填充模式。下面是详细的解释：

size：指定裁剪后图像的大小。如果size是一个整数，那么裁剪的图像将是正方形（size x
size）。如果size是一个长度为1的序列，那么它将被解释为(size[0], size[0])，即正方形裁剪。
padding：可选参数，用于在图像边界上添加填充。如果提供了一个整数，所有边界都将使用相同的填充。如果提供了长度为2的序列，分别表示左右和上下边界的填充。如果提供了长度为4的序列，分别表示左、上、右、下边界的填充。在torchscript模式下，不支持单个整数的填充，应使用长度为1的序列。
pad_if_needed：布尔值参数，如果图像小于所需的大小，将自动添加填充以避免引发异常。由于裁剪是在填充之后进行的，所以填充看起来是在随机偏移处进行的。
fill：用于常数填充的像素填充值。默认是0。如果是一个长度为3的元组，它将分别用于填充R、G、B通道。这个值仅当padding_mode是constant时使用。对于torch张量，仅支持数字填充值；对于PIL图像，支持整数或元组值。
padding_mode：填充类型，应该是constant、edge、reflect或symmetric之一。默认是constant。
constant：使用fill指定的常数值进行填充。
edge：使用边缘的最后一个值进行填充。如果输入是5D torch张量，则最后3个维度将被填充，而不是最后2个。
reflect：使用图像的反射进行填充，但不重复边缘的最后一个值。
symmetric：使用图像的反射进行填充，重复边缘的最后一个值。

这个操作允许对图像进行灵活的裁剪和填充处理，适用于不同的图像处理需求。

class RandomCrop(torch.nn.Module):
    """Crop the given image at a random location.
    If the image is torch Tensor, it is expected
    to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions,
    but if non-constant padding is used, the input is expected to have at most 2 leading dimensions

    Args:
        size (sequence or int): Desired output size of the crop. If size is an
            int instead of sequence like (h, w), a square crop (size, size) is
            made. If provided a sequence of length 1, it will be interpreted as (size[0], size[0]).
        padding (int or sequence, optional): Optional padding on each border
            of the image. Default is None. If a single int is provided this
            is used to pad all borders. If sequence of length 2 is provided this is the padding
            on left/right and top/bottom respectively. If a sequence of length 4 is provided
            this is the padding for the left, top, right and bottom borders respectively.

            .. note::
                In torchscript mode padding as single int is not supported, use a sequence of
                length 1: ``[padding, ]``.
        pad_if_needed (boolean): It will pad the image if smaller than the
            desired size to avoid raising an exception. Since cropping is done
            after padding, the padding seems to be done at a random offset.
        fill (number or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of
            length 3, it is used to fill R, G, B channels respectively.
            This value is only used when the padding_mode is constant.
            Only number is supported for torch Tensor.
            Only int or tuple value is supported for PIL Image.
        padding_mode (str): Type of padding. Should be: constant, edge, reflect or symmetric.
            Default is constant.

            - constant: pads with a constant value, this value is specified with fill

            - edge: pads with the last value at the edge of the image.
              If input a 5D torch Tensor, the last 3 dimensions will be padded instead of the last 2

            - reflect: pads with reflection of image without repeating the last value on the edge.
              For example, padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode
              will result in [3, 2, 1, 2, 3, 4, 3, 2]

            - symmetric: pads with reflection of image repeating the last value on the edge.
              For example, padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode
              will result in [2, 1, 1, 2, 3, 4, 4, 3]
    """

    @staticmethod
    def get_params(img: Tensor, output_size: Tuple[int, int]) -> Tuple[int, int, int, int]:
        """Get parameters for ``crop`` for a random crop.

        Args:
            img (PIL Image or Tensor): Image to be cropped.
            output_size (tuple): Expected output size of the crop.

        Returns:
            tuple: params (i, j, h, w) to be passed to ``crop`` for random crop.
        """
        _, h, w = F.get_dimensions(img)
        th, tw = output_size

        if h < th or w < tw:
            raise ValueError(f"Required crop size {(th, tw)} is larger than input image size {(h, w)}")

        if w == tw and h == th:
            return 0, 0, h, w

        i = torch.randint(0, h - th + 1, size=(1,)).item()
        j = torch.randint(0, w - tw + 1, size=(1,)).item()
        return i, j, th, tw

    def __init__(self, size, padding=None, pad_if_needed=False, fill=0, padding_mode="constant"):
        super().__init__()
        _log_api_usage_once(self)

        self.size = tuple(_setup_size(size, error_msg="Please provide only two dimensions (h, w) for size."))

        self.padding = padding
        self.pad_if_needed = pad_if_needed
        self.fill = fill
        self.padding_mode = padding_mode

    def forward(self, img):
        """
        Args:
            img (PIL Image or Tensor): Image to be cropped.

        Returns:
            PIL Image or Tensor: Cropped image.
        """
        if self.padding is not None:
            img = F.pad(img, self.padding, self.fill, self.padding_mode)

        _, height, width = F.get_dimensions(img)
        # pad the width if needed
        if self.pad_if_needed and width < self.size[1]:
            padding = [self.size[1] - width, 0]
            img = F.pad(img, padding, self.fill, self.padding_mode)
        # pad the height if needed
        if self.pad_if_needed and height < self.size[0]:
            padding = [0, self.size[0] - height]
            img = F.pad(img, padding, self.fill, self.padding_mode)

        i, j, h, w = self.get_params(img, self.size)

        return F.crop(img, i, j, h, w)

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}(size={self.size}, padding={self.padding})"

这段代码定义了一个名为 RandomCrop 的 PyTorch 模块，用于在给定图像上执行随机裁剪操作。这个模块是继承自 torch.nn.Module，使得它可以作为神经网络的一部分。以下是详细的解释和示例：

解释
初始化 (init 方法):

size: 期望的输出裁剪大小。可以是一个整数（表示正方形裁剪），或者是一个包含两个整数的元组（表示高度和宽度）。
padding: 可选的填充参数，用于在图像边界上添加填充。可以是一个整数、一个长度为2的序列（表示左右和上下边界的填充），或者是一个长度为4的序列（表示左、上、右、下边界的填充）。
pad_if_needed: 布尔值，如果图像小于所需的大小，则自动添加填充以避免引发异常。
fill: 用于常数填充的像素填充值。默认是0。
padding_mode: 填充类型，可以是 constant、edge、reflect 或 symmetric。
get_params 方法:
计算随机裁剪的参数，包括裁剪的起始坐标和裁剪的高度和宽度。
forward 方法:
执行裁剪操作。首先，如果设置了填充，则对图像进行填充。然后，如果 pad_if_needed 为真且图像尺寸小于所需裁剪尺寸，则进一步填充图像。最后，使用 get_params 方法计算的裁剪参数对图像进行裁剪。

示例
假设我们有一个大小为 (3, 256, 256) 的图像（即3个通道，高度和宽度都是256），我们想随机裁剪出一个大小为 (128, 128) 的图像。

import torch  
from torchvision.transforms import RandomCrop  
  
# 创建一个随机裁剪模块，裁剪大小为128x128  
crop_transform = RandomCrop(size=(128, 128))  
  
# 假设我们有一个大小为 (3, 256, 256) 的图像  
image = torch.randn(3, 256, 256)  
  
# 应用裁剪  
cropped_image = crop_transform(image)  
  
# 输出裁剪后的图像大小  
print(cropped_image.shape)  # 应该输出 torch.Size([3, 128, 128])

示例代码：

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

write = SummaryWriter("logs")
img = Image.open("dataset/train/bees/92663402_37f379e57a.jpg")
print(img)

#ToTensor()使用
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)



print("img_tensor[0][0][0]:",img_tensor[0][0][0])

#class Normalize(torch.nn.Module):使用
trans_norm = transforms.Normalize([0.5,2,6],[0.5,3,0.5])
#此处的均值和方差为随机输入的，实际下可以计算影像的每个波段的均值和方差
img_norm = trans_norm(img_tensor)
#因为是RGB三通道，因此需要输入三个均值和三个标准差
print("img_norm[0][0][0]:",img_norm[0][0][0])

#Resize
print("img.size:",img.size)
trans_resize = transforms.Resize((1024,512))
img_resize = trans_resize(img_tensor)
print("img_resize:",img_resize)

#Compose
trans_resize_2 = transforms.Resize((1024,600))
trans_compose = transforms.Compose([trans_resize_2 ,trans_totensor,trans_norm])
img_compose = trans_compose(img)

#RandomCrop
trans_randomcrop = transforms.RandomCrop(size=(100,50))
img_randomcrop = trans_randomcrop(img_tensor)
shape = img_tensor.shape
print("img_resize shape:",shape)



write.add_image("img_randomcrop",img_randomcrop,0)
write.add_image("Compose",img_compose,3)
write.add_image("Tensor_img",img_tensor)
write.add_image("Norm",img_norm)
write.add_image("img_resize ",img_resize ,1)
write.close()

结果为：

C:\Anaconda3\envs\pytorch_test\python.exe H:\Python\Test\P10_Usefultransforms.py 
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x374 at 0x16586E878B0>
img_tensor[0][0][0]: tensor(0.1451)
img_norm[0][0][0]: tensor(-0.7098)
img.size: (500, 374)
img_resize: tensor([[[0.1451, 0.1716, 0.1799,  ..., 0.1317, 0.1692, 0.1843],
         [0.1466, 0.1711, 0.1785,  ..., 0.1310, 0.1692, 0.1851],
         [0.1581, 0.1674, 0.1674,  ..., 0.1256, 0.1694, 0.1908],
         ...,
         [0.3036, 0.2894, 0.2652,  ..., 0.3344, 0.3145, 0.3170],
         [0.3021, 0.2838, 0.2608,  ..., 0.3359, 0.3172, 0.3141],
         [0.3020, 0.2830, 0.2602,  ..., 0.3361, 0.3175, 0.3137]],

        [[0.2431, 0.2431, 0.2394,  ..., 0.2143, 0.2548, 0.2510],
         [0.2443, 0.2441, 0.2395,  ..., 0.2132, 0.2543, 0.2523],
         [0.2529, 0.2513, 0.2399,  ..., 0.2049, 0.2505, 0.2623],
         ...,
         [0.3275, 0.3044, 0.2740,  ..., 0.3261, 0.2996, 0.2967],
         [0.3361, 0.3033, 0.2726,  ..., 0.3274, 0.2980, 0.2909],
         [0.3373, 0.3032, 0.2724,  ..., 0.3276, 0.2978, 0.2902]],

        [[0.4000, 0.3849, 0.3548,  ..., 0.0872, 0.1664, 0.2118],
         [0.4013, 0.3856, 0.3554,  ..., 0.0863, 0.1664, 0.2133],
         [0.4113, 0.3915, 0.3597,  ..., 0.0796, 0.1668, 0.2247],
         ...,
         [0.2265, 0.2236, 0.2051,  ..., 0.2404, 0.2115, 0.2055],
         [0.2308, 0.2238, 0.2051,  ..., 0.2445, 0.2115, 0.2041],
         [0.2314, 0.2238, 0.2051,  ..., 0.2450, 0.2115, 0.2039]]])
img_resize shape: torch.Size([3, 374, 500])

进程已结束,退出代码0

在这里插入图片描述

可以用for循环进行随机裁剪：

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

write = SummaryWriter("logs")
img = Image.open("dataset/train/bees/92663402_37f379e57a.jpg")
print(img)

#ToTensor()使用
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)



print("img_tensor[0][0][0]:",img_tensor[0][0][0])

#class Normalize(torch.nn.Module):使用
trans_norm = transforms.Normalize([0.5,2,6],[0.5,3,0.5])
#此处的均值和方差为随机输入的，实际下可以计算影像的每个波段的均值和方差
img_norm = trans_norm(img_tensor)
#因为是RGB三通道，因此需要输入三个均值和三个标准差
print("img_norm[0][0][0]:",img_norm[0][0][0])

#Resize
print("img.size:",img.size)
trans_resize = transforms.Resize((1024,512))
img_resize = trans_resize(img_tensor)
print("img_resize:",img_resize)

#Compose
trans_resize_2 = transforms.Resize((1024,600))
trans_compose = transforms.Compose([trans_resize_2 ,trans_totensor,trans_norm])
img_compose = trans_compose(img)

#RandomCrop
trans_randomcrop = transforms.RandomCrop(size=(100,50))
for i in range(12):
    img_randomcrop = trans_randomcrop(img_tensor)
    shape = img_tensor.shape
    print("img_resize shape:", shape)
    write.add_image("img_randomcrop", img_randomcrop, i)






write.add_image("Compose",img_compose,3)
write.add_image("Tensor_img",img_tensor)
write.add_image("Norm",img_norm)
write.add_image("img_resize ",img_resize ,1)
write.close()

结果为：

C:\Anaconda3\envs\pytorch_test\python.exe H:\Python\Test\P10_Usefultransforms.py 
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x374 at 0x1F771D07970>
img_tensor[0][0][0]: tensor(0.1451)
img_norm[0][0][0]: tensor(-0.7098)
img.size: (500, 374)
img_resize: tensor([[[0.1451, 0.1716, 0.1799,  ..., 0.1317, 0.1692, 0.1843],
         [0.1466, 0.1711, 0.1785,  ..., 0.1310, 0.1692, 0.1851],
         [0.1581, 0.1674, 0.1674,  ..., 0.1256, 0.1694, 0.1908],
         ...,
         [0.3036, 0.2894, 0.2652,  ..., 0.3344, 0.3145, 0.3170],
         [0.3021, 0.2838, 0.2608,  ..., 0.3359, 0.3172, 0.3141],
         [0.3020, 0.2830, 0.2602,  ..., 0.3361, 0.3175, 0.3137]],

        [[0.2431, 0.2431, 0.2394,  ..., 0.2143, 0.2548, 0.2510],
         [0.2443, 0.2441, 0.2395,  ..., 0.2132, 0.2543, 0.2523],
         [0.2529, 0.2513, 0.2399,  ..., 0.2049, 0.2505, 0.2623],
         ...,
         [0.3275, 0.3044, 0.2740,  ..., 0.3261, 0.2996, 0.2967],
         [0.3361, 0.3033, 0.2726,  ..., 0.3274, 0.2980, 0.2909],
         [0.3373, 0.3032, 0.2724,  ..., 0.3276, 0.2978, 0.2902]],

        [[0.4000, 0.3849, 0.3548,  ..., 0.0872, 0.1664, 0.2118],
         [0.4013, 0.3856, 0.3554,  ..., 0.0863, 0.1664, 0.2133],
         [0.4113, 0.3915, 0.3597,  ..., 0.0796, 0.1668, 0.2247],
         ...,
         [0.2265, 0.2236, 0.2051,  ..., 0.2404, 0.2115, 0.2055],
         [0.2308, 0.2238, 0.2051,  ..., 0.2445, 0.2115, 0.2041],
         [0.2314, 0.2238, 0.2051,  ..., 0.2450, 0.2115, 0.2039]]])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])
img_resize shape: torch.Size([3, 374, 500])

进程已结束,退出代码0