torch.nn.Sequential() and torch.nn.ModuleList()
- 1. `torch.nn.Sequential()`
- 2. `torch.nn.ModuleList()`
- 3. What’s the difference between a `torch.nn.Sequential()` and a `torch.nn.ModuleList()`?
- References
1. torch.nn.Sequential()
https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html
torch.nn.Sequential(*args: Module)
torch.nn.Sequential(arg: OrderedDict[str, Module])
A sequential container.
Modules will be added to it in the order they are passed in the constructor. Alternatively, an OrderedDict of modules can be passed in. The forward() method of torch.nn.Sequential() accepts any input and forwards it to the first module it contains. It then "chains" outputs to inputs sequentially for each subsequent module, finally returning the output of the last module.
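With torch.nn.Sequential(), you do not have to propagate the input manually, layer by layer, inside forward(). The layers of a network defined with torch.nn.Sequential() are cascaded in the order in which they are defined, so the output of each layer must be compatible with the input expected by the next one.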
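The chaining described above can be checked with a minimal sketch. The layer sizes mirror the example used later in this post, while the 28x28 input size and the variable names are assumptions made only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Layers chosen so that each output shape matches the next layer's input shape.
conv1 = nn.Conv2d(1, 20, 5)
relu1 = nn.ReLU()
conv2 = nn.Conv2d(20, 64, 5)
relu2 = nn.ReLU()

seq = nn.Sequential(conv1, relu1, conv2, relu2)

# Assumed input: a batch of 2 single-channel 28x28 images.
x = torch.randn(2, 1, 28, 28)

# Manual chaining: what forward() would otherwise have to spell out.
manual = relu2(conv2(relu1(conv1(x))))

# nn.Sequential performs the same chaining internally.
chained = seq(x)

print(chained.shape)  # torch.Size([2, 64, 20, 20])
print(torch.allclose(manual, chained))
```

If the shapes did not line up (for example, if the second convolution expected 32 input channels instead of 20), the call to seq(x) would raise a runtime error at that layer.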
The value a torch.nn.Sequential() provides over manually calling a sequence of modules is that it allows treating the whole container as a single module, such that performing a transformation on the torch.nn.Sequential() applies to each of the modules it stores (which are each a registered submodule of the torch.nn.Sequential()).
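As a rough illustration of "treating the whole container as a single module", the sketch below applies two transformations to a container and checks that they reach every stored submodule. The layer sizes and the zero_bias helper are hypothetical and not part of the original post.

```python
import torch
from torch import nn

seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# One call on the container converts the parameters of every registered submodule.
seq.to(torch.float64)
dtypes = {p.dtype for p in seq.parameters()}
print(dtypes)  # {torch.float64}


def zero_bias(module):
    """Hypothetical helper: zero the bias of every nn.Linear it visits."""
    if isinstance(module, nn.Linear):
        nn.init.zeros_(module.bias)


# apply() calls the function on every submodule and on the container itself.
seq.apply(zero_bias)
```

Calling the same transformations on each layer by hand would give the same result; the container simply saves the bookkeeping.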
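torch.nn.Sequential() is an ordered container: the neural network modules are added, and later executed, in the order in which they are passed to the constructor.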
Using torch.nn.Sequential() to create a small model. When submodel is run, input will first be passed to nn.Conv2d(1, 20, 5). The output of nn.Conv2d(1, 20, 5) will be used as the input to the first nn.ReLU; the output of the first nn.ReLU will become the input for nn.Conv2d(20, 64, 5). Finally, the output of nn.Conv2d(20, 64, 5) will be used as input to the second nn.ReLU.
#!/usr/bin/env python
# coding=utf-8

import torch
from torch import nn

device = "cpu"
print(f"torch.__version__ = {torch.__version__}, torch.accelerator.current_accelerator().type = {torch.accelerator.current_accelerator().type}")
print(f"device = {device}\n")


class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()

        # Layers run in the order listed here.
        self.submodel = nn.Sequential(
            nn.Conv2d(1, 20, 5),
            nn.ReLU(),
            nn.Conv2d(20, 64, 5),
            nn.ReLU(),
        )

    def forward(self, input):
        output = self.submodel(input)
        return output


model = NeuralNetwork().to(device)
print(f"Model structure: {model}\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()}\n")
/home/yongqiang/miniconda3/bin/python /home/yongqiang/stable_diffusion_work/stable_diffusion_diffusers/yongqiang.py
torch.__version__ = 2.6.0+cu124, torch.accelerator.current_accelerator().type = cuda
device = cpu

Model structure: NeuralNetwork(
  (submodel): Sequential(
    (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU()
    (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
    (3): ReLU()
  )
)

Layer: submodel.0.weight | Size: torch.Size([20, 1, 5, 5])

Layer: submodel.0.bias | Size: torch.Size([20])

Layer: submodel.2.weight | Size: torch.Size([64, 20, 5, 5])

Layer: submodel.2.bias | Size: torch.Size([64])

Process finished with exit code 0
Using torch.nn.Sequential() with OrderedDict. This is functionally the same as the above code.
#!/usr/bin/env python
# coding=utf-8

from collections import OrderedDict

import torch
from torch import nn

device = "cpu"
print(f"torch.__version__ = {torch.__version__}, torch.accelerator.current_accelerator().type = {torch.accelerator.current_accelerator().type}")
print(f"device = {device}\n")


class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()

        # Each layer gets an explicit name instead of an integer index.
        self.submodel = nn.Sequential(OrderedDict([
            ('conv1', nn.Conv2d(1, 20, 5)),
            ('relu1', nn.ReLU()),
            ('conv2', nn.Conv2d(20, 64, 5)),
            ('relu2', nn.ReLU())
        ]))

    def forward(self, input):
        output = self.submodel(input)
        return output


model = NeuralNetwork().to(device)
print(f"Model structure: {model}\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()}\n")
/home/yongqiang/miniconda3/bin/python /home/yongqiang/stable_diffusion_work/stable_diffusion_diffusers/yongqiang.py
torch.__version__ = 2.6.0+cu124, torch.accelerator.current_accelerator().type = cuda
device = cpu

Model structure: NeuralNetwork(
  (submodel): Sequential(
    (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
    (relu1): ReLU()
    (conv2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
    (relu2): ReLU()
  )
)

Layer: submodel.conv1.weight | Size: torch.Size([20, 1, 5, 5])

Layer: submodel.conv1.bias | Size: torch.Size([20])

Layer: submodel.conv2.weight | Size: torch.Size([64, 20, 5, 5])

Layer: submodel.conv2.bias | Size: torch.Size([64])

Process finished with exit code 0
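The only visible difference between the two constructions is how the submodules are named. The short sketch below, using a shortened two-layer container chosen only for illustration, compares the parameter names and shows that a named submodule is still reachable by integer index.

```python
from collections import OrderedDict

from torch import nn

positional = nn.Sequential(nn.Conv2d(1, 20, 5), nn.ReLU())
named = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(1, 20, 5)),
    ("relu1", nn.ReLU()),
]))

positional_keys = list(positional.state_dict().keys())
named_keys = list(named.state_dict().keys())
print(positional_keys)  # ['0.weight', '0.bias']
print(named_keys)       # ['conv1.weight', 'conv1.bias']

# A named submodule is available as an attribute and still by position.
same_object = named.conv1 is named[0]
print(same_object)  # True
```

Because the parameter names differ, a state_dict saved from one form will not load into the other without renaming its keys.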
- append(module)
Append a given module to the end.
Parameters
module (nn.Module) - module to append
Return type
Sequential
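A small sketch of append() on a torch.nn.Sequential(), assuming a PyTorch version recent enough to provide the method. The layer sizes are arbitrary.

```python
from torch import nn

seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU())

# append() registers the module under the next integer name and returns the container.
returned = seq.append(nn.Linear(8, 2))

names = [name for name, _ in seq.named_children()]
print(names)            # ['0', '1', '2']
print(returned is seq)  # True
```

Since the container itself is returned, calls can be chained, for example seq.append(a).append(b).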
2. torch.nn.ModuleList()
https://pytorch.org/docs/stable/generated/torch.nn.ModuleList.html
torch.nn.ModuleList(modules=None)
Holds submodules in a list.
torch.nn.ModuleList() can be indexed like a regular Python list, but modules it contains are properly registered, and will be visible by all Module methods.
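torch.nn.ModuleList() simply stores different modules together. It does not define a network, and the modules it holds have no inherent execution order; the order in which they run is decided by the forward() you write.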
Parameters
modules (iterable, optional) - an iterable of modules to add
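To see what "properly registered" means in practice, the following sketch compares a plain Python list with a torch.nn.ModuleList(). Both class names are hypothetical and exist only for this comparison.

```python
from torch import nn


class WithPlainList(nn.Module):
    """Hypothetical module that keeps its layers in a plain Python list."""

    def __init__(self):
        super().__init__()
        # A plain list is not registered, so these layers stay invisible to nn.Module.
        self.layers = [nn.Linear(10, 10) for _ in range(3)]


class WithModuleList(nn.Module):
    """Hypothetical module that keeps its layers in an nn.ModuleList."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])


plain_count = len(list(WithPlainList().parameters()))
registered_count = len(list(WithModuleList().parameters()))
print(plain_count)       # 0
print(registered_count)  # 6 (3 layers x weight and bias)
```

With the plain list, an optimizer built from parameters() would receive nothing to train, and calls such as .to(device) or state_dict() would skip those layers.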
#!/usr/bin/env python
# coding=utf-8

import torch
from torch import nn

device = "cpu"
print(f"torch.__version__ = {torch.__version__}, torch.accelerator.current_accelerator().type = {torch.accelerator.current_accelerator().type}")
print(f"device = {device}\n")


class NeuralNetwork(nn.Module):
    def __init__(self) -> None:
        super().__init__()

        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(5)])

    def forward(self, x):
        # ModuleList can act as an iterable, or be indexed using ints
        for index, layer in enumerate(self.linears):
            x = self.linears[index // 2](x) + layer(x)
        return x


model = NeuralNetwork().to(device)
print(f"Model structure: {model}\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()}\n")
/home/yongqiang/miniconda3/bin/python /home/yongqiang/stable_diffusion_work/stable_diffusion_diffusers/yongqiang.py
torch.__version__ = 2.6.0+cu124, torch.accelerator.current_accelerator().type = cuda
device = cpu

Model structure: NeuralNetwork(
  (linears): ModuleList(
    (0-4): 5 x Linear(in_features=10, out_features=10, bias=True)
  )
)

Layer: linears.0.weight | Size: torch.Size([10, 10])

Layer: linears.0.bias | Size: torch.Size([10])

Layer: linears.1.weight | Size: torch.Size([10, 10])

Layer: linears.1.bias | Size: torch.Size([10])

Layer: linears.2.weight | Size: torch.Size([10, 10])

Layer: linears.2.bias | Size: torch.Size([10])

Layer: linears.3.weight | Size: torch.Size([10, 10])

Layer: linears.3.bias | Size: torch.Size([10])

Layer: linears.4.weight | Size: torch.Size([10, 10])

Layer: linears.4.bias | Size: torch.Size([10])

Process finished with exit code 0
- append(module)
Append a given module to the end of the list.
Parameters
module (nn.Module) - module to append
Return type
ModuleList
- extend(modules)
Append modules from a Python iterable to the end of the list.
Parameters
modules (iterable) - iterable of modules to append
Return type
Self
- insert(index, module)
Insert a given module before a given index in the list.
Parameters
index (int) - index to insert
module (nn.Module) - module to insert
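The three methods above can be tried together in a short sketch. The particular layers are arbitrary choices for illustration.

```python
from torch import nn

layers = nn.ModuleList([nn.Linear(10, 10)])

layers.append(nn.ReLU())                      # add one module at the end
layers.extend([nn.Linear(10, 5), nn.ReLU()])  # add several modules at the end
layers.insert(0, nn.Flatten())                # insert before index 0

kinds = [type(m).__name__ for m in layers]
print(kinds)  # ['Flatten', 'Linear', 'ReLU', 'Linear', 'ReLU']
```

Every module added this way is registered, so its parameters show up in parameters() and state_dict() just like those passed to the constructor.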
3. What’s the difference between a torch.nn.Sequential() and a torch.nn.ModuleList()?
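A torch.nn.ModuleList() is exactly what it sounds like: a list for storing Modules! It has no forward() of its own, so you decide how its modules are called. On the other hand, the layers in a torch.nn.Sequential() are connected in a cascading way, and calling the container runs them in order.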
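The contrast can be made concrete with a minimal sketch that wraps the same layers in both containers. The layer sizes and variable names are assumptions for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

layers = [nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2)]
x = torch.randn(3, 10)

# Sequential defines forward(): calling it cascades x through the layers in order.
seq = nn.Sequential(*layers)
out_seq = seq(x)

# ModuleList only stores the same layers; it has no forward() of its own.
mlist = nn.ModuleList(layers)
try:
    mlist(x)
    called_ok = True
except NotImplementedError:
    called_ok = False

# With a ModuleList, the calling order is written out explicitly.
out_manual = x
for layer in mlist:
    out_manual = layer(out_manual)

print(called_ok)                            # False
print(torch.allclose(out_seq, out_manual))
```

In short, torch.nn.Sequential() fits a fixed, straight-line pipeline, while torch.nn.ModuleList() fits cases where forward() needs custom control flow, such as skip connections or reusing a layer.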
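cascade /kæˈskeɪd/
n. a small waterfall; a succession of stages in which each one feeds the next
vi. to rush or pour down like a waterfall
v. to cause something to fall like a waterfall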
References
[1] Yongqiang Cheng, https://yongqiang.blog.csdn.net/