Model Creation and nn.Module
Model creation steps
First, build the network layers: convolutional layers, pooling layers, activation layers, and so on.
Then assemble the layers, connecting them in a given topology and order to form a network such as LeNet, AlexNet, or ResNet.
Finally, initialize the weights: Xavier, Kaiming, uniform, normal, and so on.

In the earlier RMB binary-classification problem we already built a LeNet; below is a diagram of LeNet.

LeNet computation graph: a 32x32x3 tensor goes in, and after a series of operations the network outputs a vector of length 10.

Two essentials of model construction

Code walkthrough
Next, let's look in code at how the model is created.
# ============================ step 2/5 model ============================
net = LeNet(classes=2)
net.initialize_weights()
Step into the definition of LeNet: it creates two 2D convolutional layers and three fully connected layers.
class LeNet(nn.Module):
    def __init__(self, classes):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)
After that the weights are initialized (the net.initialize_weights() call above).
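initialize_weights is not a built-in nn.Module method; it is defined on the course's LeNet class. A minimal sketch of what such a method can look like, assuming Xavier initialization for the convolutional layers and a small normal initialization for the fully connected layers (the exact scheme in the lesson code may differ):

def initialize_weights(self):
    # iterate over all submodules and initialize them by layer type
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.xavier_normal_(m.weight.data)   # assumed: Xavier for conv layers
            if m.bias is not None:
                m.bias.data.zero_()
        elif isinstance(m, nn.Linear):
            nn.init.normal_(m.weight.data, 0, 0.1)  # assumed: N(0, 0.1) for fc layers
            m.bias.data.zero_()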
The next step is the forward pass during model training:
# forward
inputs, labels = data
outputs = net(inputs)
Switching to lenet.py, we can see the forward-pass code:
def forward(self, x):
    out = F.relu(self.conv1(x))
    out = F.max_pool2d(out, 2)
    out = F.relu(self.conv2(out))
    out = F.max_pool2d(out, 2)
    out = out.view(out.size(0), -1)
    out = F.relu(self.fc1(out))
    out = F.relu(self.fc2(out))
    out = self.fc3(out)
    return out
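Tracing the shapes through this forward pass also explains the 16*5*5 in fc1: a 3x32x32 input becomes 6x28x28 after conv1 (5x5 kernels), 6x14x14 after the first 2x2 max pooling, 16x10x10 after conv2, and 16x5x5 after the second pooling; flattening with view gives 16*5*5 = 400 features, exactly the in_features of nn.Linear(16*5*5, 120).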
nn.Module
In the model module, every model and every layer inherits from the nn.Module class, so it is well worth studying.
torch.nn
torch.nn is PyTorch's neural-network module; it contains many submodules.

nn.Module
nn.Module has 8 important attributes:

Now let's return to the earlier code and see how an nn.Module is built.
class LeNet(nn.Module):
    def __init__(self, classes):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)
In lenet.py, the first thing that happens is a call to the parent class's __init__, which lands in the __init__ function in module.py:
def __init__(self, *args, **kwargs) -> None:
    """Initialize internal Module state, shared by both nn.Module and ScriptModule."""
    torch._C._log_api_usage_once("python.nn_module")

    # Backward compatibility: no args used to be allowed when call_super_init=False
    if self.call_super_init is False and bool(kwargs):
        raise TypeError("{}.__init__() got an unexpected keyword argument '{}'"
                        "".format(type(self).__name__, next(iter(kwargs))))
    if self.call_super_init is False and bool(args):
        raise TypeError(f"{type(self).__name__}.__init__() takes 1 positional argument but {len(args) + 1} were"
                        " given")

    super().__setattr__('training', True)
    super().__setattr__('_parameters', OrderedDict())
    super().__setattr__('_buffers', OrderedDict())
    super().__setattr__('_non_persistent_buffers_set', set())
    super().__setattr__('_backward_pre_hooks', OrderedDict())
    super().__setattr__('_backward_hooks', OrderedDict())
    super().__setattr__('_is_full_backward_hook', None)
    super().__setattr__('_forward_hooks', OrderedDict())
    super().__setattr__('_forward_hooks_with_kwargs', OrderedDict())
    super().__setattr__('_forward_hooks_always_called', OrderedDict())
    super().__setattr__('_forward_pre_hooks', OrderedDict())
    super().__setattr__('_forward_pre_hooks_with_kwargs', OrderedDict())
    super().__setattr__('_state_dict_hooks', OrderedDict())
    super().__setattr__('_state_dict_pre_hooks', OrderedDict())
    super().__setattr__('_load_state_dict_pre_hooks', OrderedDict())
    super().__setattr__('_load_state_dict_post_hooks', OrderedDict())
    super().__setattr__('_modules', OrderedDict())
We can see that many attributes are set and the training flag training is set to True; this function simply initializes nn.Module's internal state.
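These attributes can be inspected directly on a module instance. An illustrative check (not part of the lesson code) using the LeNet defined earlier:

net = LeNet(classes=2)
print(net.training)                   # True
print(net._modules.keys())            # odict_keys(['conv1', 'conv2', 'fc1', 'fc2', 'fc3'])
print(net._parameters)                # empty: LeNet itself owns no Parameters directly
print(net.conv1._parameters.keys())   # odict_keys(['weight', 'bias'])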
The next step is creating the convolutional layers. Step into conv.py and look at Conv2d's __init__ function:
class Conv2d(_ConvNd):
    def __init__(
        self,
        in_channels: int,
        out_channels: int,
        kernel_size: _size_2_t,
        stride: _size_2_t = 1,
        padding: Union[str, _size_2_t] = 0,
        dilation: _size_2_t = 1,
        groups: int = 1,
        bias: bool = True,
        padding_mode: str = 'zeros',  # TODO: refine this type
        device=None,
        dtype=None
    ) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        kernel_size_ = _pair(kernel_size)
        stride_ = _pair(stride)
        padding_ = padding if isinstance(padding, str) else _pair(padding)
        dilation_ = _pair(dilation)
        super().__init__(
            in_channels, out_channels, kernel_size_, stride_, padding_, dilation_,
            False, _pair(0), groups, bias, padding_mode, **factory_kwargs)
The Conv2d class inherits from _ConvNd; its __init__ sets up a few attributes from the arguments and then calls the parent class's __init__:
class _ConvNd(Module):
    def __init__(self,
                 in_channels: int,
                 out_channels: int,
                 kernel_size: Tuple[int, ...],
                 stride: Tuple[int, ...],
                 padding: Tuple[int, ...],
                 dilation: Tuple[int, ...],
                 transposed: bool,
                 output_padding: Tuple[int, ...],
                 groups: int,
                 bias: bool,
                 padding_mode: str,
                 device=None,
                 dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super().__init__()
We can see that _ConvNd in turn inherits from Module and also calls Module's __init__, initializing the ordered-dict attributes shown above.
After this chain of calls the convolutional layer has been created, and it is managed by the module.
When the second convolutional layer is created and assigned to self.conv2, the assignment is intercepted by Module's __setattr__, which detects that the value is a Module and stores it in self._modules. The public add_module method below performs the same registration explicitly:
def add_module(self, name: str, module: Optional['Module']) -> None:
    r"""Add a child module to the current module.

    The module can be accessed as an attribute using the given name.

    Args:
        name (str): name of the child module. The child module can be
            accessed from this module using the given name
        module (Module): child module to be added to the module.
    """
    if not isinstance(module, Module) and module is not None:
        raise TypeError(f"{torch.typename(module)} is not a Module subclass")
    elif not isinstance(name, str):
        raise TypeError(f"module name should be a string. Got {torch.typename(name)}")
    elif hasattr(self, name) and name not in self._modules:
        raise KeyError(f"attribute '{name}' already exists")
    elif '.' in name:
        raise KeyError(f"module name can't contain \".\", got: {name}")
    elif name == '':
        raise KeyError("module name can't be empty string \"\"")
    for hook in _global_module_registration_hooks.values():
        output = hook(self, name, module)
        if output is not None:
            module = output
    self._modules[name] = module
It checks that the value is indeed a Module (or None) and that the name is valid, then stores it in _modules under that name.
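An illustrative check (not part of the lesson code) that the plain attribute assignments in LeNet's __init__ really end up in _modules, and that the child layers' parameters become visible through the parent:

net = LeNet(classes=2)
print('conv2' in net._modules)               # True: registered by __setattr__
print(net._modules['conv2'] is net.conv2)    # True
for name, param in net.named_parameters():
    print(name, param.shape)                 # conv1.weight torch.Size([6, 3, 5, 5]), conv1.bias ...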
Summary

Model Containers and Building AlexNet
Model containers: Containers
PyTorch provides three commonly used Containers:

Sequential
Machine learning relies heavily on feature engineering, a notion that is de-emphasized in deep learning.
Still, by convention a deep model is often split at the fully connected layers into a feature-extraction part and a classification part.

We can wrap each group of layers in a Sequential and then assemble them into one network.
Method 1
Straight to the code: create a class that splits the earlier LeNet into two parts, each wrapped in a Sequential,
then define the forward pass.
# ============================ Sequential
class LeNetSequential(nn.Module):
    def __init__(self, classes):
        super(LeNetSequential, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),)

        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x
Next, step into container.py and look at the Sequential class.
Sequential also inherits from Module. Its __init__ first calls the parent's __init__ and then checks whether the argument is an OrderedDict; if it is not, it loops over the arguments and calls add_module on each one, adding the layers to the Sequential in order.
class Sequential(Module):
    def __init__(self, *args):
        super().__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)
Back in our code, create a random tensor and pass it through the network.
net = LeNetSequential(classes=2)
# net = LeNetSequentialOrderDict(classes=2)   # the named version used in Method 2 below
fake_img = torch.randn((4, 3, 32, 32), dtype=torch.float32)
output = net(fake_img)
The call net(fake_img) enters Module's __call__, whose core logic lives in _call_impl; when no hooks are registered it simply dispatches to forward (abridged):
def _call_impl(self, *args, **kwargs):
    forward_call = (self._slow_forward if torch._C._get_tracing_state() else self.forward)
    # If we don't have any hooks, we want to skip the rest of the logic in
    # this function, and just call forward.
    if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
            or _global_backward_pre_hooks or _global_backward_hooks
            or _global_forward_hooks or _global_forward_pre_hooks):
        return forward_call(*args, **kwargs)
    # ... (hook handling omitted)
forward_call here is the forward we defined, so execution jumps into our own forward function:
def forward(self, x):
    x = self.features(x)
    x = x.view(x.size()[0], -1)
    x = self.classifier(x)
    return x
self.features and self.classifier are themselves Sequential objects, so we next land in Sequential's forward in container.py. It is very concise: a for loop over the submodules, feeding the output of each layer into the next.
def forward(self, input):
    for module in self:
        input = module(input)
    return input
Finally we get an output.
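A quick sanity check (illustrative): with the (4, 3, 32, 32) input and classes=2 the output has shape (4, 2), and the layers inside a Sequential built this way are addressed by integer index:

print(output.shape)      # torch.Size([4, 2])
print(net.features[0])   # Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))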
Method 2
In Method 1 the layers are indexed by number, in order; once the network has many layers this becomes inconvenient.
So there is a second approach that lets each layer be given a name.
class LeNetSequentialOrderDict(nn.Module):
    def __init__(self, classes):
        super(LeNetSequentialOrderDict, self).__init__()
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 6, 5),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=2, stride=2),
            'conv2': nn.Conv2d(6, 16, 5),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=2, stride=2),
        }))

        self.classifier = nn.Sequential(OrderedDict({
            'fc1': nn.Linear(16*5*5, 120),
            'relu3': nn.ReLU(),
            'fc2': nn.Linear(120, 84),
            'relu4': nn.ReLU(inplace=True),
            'fc3': nn.Linear(84, classes),
        }))

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x
Going straight into Sequential's __init__ in container.py: when the single argument is an OrderedDict, each layer is registered under its key from the dict (key-value pairs), so the layers get names instead of numeric indices.
def __init__(self, *args):
    super().__init__()
    if len(args) == 1 and isinstance(args[0], OrderedDict):
        for key, module in args[0].items():
            self.add_module(key, module)
    else:
        for idx, module in enumerate(args):
            self.add_module(str(idx), module)
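With the OrderedDict version the submodules keep their names, so they can be accessed as attributes rather than by index. An illustrative check:

net = LeNetSequentialOrderDict(classes=2)
print(net.features.conv1)   # Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
print(net)                  # the printout now shows 'conv1', 'relu1', ... instead of 0, 1, 2, ...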

ModuleList

Straight to the code: in __init__ the layers are created in a loop (a list comprehension) and handed to nn.ModuleList.
# ============================ ModuleList
class ModuleList(nn.Module):
    def __init__(self):
        super(ModuleList, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(20)])

    def forward(self, x):
        for i, linear in enumerate(self.linears):
            x = linear(x)
        return x
Step into the ModuleList class and look at its __init__; it is very simple: if modules is not None, they are appended to the list with +=.
def __init__(self, modules: Optional[Iterable[Module]] = None) -> None:
    super().__init__()
    if modules is not None:
        self += modules
Then in the forward pass a for loop retrieves each layer and runs the input through them in turn.
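An illustrative usage example for the class above: the 20 stacked Linear(10, 10) layers map a (batch, 10) tensor to a (batch, 10) tensor.

net = ModuleList()
fake_data = torch.ones((10, 10))
output = net(fake_data)
print(output.shape)   # torch.Size([10, 10])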
ModuleDict

Again, straight to the code; it is somewhat similar to Sequential.
class ModuleDict(nn.Module):
    def __init__(self):
        super(ModuleDict, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })

        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x, choice, act):
        x = self.choices[choice](x)
        x = self.activations[act](x)
        return x
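An illustrative usage example: in the forward pass the two keys select which layers to run, so the same model can be executed along different paths.

net = ModuleDict()
fake_img = torch.randn((4, 10, 32, 32))
output = net(fake_img, 'conv', 'relu')   # choose the conv branch and the relu activation
print(output.shape)                      # torch.Size([4, 10, 30, 30])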
Summary

AlexNet
Overview

Structure:

Code: alexnet.py (from torchvision)
class AlexNet(nn.Module):
    def __init__(self, num_classes: int = 1000, dropout: float = 0.5) -> None:
        super().__init__()
        _log_api_usage_once(self)
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(p=dropout),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=dropout),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
With the material above, this code should now be straightforward to read.
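A quick interface check (illustrative), using the ready-made model from torchvision, which ships this same code:

import torch
from torchvision.models import alexnet

model = alexnet()                  # randomly initialized AlexNet
x = torch.randn(1, 3, 224, 224)    # a fake ImageNet-sized image
print(model(x).shape)              # torch.Size([1, 1000])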
Convolutional Layers
1d/2d/3d Convolution

The convolution kernel acts as a feature extractor.

Convolution dimensionality

nn.Conv2d

Pay attention to dilated (atrous) convolution here:

Output size:
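For reference, the output-size formula of nn.Conv2d from the PyTorch documentation, per spatial dimension:

H_out = floor( (H_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1 )

With the defaults padding=0 and dilation=1 this reduces to H_out = floor((H_in - kernel_size) / stride + 1).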


The code:
import os
import torch.nn as nn
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms
# set_seed and transform_invert are helper functions from the course's own tools module;
# their import is omitted here because the exact module path depends on the repo layout.

set_seed(3)  # set the random seed

# ================================= load img ==================================
path_img = os.path.join(os.path.dirname(os.path.abspath(__file__)), "lena.png")
img = Image.open(path_img).convert('RGB')  # 0~255

# convert to tensor
img_transform = transforms.Compose([transforms.ToTensor()])
img_tensor = img_transform(img)
img_tensor.unsqueeze_(dim=0)  # C*H*W to B*C*H*W

# ================================= create convolution layer ==================================
# ================ 2d
flag = 1
# flag = 0
if flag:
    conv_layer = nn.Conv2d(3, 1, 3)  # input:(i, o, size) weights:(o, i, h, w)
    nn.init.xavier_normal_(conv_layer.weight.data)

    # calculation
    img_conv = conv_layer(img_tensor)

# ================================= visualization ==================================
print("shape before conv: {}\nshape after conv: {}".format(img_tensor.shape, img_conv.shape))
img_conv = transform_invert(img_conv[0, 0:1, ...], img_transform)
img_raw = transform_invert(img_tensor.squeeze(), img_transform)
plt.subplot(122).imshow(img_conv, cmap='gray')
plt.subplot(121).imshow(img_raw)
plt.show()
Transposed Convolution
Transposed convolution, also called deconvolution or fractionally-strided convolution, is used to upsample (UpSample) images.
Why is it called "transposed" convolution?
Normal convolution:
Suppose the image is 4x4, the kernel 3x3, padding=0, stride=1.
Flatten the image into a 16x1 vector I and expand the kernel into a 4x16 matrix K; the output is O(4x1) = K(4x16) * I(16x1).
Transposed convolution:
Suppose the image is 2x2, the kernel 3x3, padding=0, stride=1.
Flatten the image into a 4x1 vector I and build a 16x4 matrix K; the output is O(16x1) = K(16x4) * I(4x1).
The two kernel matrices are transposes of each other only in shape; the operation is not the inverse of convolution and does not recover the original values, so it is irreversible.


Output size:
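For reference, the output size of nn.ConvTranspose2d per spatial dimension (from the PyTorch documentation):

H_out = (H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

With a 3x3 kernel, stride=2 and padding=0, as in the code below, this gives H_out = 2 * H_in + 1, i.e. the spatial size roughly doubles, which is the upsampling effect.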

The code:
# ================ transposed
# flag = 1
flag = 0
if flag:
    conv_layer = nn.ConvTranspose2d(3, 1, 3, stride=2)  # input:(i, o, size)
    nn.init.xavier_normal_(conv_layer.weight.data)

    # calculation
    img_conv = conv_layer(img_tensor)
Pooling, Linear, and Activation Layers
Pooling Layer
Pooling "collects" and "summarizes" a signal, much like a pond collecting water, hence the name.
"Collect": turn many values into fewer values.
"Summarize": take the maximum or the average.


Max Pooling

Note two parameters here:
- ceil_mode: round the output size up instead of down.
- return_indices: record the indices of the max-valued pixels during pooling, to be used later for unpooling. As shown in the figure, the indices recorded before pooling tell unpooling where to place each value.

The code:
# ================ maxpool
# flag = 1
flag = 0
if flag:
    maxpool_layer = nn.MaxPool2d((2, 2), stride=(2, 2))  # input:(i, o, size) weights:(o, i, h, w)
    img_pool = maxpool_layer(img_tensor)
Effect: after pooling the image changes very little, which is why pooling can be used to strip out redundant information.

Average Pooling

divisor_override: the division factor; instead of dividing by the number of pixels in the window, divide by this value.
The code:
# ================ avgpool
# flag = 1
flag = 0
if flag:
    avgpoollayer = nn.AvgPool2d((2, 2), stride=(2, 2))  # input:(i, o, size) weights:(o, i, h, w)
    img_pool = avgpoollayer(img_tensor)
Effect: the brightness differs somewhat from max pooling.

Max Unpooling

The code:
# ================ max unpool
# flag = 1
flag = 0
if flag:
    # pooling
    img_tensor = torch.randint(high=5, size=(1, 1, 4, 4), dtype=torch.float)
    maxpool_layer = nn.MaxPool2d((2, 2), stride=(2, 2), return_indices=True)
    img_pool, indices = maxpool_layer(img_tensor)

    # unpooling
    img_reconstruct = torch.randn_like(img_pool, dtype=torch.float)
    maxunpool_layer = nn.MaxUnpool2d((2, 2), stride=(2, 2))
    img_unpool = maxunpool_layer(img_reconstruct, indices)

    print("raw_img:\n{}\nimg_pool:\n{}".format(img_tensor, img_pool))
    print("img_reconstruct:\n{}\nimg_unpool:\n{}".format(img_reconstruct, img_unpool))
Linear Layer
A linear layer, also called a fully connected layer: every neuron is connected to all neurons of the previous layer, implementing a linear combination (linear transformation) of the previous layer's outputs.

Linear weighted sum
nn.Linear

The code:
# ================ linear
flag = 1
# flag = 0
if flag:
    inputs = torch.tensor([[1., 2, 3]])
    linear_layer = nn.Linear(3, 4)
    linear_layer.weight.data = torch.tensor([[1., 1., 1.],
                                             [2., 2., 2.],
                                             [3., 3., 3.],
                                             [4., 4., 4.]])

    # bias
    linear_layer.bias.data.fill_(0.5)
    output = linear_layer(inputs)
    print(inputs, inputs.shape)
    print(linear_layer.weight.data, linear_layer.weight.data.shape)
    print(output, output.shape)
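The layer computes output = inputs @ weight.T + bias. With the weights and bias set above, the input [1, 2, 3] gives [1+2+3, 2+4+6, 3+6+9, 4+8+12] + 0.5 = [6.5, 12.5, 18.5, 24.5], and the printed output has shape torch.Size([1, 4]).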
Activation Function Layer
Activation functions apply a nonlinear transformation to the features; they are what give a multi-layer neural network the meaning of "depth".
How to understand this:
In the figure there are three linear layers W1, W2, W3; the input is multiplied by each matrix in turn. By the associativity of matrix multiplication this is the same as multiplying by a single combined matrix, so stacking n linear layers is equivalent to one linear layer.
(A neural network is trying to fit a function; fully connected layers alone can only fit linear functions, no matter how many are stacked, while activation functions are nonlinear and therefore give the network its nonlinearity.)
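A tiny illustrative check of the argument above (not part of the lesson code): three linear maps without activations in between collapse into a single linear map.

import torch

x = torch.randn(1, 4)
W1, W2, W3 = torch.randn(4, 4), torch.randn(4, 4), torch.randn(4, 4)

out_stacked = x @ W1 @ W2 @ W3     # three "linear layers" applied one after another
out_single = x @ (W1 @ W2 @ W3)    # one layer whose weight is the product of the three
print(torch.allclose(out_stacked, out_single, atol=1e-5))   # True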

nn.Sigmoid

Prone to causing vanishing gradients.
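For reference: sigmoid(x) = 1 / (1 + e^(-x)), and its derivative is sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), which is at most 0.25. Every sigmoid layer therefore scales the gradient by at most 0.25 during backpropagation, so in deep stacks the gradient shrinks toward zero, which is the vanishing-gradient problem noted above.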
nn.Tanh

nn.ReLU

LeakyReLU, PReLU, RReLU
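For reference: ReLU(x) = max(0, x), whose gradient is 1 for positive inputs, so it largely avoids the vanishing-gradient problem, but neurons with negative inputs get zero gradient and can "die". The three variants above address this: LeakyReLU uses a small fixed slope for negative inputs, PReLU makes that slope a learnable parameter, and RReLU samples the slope randomly from a uniform distribution during training.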

