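Add Image Data Augmentation When You Train Your Model

Introduction

Data augmentation is one of the most effective ways to prevent overfitting. A good augmentation strategy can effectively enlarge the dataset more than tenfold, which improves generalization and suppresses overfitting.

The full pipeline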

from torchvision import transforms

transform_train = transforms.Compose([
    # These transforms run on the PIL image
    transforms.ColorJitter(brightness=0.8, contrast=0.8, saturation=0.8),
    transforms.RandomCrop(32, padding=4),  # pad 4 pixels of zeros on each side, then randomly crop to 32x32
    transforms.RandomHorizontalFlip(),  # flip with probability 0.5
    transforms.RandomRotation(degrees=10),

    # These transforms run on the tensor
    transforms.ToTensor(),
    Cutout(n_holes=5, length=4),
    transforms.Normalize(mean=CIFAR_MEAN, std=CIFAR_STD),  # per-channel (R, G, B) mean and standard deviation
])

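Random rotation

import torchvision.transforms as T
T.RandomRotation(degrees=10)

The image is rotated by a random angle drawn from [-10, 10] degrees (a single number is interpreted as a symmetric range). I picked a fairly conservative value.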

Random flip

transforms.RandomHorizontalFlip()

I usually stick to horizontal flips. I am not sure what a vertically flipped image is supposed to represent.
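As a rough illustration of what a horizontal flip does to a tensor, the sketch below mirrors a C×H×W image along its width with probability p. The helper name random_hflip is made up for this example; it is not the torchvision implementation, which also accepts PIL images.

```python
import torch

def random_hflip(img: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    """Hypothetical helper: mirror a (C, H, W) tensor left-right with probability p."""
    if torch.rand(1).item() < p:
        return torch.flip(img, dims=[-1])  # reverse the width axis
    return img

# A tiny 1x3x4 "image" so the effect is easy to inspect.
img = torch.arange(12, dtype=torch.float32).reshape(1, 3, 4)
flipped = random_hflip(img, p=1.0)  # p=1.0 forces the flip
```

Only the last axis is reversed, so rows stay in place and each row is read right to left.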

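Adjusting brightness, contrast, and saturation

transforms.ColorJitter(brightness=0.8, contrast=0.8, saturation=0.8)

Each factor is drawn at random from \([\max(0, 1-\text{factor}),\ 1+\text{factor}]\)

After trying 0.8, I found the changes to the images stayed within an acceptable range.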
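To make the range concrete, here is a minimal sketch of how a scalar strength turns into a sampling interval. The function name jitter_range is an assumption for this example, and the code only mirrors the documented behavior for a scalar argument; it does not call torchvision.

```python
import random

def jitter_range(strength: float) -> tuple:
    """Hypothetical helper: interval a jitter factor is drawn from for a scalar strength."""
    return (max(0.0, 1.0 - strength), 1.0 + strength)

low, high = jitter_range(0.8)        # roughly (0.2, 1.8)
factor = random.uniform(low, high)   # one random draw from that interval
```

A factor of 1 leaves the image unchanged, so with 0.8 the brightness can drop to about 20% of the original or rise to about 180%.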

Random crop

transforms.RandomCrop(32, padding=4)

Pad 4 pixels of zeros on every side, then randomly crop a 32×32 patch from the padded image.
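The pad-then-crop step can be sketched on a tensor as follows. This is an illustration under assumptions, not the torchvision code: the helper name pad_and_random_crop is made up, and it works on a C×H×W tensor, whereas in the pipeline above RandomCrop runs on the PIL image before ToTensor.

```python
import torch
import torch.nn.functional as F

def pad_and_random_crop(img: torch.Tensor, size: int = 32, padding: int = 4) -> torch.Tensor:
    """Hypothetical helper: zero-pad a (C, H, W) tensor, then cut a random size x size window."""
    # Pad order is (left, right, top, bottom) for the last two dimensions.
    padded = F.pad(img, (padding, padding, padding, padding), value=0.0)
    _, h, w = padded.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    return padded[:, top:top + size, left:left + size]

img = torch.rand(3, 32, 32)
out = pad_and_random_crop(img)
```

For a 32×32 input the padded image is 40×40, so the crop offset ranges from 0 to 8 and the content shifts by at most 4 pixels in any direction.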

Cutout

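Cutout masks the image with a set of square patches. It is a very important augmentation method!

Apply ToTensor before it! The implementation below reads img.size(1) and img.size(2), so it expects a C×H×W tensor.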

Cutout(n_holes=5, length=4)

import torch
import numpy as np

device = 'cuda' if torch.cuda.is_available() else 'cpu'
class Cutout(object):
    """
    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        h = img.size(1)
        w = img.size(2)

        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            # (x, y) is the center of the square patch
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img
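To get a feel for how much of the image these settings hide, the sketch below rebuilds the same kind of 0/1 mask with NumPy only and measures the masked fraction. The function name cutout_mask and the fixed seed are assumptions for this example.

```python
import numpy as np

def cutout_mask(h, w, n_holes, length, rng):
    """Hypothetical helper: build a 0/1 mask the same way the Cutout class above does."""
    mask = np.ones((h, w), np.float32)
    for _ in range(n_holes):
        y, x = rng.integers(h), rng.integers(w)  # patch center
        y1, y2 = np.clip([y - length // 2, y + length // 2], 0, h)
        x1, x2 = np.clip([x - length // 2, x + length // 2], 0, w)
        mask[y1:y2, x1:x2] = 0.0
    return mask

rng = np.random.default_rng(0)
mask = cutout_mask(32, 32, n_holes=5, length=4, rng=rng)
masked_fraction = 1.0 - float(mask.mean())
upper_bound = 5 * 4 * 4 / (32 * 32)  # 5 holes of 4x4 pixels on a 32x32 image
```

The bound works out to about 7.8% of the pixels. The actual fraction can be lower, because patches are clipped at the image border and may overlap each other.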

ToTensor

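Scales pixel values to [0, 1]. Strictly speaking this is a preprocessing step rather than an augmentation.

It also changes the layout of the data, from H×W×C (PIL image or NumPy array) to a C×H×W tensor.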
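Both effects can be reproduced by hand. The snippet below is a rough equivalent for uint8 input only, written with plain torch and NumPy; the random array is a stand-in for a real image.

```python
import numpy as np
import torch

# A fake 32x32 RGB image in the H x W x C uint8 layout that PIL and NumPy use.
hwc_uint8 = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Roughly what ToTensor does for uint8 input: move channels first, then divide by 255.
chw_float = torch.from_numpy(hwc_uint8).permute(2, 0, 1).float().div(255)
```

The channel-first layout is what Cutout above relies on when it reads the height from size(1) and the width from size(2).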

Normalize

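Changes the data distribution: each channel has its mean subtracted and is divided by its standard deviation, which helps convergence.

It must be applied last!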

transforms.Normalize(mean=CIFAR_MEAN,std=CIFAR_STD)
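The post does not show how CIFAR_MEAN and CIFAR_STD are obtained. One plausible way, sketched here with random data standing in for the real training set, is to compute per-channel statistics over all images and pixels and then apply the same formula Normalize uses.

```python
import torch

# Stand-in for a training set already converted by ToTensor: (N, C, H, W) in [0, 1].
data = torch.rand(256, 3, 32, 32)

# Per-channel statistics over all images and pixels.
mean = data.mean(dim=(0, 2, 3))
std = data.std(dim=(0, 2, 3))

# What Normalize computes per channel: (x - mean) / std.
normalized = (data - mean[None, :, None, None]) / std[None, :, None, None]
```

After this step each channel has roughly zero mean and unit standard deviation over the data the statistics were computed from.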

CutMix

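Problems with the earlier augmentation methods:

Mixup: the blended image is locally blurry and unnatural, which confuses the model, especially for localization.

Cutout: the removed region is usually filled with zeros or random noise, so the information in that region is wasted during training.

CutMix improves on Cutout by filling the removed region with the corresponding patch from another image. This keeps the advantage of Cutout: the model learns an object's features from a partial view and pays more attention to its less discriminative parts. It is also more efficient than Cutout, because the filled region carries content from a second image, so the model learns the features of two objects at once.

As the figure below shows, although Mixup and Cutout both improve classification accuracy, they degrade weakly supervised localization and object detection performance to varying degrees, whereas CutMix brings significant gains on every task.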
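The patch-swapping idea described above can be sketched for a batch as follows. This is a minimal illustration under assumptions, not the reference implementation: the names rand_bbox and cutmix, the Beta(alpha, alpha) sampling with alpha=1.0, and pairing images through a random permutation of the batch are choices made for this example.

```python
import numpy as np
import torch

def rand_bbox(h, w, lam, rng):
    """Hypothetical helper: random box covering about (1 - lam) of an h x w image."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = int(rng.integers(h)), int(rng.integers(w))  # box center
    y1, y2 = int(np.clip(cy - cut_h // 2, 0, h)), int(np.clip(cy + cut_h // 2, 0, h))
    x1, x2 = int(np.clip(cx - cut_w // 2, 0, w)), int(np.clip(cx + cut_w // 2, 0, w))
    return y1, y2, x1, x2

def cutmix(images, labels, alpha=1.0, rng=None):
    """Hypothetical helper: paste a patch from a shuffled copy of the batch into each image."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))  # partner image for every sample
    h, w = images.shape[-2:]
    y1, y2, x1, x2 = rand_bbox(h, w, lam, rng)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # Recompute lam from the real box area, since the box may be clipped at the border.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, labels, labels[perm], lam

images = torch.rand(8, 3, 32, 32)
labels = torch.arange(8)
mixed, y_a, y_b, lam = cutmix(images, labels)
# Typical use in a training step (criterion and model are assumed to exist):
# loss = lam * criterion(model(mixed), y_a) + (1 - lam) * criterion(model(mixed), y_b)
```

The labels are weighted by the share of pixels each image contributes, so the training target matches what is actually visible in the mixed image.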