Model:

eugenesiow/rcan-bam

Libraries:

Datasets:

eugenesiow/Div2k eugenesiow/Set5 eugenesiow/Set14 eugenesiow/BSD100 eugenesiow/Urban100

Other:

RCAN super-image image-super-resolution

Preprints:

arxiv:1807.02758 arxiv:2104.07566

License:

apache-2.0

Residual Channel Attention Networks (RCAN)

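The RCAN model is pretrained on DIV2K (800 images for training, augmented to 4000 images, and 100 images for validation) for 2x, 3x and 4x image super-resolution. It was introduced in the paper by Zhang et al. (2018) and first released in this repository.

The goal of image super-resolution is to restore a high-resolution (HR) image from a single low-resolution (LR) image. The image below shows the ground truth (HR), the bicubic upscaling, and the model upscaling.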
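For reference, the bicubic baseline mentioned above can be approximated with Pillow alone. The sketch below is not taken from the `super_image` library; the helper name `make_lr_and_bicubic` and the synthetic test image are assumptions made for illustration. It downscales an HR image with bicubic resampling to get an LR image, then upscales it back with the same filter.

```python
from PIL import Image


def make_lr_and_bicubic(hr: Image.Image, scale: int = 2):
    """Return (lr, bicubic_up) for an HR image, using bicubic resampling both ways.

    Hypothetical helper for illustration; not part of super_image.
    """
    # Crop so that width and height are exact multiples of the scale factor.
    w = (hr.width // scale) * scale
    h = (hr.height // scale) * scale
    hr = hr.crop((0, 0, w, h))
    # Bicubic downscale produces the low-resolution input.
    lr = hr.resize((w // scale, h // scale), Image.BICUBIC)
    # Bicubic upscale back to the HR size is the baseline the model is compared to.
    bicubic_up = lr.resize((w, h), Image.BICUBIC)
    return lr, bicubic_up


# Synthetic stand-in for a real HR photo.
hr_image = Image.new("RGB", (101, 67), color=(120, 60, 200))
lr_image, bicubic_image = make_lr_and_bicubic(hr_image, scale=2)
print(lr_image.size, bicubic_image.size)
```

With a real photo, `bicubic_image` looks visibly softer than the HR original, and that loss of detail is what a super-resolution model tries to recover.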

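Model description

Convolutional neural network (CNN) depth is of crucial importance for image super-resolution (SR). However, we observe that deeper networks for image SR are more difficult to train. The low-resolution inputs and features contain abundant low-frequency information, which is treated equally across channels, hence hindering the representational ability of CNNs. To solve these problems, we propose the very deep residual channel attention networks (RCAN). Specifically, we propose a residual in residual (RIR) structure to form a very deep network, which consists of several residual groups with long skip connections. Each residual group contains some residual blocks with short skip connections. Meanwhile, RIR allows abundant low-frequency information to be bypassed through multiple skip connections, making the main network focus on learning high-frequency information. Furthermore, we propose a channel attention mechanism to adaptively rescale channel-wise features by considering interdependencies among channels. Extensive experiments show that our RCAN achieves better accuracy and visual improvements against state-of-the-art methods.

This model also applies the balanced attention (BAM) method invented by Wang et al. (2021) to further improve the results.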
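To make the channel attention idea more concrete, here is a minimal NumPy sketch of the squeeze, gate and rescale steps for a single feature map. It is an illustration under assumptions, not the code used by this model: the function name, the tensor sizes and the random weights are invented for the example, biases are omitted, and the balanced attention (BAM) variant is not shown.

```python
import numpy as np


def channel_attention(x, w_down, w_up):
    """Rescale each channel of x (shape C x H x W) by a gate in (0, 1).

    Illustrative sketch only; weights would normally be learned.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = x.mean(axis=(1, 2))                          # shape (C,)
    # Bottleneck C -> C/r with ReLU, then C/r -> C with a sigmoid gate.
    hidden = np.maximum(w_down @ z, 0.0)             # shape (C/r,)
    gate = 1.0 / (1.0 + np.exp(-(w_up @ hidden)))    # shape (C,)
    # Rescale: every channel is multiplied by its own gate value.
    return x * gate[:, None, None]


rng = np.random.default_rng(0)
channels, reduction = 8, 4                           # illustrative sizes
features = rng.standard_normal((channels, 16, 16))
w_down = rng.standard_normal((channels // reduction, channels)) * 0.1
w_up = rng.standard_normal((channels, channels // reduction)) * 0.1

attended = channel_attention(features, w_down, w_up)
# Short skip connection, as in a residual block: input plus rescaled features.
block_output = features + attended
print(attended.shape, block_output.shape)
```

Because the gate depends on the pooled statistics of all channels, channels are no longer treated equally: the network can emphasise some and suppress others for each input.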

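Intended uses & limitations

You can use the pretrained models to upscale your images 2x, 3x and 4x. You can also use the trainer to train a model on your own dataset.

How to use

The model can be used with the super_image library: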

pip install super-image

Here is an example of how to use the pretrained model to upscale an image:

from super_image import RcanModel, ImageLoader
from PIL import Image
import requests

url = 'https://paperswithcode.com/media/datasets/Set5-0000002728-07a9793f_zA3bDjj.jpg'
image = Image.open(requests.get(url, stream=True).raw)

model = RcanModel.from_pretrained('eugenesiow/rcan-bam', scale=2)      # scale 2, 3 and 4 models available
inputs = ImageLoader.load_image(image)
preds = model(inputs)

ImageLoader.save_image(preds, './scaled_2x.png')                        # save the output 2x scaled image to `./scaled_2x.png`
ImageLoader.save_compare(inputs, preds, './scaled_2x_compare.png')      # save an output comparing the super-image with a bicubic scaling

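Training data

The models for 2x, 3x and 4x image super-resolution were pretrained on DIV2K, a dataset of 800 high-quality (2K resolution) training images, augmented to 4000 images, with 100 validation images (images numbered 801 to 900).

Training procedure

Preprocessing

We follow the preprocessing and training method of Wang et al. Low-resolution (LR) images are created by downscaling the high-resolution (HR) images by factors of 2, 3 and 4, using bicubic interpolation as the resizing method. During training, RGB patches of size 64×64 from the LR input are used together with their corresponding HR patches. Data augmentation is applied to the training set in the preprocessing stage, where five images are created from the four corners and the center of each original image.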
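The geometry described above can be sketched in plain Python. This is not the library's `augment_five_crop` implementation: the helper names, the image size and the crop size below are hypothetical and chosen only to show how five crops per image turn 800 images into 4000, and how a 64×64 LR patch lines up with its HR counterpart.

```python
def five_crop_boxes(width, height, crop_w, crop_h):
    """Return (left, top, right, bottom) boxes for the four corners and the center."""
    cx = (width - crop_w) // 2
    cy = (height - crop_h) // 2
    return [
        (0, 0, crop_w, crop_h),                              # top-left
        (width - crop_w, 0, width, crop_h),                  # top-right
        (0, height - crop_h, crop_w, height),                # bottom-left
        (width - crop_w, height - crop_h, width, height),    # bottom-right
        (cx, cy, cx + crop_w, cy + crop_h),                  # center
    ]


def hr_box_for_lr_patch(lr_left, lr_top, scale, lr_patch=64):
    """Map the origin of an LR patch to the aligned HR box, which is `scale` times larger."""
    size = lr_patch * scale
    left, top = lr_left * scale, lr_top * scale
    return (left, top, left + size, top + size)


# Hypothetical image and crop sizes, for illustration only.
boxes = five_crop_boxes(2040, 1356, 1020, 678)
print(len(boxes) * 800)                        # 5 crops x 800 images = 4000
print(hr_box_for_lr_patch(10, 20, scale=4))    # (40, 80, 296, 336)
```

For a 4x model, each 64×64 LR patch is paired with a 256×256 HR patch covering the same region of the scene.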

We need the Hugging Face datasets library to download the data:

pip install datasets

The following code downloads the data and then preprocesses and augments it.

from datasets import load_dataset
from super_image.data import EvalDataset, TrainDataset, augment_five_crop

augmented_dataset = load_dataset('eugenesiow/Div2k', 'bicubic_x4', split='train')\
    .map(augment_five_crop, batched=True, desc="Augmenting Dataset")                                # download and augment the data with the five_crop method
train_dataset = TrainDataset(augmented_dataset)                                                     # prepare the train dataset for loading PyTorch DataLoader
eval_dataset = EvalDataset(load_dataset('eugenesiow/Div2k', 'bicubic_x4', split='validation'))      # prepare the eval dataset for the PyTorch DataLoader

Pretraining

The model was trained on GPU. The training code is provided below:

from super_image import Trainer, TrainingArguments, RcanModel, RcanConfig

training_args = TrainingArguments(
    output_dir='./results',                 # output directory
    num_train_epochs=1000,                  # total number of training epochs
)

config = RcanConfig(
    scale=4,                                # train a model to upscale 4x
    bam=True,                               # apply balanced attention to the network
)
model = RcanModel(config)

trainer = Trainer(
    model=model,                         # the instantiated model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=eval_dataset            # evaluation dataset
)

trainer.train()

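Evaluation results

The evaluation metrics are PSNR and SSIM.

The evaluation datasets are Set5, Set14, BSD100 and Urban100.

The results columns in the table below are given as PSNR/SSIM and are compared against a bicubic interpolation baseline.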

| Dataset | Scale | Bicubic | rcan-bam |
| --- | --- | --- | --- |
| Set5 | 2x | 33.64/0.9292 | |
| Set5 | 3x | 30.39/0.8678 | |
| Set5 | 4x | 28.42/0.8101 | 30.8/0.8701 |
| Set14 | 2x | 30.22/0.8683 | |
| Set14 | 3x | 27.53/0.7737 | |
| Set14 | 4x | 25.99/0.7023 | 27.91/0.7648 |
| BSD100 | 2x | 29.55/0.8425 | |
| BSD100 | 3x | 27.20/0.7382 | |
| BSD100 | 4x | 25.96/0.6672 | 27.91/0.7477 |
| Urban100 | 2x | 26.66/0.8408 | |
| Urban100 | 3x | | |
| Urban100 | 4x | 23.14/0.6573 | 24.75/0.7346 |
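As a reminder of what the PSNR half of each table entry measures, here is a small standard-library sketch of the usual definition, 10·log10(MAX² / MSE). It works on flat sequences of pixel values and is only an illustration: the function and the sample values are made up, and the evaluation protocol behind the table (for example colour space or border handling) may differ.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error between the ground truth and the reconstruction.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up 8-bit pixel values standing in for an HR image and a model output.
hr_pixels = [52, 55, 61, 59, 79, 61, 76, 61]
sr_pixels = [54, 55, 60, 59, 77, 62, 76, 60]
print(round(psnr(hr_pixels, sr_pixels), 2))
```

Higher is better: a smaller mean squared error against the ground truth gives a larger PSNR, which is why the model column should exceed the bicubic column.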

You can find a notebook below to easily run evaluation on the pretrained models:

BibTeX entry and citation info

@misc{wang2021bam,
    title={BAM: A Lightweight and Efficient Balanced Attention Mechanism for Single Image Super Resolution}, 
    author={Fanyi Wang and Haotian Hu and Cheng Shen},
    year={2021},
    eprint={2104.07566},
    archivePrefix={arXiv},
    primaryClass={eess.IV}
}

@misc{zhang2018image,
      title={Image Super-Resolution Using Very Deep Residual Channel Attention Networks}, 
      author={Yulun Zhang and Kunpeng Li and Kai Li and Lichen Wang and Bineng Zhong and Yun Fu},
      year={2018},
      eprint={1807.02758},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Author:

Eugene Siow

Dataset size:

60.87 MB