
Model card for regnetz_d8_evos.ch_in1k

A RegNetZ image classification model. Trained on ImageNet-1k by Ross Wightman in timm.

These RegNetZ B / C / D models explore different group sizes and layer configurations and do not follow the description in any paper. Like EfficientNets, the architecture uses linear (non-activated) block outputs and an inverted bottleneck (mid-block expansion).

  • B16: ~1.5GF @ 256x256 with a group width of 16. Single-layer stem.
  • C16: ~2.5GF @ 256x256 with a group width of 16. Single-layer stem.
  • D32: ~6GF @ 256x256 with a group width of 32. Tiered 3-layer stem, no pooling in stem.
  • D8: ~4GF @ 256x256 with a group width of 8. Tiered 3-layer stem, no pooling in stem.
  • E8: ~10GF @ 256x256 with a group width of 8. Tiered 3-layer stem, no pooling in stem.

This model uses a custom EvoNorm-S0 normalization-activation layer instead of BatchNorm with SiLU activations.
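
To make the norm-act fusion concrete, below is a minimal sketch of the EvoNorm-S0 computation, assuming the grouped-standard-deviation form from the paper. This is an illustration only; timm ships its own, more configurable implementation in timm.layers.

import torch
import torch.nn as nn

class EvoNormS0(nn.Module):
    # Simplified sketch of EvoNorm-S0 (Liu et al., 2020):
    #   y = x * sigmoid(v * x) / group_std(x) * gamma + beta
    def __init__(self, num_features, groups=8, eps=1e-5):
        super().__init__()
        self.groups = groups
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_features, 1, 1))
        self.v = nn.Parameter(torch.ones(1, num_features, 1, 1))

    def group_std(self, x):
        # std over each channel group and all spatial positions, per sample
        n, c, h, w = x.shape
        xg = x.view(n, self.groups, c // self.groups, h, w)
        std = (xg.var(dim=(2, 3, 4), keepdim=True) + self.eps).sqrt()
        return std.expand_as(xg).reshape(n, c, h, w)

    def forward(self, x):
        return x * torch.sigmoid(self.v * x) / self.group_std(x) * self.gamma + self.beta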

The model architecture is implemented using timm's flexible BYOBNet (Bring-Your-Own-Blocks Network).

BYOBNet allows configuration of:

  • block / stage layout
  • stem layout
  • output stride (dilation)
  • activation and norm layers
  • channel and spatial / self-attention layers

...and also includes timm features common to many other architectures, including (see the sketch after this list):

  • stochastic depth
  • gradient checkpointing
  • layer-wise LR decay
  • per-stage feature extraction
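
Below is a sketch of how several of these features are reached through timm's standard interfaces; the kwarg names follow recent timm releases and support can vary by version.

import timm

# stochastic depth and a dilated output stride are plain create_model kwargs
model = timm.create_model(
    'regnetz_d8_evos.ch_in1k',
    pretrained=True,
    drop_path_rate=0.1,  # stochastic depth rate
    output_stride=16,    # dilate the final stage instead of striding
)

# gradient checkpointing trades compute for memory during training
model.set_grad_checkpointing(True)

# layer-wise LR decay is applied at optimizer construction, e.g. via
# timm.optim.create_optimizer_v2(model, 'adamw', lr=1e-3, layer_decay=0.7)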

Model Details

Model Usage

Image Classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnetz_d8_evos.ch_in1k', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
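
To turn the top-5 indices into human-readable labels, one option is to load a class-name list. A minimal sketch, assuming the plain-text ImageNet label file published with the PyTorch hub examples (the URL and file layout are assumptions, not part of this model card):

# hypothetical label decoding: map class indices to ImageNet-1k names
class_names = urlopen(
    'https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt'
).read().decode().splitlines()

for prob, idx in zip(top5_probabilities[0], top5_class_indices[0]):
    print(f'{class_names[idx.item()]}: {prob.item():.2f}%')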

Feature Map Extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnetz_d8_evos.ch_in1k',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 64, 128, 128])
    #  torch.Size([1, 64, 64, 64])
    #  torch.Size([1, 128, 32, 32])
    #  torch.Size([1, 256, 16, 16])
    #  torch.Size([1, 1792, 8, 8])

    print(o.shape)
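
When only some stages are needed, a subset can be selected with out_indices, and per-stage channel counts and reduction factors are available via feature_info. A small sketch building on the example above:

# keep only the three deepest stages and inspect their metadata
model = timm.create_model(
    'regnetz_d8_evos.ch_in1k',
    pretrained=True,
    features_only=True,
    out_indices=(2, 3, 4),
)
print(model.feature_info.channels())   # e.g. [128, 256, 1792]
print(model.feature_info.reduction())  # e.g. [8, 16, 32]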

Image Embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnetz_d8_evos.ch_in1k',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 1792, 8, 8) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
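
A common use for these embeddings is image similarity. A minimal sketch, assuming a second PIL image img2 prepared the same way as img (img2 is hypothetical, not defined above):

import torch.nn.functional as F

# embed both images with the pre-logits head, then compare
emb1 = model.forward_head(model.forward_features(transforms(img).unsqueeze(0)), pre_logits=True)
emb2 = model.forward_head(model.forward_features(transforms(img2).unsqueeze(0)), pre_logits=True)
print(F.cosine_similarity(emb1, emb2))  # shape (1,), values in [-1, 1]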

Model Comparison

Explore this model's dataset and runtime metrics in the timm model results.

For the comparison summary below, the weights tagged ra_in1k, ra3_in1k, ch_in1k, sw_*, and lion_* were trained in timm.

model img_size top1 top5 param_count gmacs macts
12311321 384 88.228 98.684 644.81 374.99 210.2
12312321 384 86.84 98.364 145.05 95.0 88.87
12313321 384 86.024 98.05 83.59 46.87 67.67
12314321 288 86.004 97.83 83.59 26.37 38.07
12315321 224 85.996 97.848 644.81 127.66 71.58
12316321 288 85.982 97.844 83.59 26.37 38.07
12314321 224 85.574 97.666 83.59 15.96 23.04
12316321 224 85.564 97.674 83.59 15.96 23.04
12319321 288 85.398 97.584 51.82 20.06 35.34
12320321 384 85.15 97.436 1282.6 747.83 296.49
12321321 320 85.036 97.268 57.7 15.46 63.94
12319321 224 84.976 97.416 51.82 12.14 21.38
12323321 224 84.56 97.446 145.05 32.34 30.26
12324321 320 84.496 97.004 28.94 6.43 37.94
12321321 256 84.436 97.02 57.7 9.91 40.94
12326321 384 84.432 97.092 644.81 374.99 210.2
12327321 320 84.246 96.93 27.12 6.35 37.78
12328321 320 84.054 96.992 23.37 6.19 37.08
12329321 320 84.038 96.992 23.46 7.03 38.92
12330321 320 84.022 96.866 27.58 9.33 37.08
12331321 288 83.932 96.888 39.18 13.22 29.69
12332321 384 83.912 96.924 281.38 188.47 124.83
12333321 224 83.778 97.286 83.59 15.96 23.04
12324321 256 83.776 96.704 28.94 4.12 24.29
12335321 288 83.72 96.75 30.58 10.55 27.11
12336321 288 83.718 96.724 30.58 10.56 27.11
12337321 288 83.69 96.778 83.59 26.37 38.07
12327321 256 83.62 96.704 27.12 4.06 24.19
12328321 256 83.438 96.776 23.37 3.97 23.74
12330321 256 83.424 96.632 27.58 5.98 23.74
12329321 256 83.36 96.636 23.46 4.5 24.92
12342321 384 83.35 96.71 145.05 95.0 88.87
12343321 288 83.204 96.66 20.64 6.6 20.3
12344321 224 83.162 96.42 145.05 32.34 30.26
12331321 224 83.16 96.486 39.18 8.0 17.97
12335321 224 83.108 96.458 30.58 6.39 16.41
12347321 288 83.044 96.5 20.65 6.61 20.3
12336321 224 83.02 96.292 30.58 6.39 16.41
12337321 224 82.974 96.502 83.59 15.96 23.04
12350321 224 82.816 96.208 107.81 31.81 36.3
12351321 288 82.742 96.418 19.44 5.29 18.61
12352321 224 82.634 96.22 83.59 15.96 23.04
12353321 320 82.634 96.472 13.49 3.86 25.88
12354321 224 82.592 96.246 39.38 8.51 19.73
12355321 224 82.564 96.052 54.28 15.99 25.52
12356321 320 82.51 96.358 13.46 3.92 25.88
12343321 224 82.44 96.198 20.64 4.0 12.29
12347321 224 82.304 96.078 20.65 4.0 12.29
12356321 256 82.16 96.048 13.46 2.51 16.57
12353321 256 81.936 96.15 13.49 2.48 16.57
12351321 224 81.924 95.988 19.44 3.2 11.26
12362321 224 81.77 95.842 19.44 3.2 11.26
12363321 224 81.552 95.544 39.57 8.02 14.06
12364321 224 80.924 95.27 15.3 3.2 11.37
12365321 224 80.804 95.246 145.05 32.34 30.26
12366321 288 80.712 95.47 9.72 2.39 16.43
12367321 224 80.66 95.334 11.2 1.63 8.04
12368321 224 80.37 95.12 51.82 12.14 21.38
12369321 224 80.288 94.964 83.59 15.96 23.04
12370321 224 80.246 95.01 107.81 31.81 36.3
12371321 224 79.882 94.834 39.18 8.0 17.97
12366321 224 79.872 94.974 9.72 1.45 9.95
12373321 224 79.862 94.828 54.28 15.99 25.52
12374321 224 79.716 94.772 30.58 6.39 16.41
12375321 224 79.592 94.738 46.11 12.13 21.37
12376321 224 79.44 94.772 9.19 1.62 7.93
12377321 224 79.23 94.654 20.65 4.0 12.29
12378321 224 79.198 94.55 39.57 8.02 14.06
12379321 224 79.064 94.454 26.21 6.49 16.37
12380321 224 78.884 94.412 19.44 3.2 11.26
12381321 224 78.654 94.388 6.43 0.84 5.42
12382321 224 78.482 94.24 22.12 3.99 12.2
12383321 224 78.178 94.08 15.3 3.2 11.37
12384321 224 77.862 93.73 11.2 1.63 8.04
12385321 224 77.302 93.672 7.26 0.81 5.15
12386321 224 76.908 93.418 9.19 1.62 7.93
12387321 224 76.296 93.05 6.26 0.81 5.25
12388321 224 75.592 92.712 4.34 0.41 3.89
12389321 224 75.244 92.518 6.06 0.61 4.33
12390321 224 75.042 92.342 7.26 0.81 5.15
12391321 224 74.57 92.184 5.5 0.42 3.17
12392321 224 74.018 91.764 4.34 0.41 3.89
12393321 224 73.862 91.67 6.2 0.61 3.98
12394321 224 72.38 90.832 5.16 0.4 3.14
12395321 224 70.282 89.534 3.16 0.2 2.17
12396321 224 68.752 88.556 2.68 0.2 2.16
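
Any of the weights above can be instantiated by name with timm.create_model; timm.list_models enumerates what a given timm install provides:

import timm

# list pretrained RegNetZ weights; broaden the filter to 'regnet*' for all variants
print(timm.list_models('regnetz_*', pretrained=True))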

Citation

@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
@InProceedings{Dollar2021,
  title = {Fast and Accurate Model Scaling},
  author = {Piotr Doll{\'a}r and Mannat Singh and Ross Girshick},
  booktitle = {CVPR},
  year = {2021}
}
@article{liu2020evolving,
  title={Evolving normalization-activation layers},
  author={Liu, Hanxiao and Brock, Andy and Simonyan, Karen and Le, Quoc},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  pages={13539--13550},
  year={2020}
}