
Model card for regnety_1280.seer

A RegNetY-128GF feature / backbone model. Pretrained per SEER, with SwAV self-supervised learning on "2 billion random internet images".

SEER uses the SEER License, Copyright © Meta Platforms, Inc. The license restricts use and distribution to non-commercial purposes.

The RegNet implementation in timm includes a number of enhancements not present in other implementations, including the following (a few of these options are exercised in the sketch after the list):

  • stochastic depth
  • gradient checkpointing
  • layer-wise LR decay
  • configurable output stride (dilation)
  • configurable activation and norm layers
  • option for a pre-activation bottleneck block used in the RegNetV variant
  • the only known RegNetZ model definitions with pretrained weights
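
As a minimal sketch of how some of these are exposed (assuming a recent timm version: drop_path_rate and output_stride are documented model arguments, and set_grad_checkpointing and create_optimizer_v2's layer_decay are the usual hooks, though exact support can vary by model):

import timm
from timm.optim import create_optimizer_v2

# stochastic depth and dilation are per-model keyword arguments
model = timm.create_model(
    'regnety_1280.seer',
    pretrained=True,
    drop_path_rate=0.1,  # stochastic depth drop rate
    output_stride=16,    # use dilation instead of stride in later stages
)

# gradient checkpointing: recompute activations in backward to save memory
model.set_grad_checkpointing(True)

# layer-wise LR decay via timm's optimizer factory
optimizer = create_optimizer_v2(model, 'adamw', lr=1e-3, layer_decay=0.75)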

Model Details

Model Usage

Image Classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnety_1280.seer', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
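
A small usage note (generic PyTorch, not from the original card): wrapping the forward pass in torch.inference_mode() disables autograd tracking, which saves memory and time when only predictions are needed:

import torch

with torch.inference_mode():  # no gradient tracking during inference
    output = model(transforms(img).unsqueeze(0))
    top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)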

Feature Map Extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_1280.seer',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 32, 112, 112])
    #  torch.Size([1, 528, 56, 56])
    #  torch.Size([1, 1056, 28, 28])
    #  torch.Size([1, 2904, 14, 14])
    #  torch.Size([1, 7392, 7, 7])

    print(o.shape)
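
Continuing from the snippet above, the per-stage channel counts and downsampling factors can also be queried without a forward pass; feature_info is the standard attribute on timm features-only models (a minimal sketch against the current timm API):

# channel count and reduction factor of each returned feature map
print(model.feature_info.channels())   # [32, 528, 1056, 2904, 7392]
print(model.feature_info.reduction())  # [2, 4, 8, 16, 32]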

Image Embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_1280.seer',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 7392, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
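
As a generic follow-up (plain PyTorch, not part of the original card), pooled embeddings like these are commonly compared with cosine similarity; continuing from the snippet above, img2 below stands for a hypothetical second PIL image:

import torch.nn.functional as F

emb_a = model(transforms(img).unsqueeze(0))   # (1, num_features)
emb_b = model(transforms(img2).unsqueeze(0))  # img2: hypothetical second image

# cosine similarity of the two embeddings, in [-1, 1]
print(F.cosine_similarity(emb_a, emb_b).item())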

Model Comparison

Explore the dataset and runtime metrics of this model in timm model results.

For the comparison summary below, the ra_in1k, ra3_in1k, ch_in1k, sw_*, and lion_* tagged weights are trained in timm. In the table, param_count is in millions of parameters, gmacs is billions of multiply-accumulates, and macts is millions of activations; the param_count column can be reproduced in plain PyTorch, as sketched after the table.

model img_size top1 top5 param_count gmacs macts
12311321 384 88.228 98.684 644.81 374.99 210.2
12312321 384 86.84 98.364 145.05 95.0 88.87
12313321 384 86.024 98.05 83.59 46.87 67.67
12314321 288 86.004 97.83 83.59 26.37 38.07
12315321 224 85.996 97.848 644.81 127.66 71.58
12316321 288 85.982 97.844 83.59 26.37 38.07
12314321 224 85.574 97.666 83.59 15.96 23.04
12316321 224 85.564 97.674 83.59 15.96 23.04
12319321 288 85.398 97.584 51.82 20.06 35.34
12320321 384 85.15 97.436 1282.6 747.83 296.49
12321321 320 85.036 97.268 57.7 15.46 63.94
12319321 224 84.976 97.416 51.82 12.14 21.38
12323321 224 84.56 97.446 145.05 32.34 30.26
12324321 320 84.496 97.004 28.94 6.43 37.94
12321321 256 84.436 97.02 57.7 9.91 40.94
12326321 384 84.432 97.092 644.81 374.99 210.2
12327321 320 84.246 96.93 27.12 6.35 37.78
12328321 320 84.054 96.992 23.37 6.19 37.08
12329321 320 84.038 96.992 23.46 7.03 38.92
12330321 320 84.022 96.866 27.58 9.33 37.08
12331321 288 83.932 96.888 39.18 13.22 29.69
12332321 384 83.912 96.924 281.38 188.47 124.83
12333321 224 83.778 97.286 83.59 15.96 23.04
12324321 256 83.776 96.704 28.94 4.12 24.29
12335321 288 83.72 96.75 30.58 10.55 27.11
12336321 288 83.718 96.724 30.58 10.56 27.11
12337321 288 83.69 96.778 83.59 26.37 38.07
12327321 256 83.62 96.704 27.12 4.06 24.19
12328321 256 83.438 96.776 23.37 3.97 23.74
12330321 256 83.424 96.632 27.58 5.98 23.74
12329321 256 83.36 96.636 23.46 4.5 24.92
12342321 384 83.35 96.71 145.05 95.0 88.87
12343321 288 83.204 96.66 20.64 6.6 20.3
12344321 224 83.162 96.42 145.05 32.34 30.26
12331321 224 83.16 96.486 39.18 8.0 17.97
12335321 224 83.108 96.458 30.58 6.39 16.41
12347321 288 83.044 96.5 20.65 6.61 20.3
12336321 224 83.02 96.292 30.58 6.39 16.41
12337321 224 82.974 96.502 83.59 15.96 23.04
12350321 224 82.816 96.208 107.81 31.81 36.3
12351321 288 82.742 96.418 19.44 5.29 18.61
12352321 224 82.634 96.22 83.59 15.96 23.04
12353321 320 82.634 96.472 13.49 3.86 25.88
12354321 224 82.592 96.246 39.38 8.51 19.73
12355321 224 82.564 96.052 54.28 15.99 25.52
12356321 320 82.51 96.358 13.46 3.92 25.88
12343321 224 82.44 96.198 20.64 4.0 12.29
12347321 224 82.304 96.078 20.65 4.0 12.29
12356321 256 82.16 96.048 13.46 2.51 16.57
12353321 256 81.936 96.15 13.49 2.48 16.57
12351321 224 81.924 95.988 19.44 3.2 11.26
12362321 224 81.77 95.842 19.44 3.2 11.26
12363321 224 81.552 95.544 39.57 8.02 14.06
12364321 224 80.924 95.27 15.3 3.2 11.37
12365321 224 80.804 95.246 145.05 32.34 30.26
12366321 288 80.712 95.47 9.72 2.39 16.43
12367321 224 80.66 95.334 11.2 1.63 8.04
12368321 224 80.37 95.12 51.82 12.14 21.38
12369321 224 80.288 94.964 83.59 15.96 23.04
12370321 224 80.246 95.01 107.81 31.81 36.3
12371321 224 79.882 94.834 39.18 8.0 17.97
12366321 224 79.872 94.974 9.72 1.45 9.95
12373321 224 79.862 94.828 54.28 15.99 25.52
12374321 224 79.716 94.772 30.58 6.39 16.41
12375321 224 79.592 94.738 46.11 12.13 21.37
12376321 224 79.44 94.772 9.19 1.62 7.93
12377321 224 79.23 94.654 20.65 4.0 12.29
12378321 224 79.198 94.55 39.57 8.02 14.06
12379321 224 79.064 94.454 26.21 6.49 16.37
12380321 224 78.884 94.412 19.44 3.2 11.26
12381321 224 78.654 94.388 6.43 0.84 5.42
12382321 224 78.482 94.24 22.12 3.99 12.2
12383321 224 78.178 94.08 15.3 3.2 11.37
12384321 224 77.862 93.73 11.2 1.63 8.04
12385321 224 77.302 93.672 7.26 0.81 5.15
12386321 224 76.908 93.418 9.19 1.62 7.93
12387321 224 76.296 93.05 6.26 0.81 5.25
12388321 224 75.592 92.712 4.34 0.41 3.89
12389321 224 75.244 92.518 6.06 0.61 4.33
12390321 224 75.042 92.342 7.26 0.81 5.15
12391321 224 74.57 92.184 5.5 0.42 3.17
12392321 224 74.018 91.764 4.34 0.41 3.89
12393321 224 73.862 91.67 6.2 0.61 3.98
12394321 224 72.38 90.832 5.16 0.4 3.14
12395321 224 70.282 89.534 3.16 0.2 2.17
12396321 224 68.752 88.556 2.68 0.2 2.16
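
As noted above the table, the param_count column (millions of parameters) is straightforward to reproduce; a minimal sketch:

import timm

m = timm.create_model('regnety_1280.seer', pretrained=False)
n_params = sum(p.numel() for p in m.parameters())
print(f'{n_params / 1e6:.2f}M')  # ~644.81M, matching the regnety_1280 rows above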

Citation

@article{goyal2022vision,
  title={Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision}, 
  author={Priya Goyal and Quentin Duval and Isaac Seessel and Mathilde Caron and Ishan Misra and Levent Sagun and Armand Joulin and Piotr Bojanowski},
  year={2022},
  eprint={2202.08360},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}