
Model card for regnety_1280.seer

A RegNetY-128GF feature / backbone model. Pretrained per SEER, with SwAV self-supervised learning on "2 billion random internet images".

SEER uses the SEER License, Copyright © Meta Platforms, Inc. The license restricts use and distribution to non-commercial purposes.

The RegNet implementation in timm includes a number of enhancements not present in other implementations, including the following (a few of these options are exercised in the sketch after the list):

  • stochastic depth
  • gradient checkpointing
  • layer-wise LR decay
  • configurable output stride (dilation)
  • configurable activation and norm layers
  • option for a pre-activation bottleneck block used in the RegNetV variant
  • the only known RegNetZ model definitions with pretrained weights
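
As a minimal sketch of how some of these are exposed (assuming a recent timm version: drop_path_rate and output_stride are documented model arguments, and set_grad_checkpointing and create_optimizer_v2's layer_decay are the usual hooks, though exact support can vary by model):

import timm
from timm.optim import create_optimizer_v2

# stochastic depth and dilation are per-model keyword arguments
model = timm.create_model(
    'regnety_1280.seer',
    pretrained=True,
    drop_path_rate=0.1,  # stochastic depth drop rate
    output_stride=16,    # use dilation instead of stride in later stages
)

# gradient checkpointing: recompute activations in backward to save memory
model.set_grad_checkpointing(True)

# layer-wise LR decay via timm's optimizer factory
optimizer = create_optimizer_v2(model, 'adamw', lr=1e-3, layer_decay=0.75)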

Model Details

Model Usage

Image Classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('regnety_1280.seer', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
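
A small usage note (generic PyTorch, not from the original card): wrapping the forward pass in torch.inference_mode() disables autograd tracking, which saves memory and time when only predictions are needed:

import torch

with torch.inference_mode():  # no gradient tracking during inference
    output = model(transforms(img).unsqueeze(0))
    top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)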

Feature Map Extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_1280.seer',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 32, 112, 112])
    #  torch.Size([1, 528, 56, 56])
    #  torch.Size([1, 1056, 28, 28])
    #  torch.Size([1, 2904, 14, 14])
    #  torch.Size([1, 7392, 7, 7])

    print(o.shape)
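
Continuing from the snippet above, the per-stage channel counts and downsampling factors can also be queried without a forward pass; feature_info is the standard attribute on timm features-only models (a minimal sketch against the current timm API):

# channel count and reduction factor of each returned feature map
print(model.feature_info.channels())   # [32, 528, 1056, 2904, 7392]
print(model.feature_info.reduction())  # [2, 4, 8, 16, 32]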

Image Embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'regnety_1280.seer',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 7392, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
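
As a generic follow-up (plain PyTorch, not part of the original card), pooled embeddings like these are commonly compared with cosine similarity; continuing from the snippet above, img2 below stands for a hypothetical second PIL image:

import torch.nn.functional as F

emb_a = model(transforms(img).unsqueeze(0))   # (1, num_features)
emb_b = model(transforms(img2).unsqueeze(0))  # img2: hypothetical second image

# cosine similarity of the two embeddings, in [-1, 1]
print(F.cosine_similarity(emb_a, emb_b).item())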

Model Comparison

Explore the dataset and runtime metrics of this model in timm model results.

For the comparison summary below, the ra_in1k, ra3_in1k, ch_in1k, sw_*, and lion_* tagged weights are trained in timm. In the table, param_count is in millions of parameters, gmacs is billions of multiply-accumulates, and macts is millions of activations; the param_count column can be reproduced in plain PyTorch, as sketched after the table.

model img_size top1 top5 param_count gmacs macts
12311321 384 88.228 98.684 644.81 374.99 210.2
12312321 384 86.84 98.364 145.05 95.0 88.87
12313321 384 86.024 98.05 83.59 46.87 67.67
12314321 288 86.004 97.83 83.59 26.37 38.07
12315321 224 85.996 97.848 644.81 127.66 71.58
12316321 288 85.982 97.844 83.59 26.37 38.07
12314321 224 85.574 97.666 83.59 15.96 23.04
12316321 224 85.564 97.674 83.59 15.96 23.04
12319321 288 85.398 97.584 51.82 20.06 35.34
12320321 384 85.15 97.436 1282.6 747.83 296.49
12321321 320 85.036 97.268 57.7 15.46 63.94
12319321 224 84.976 97.416 51.82 12.14 21.38
12323321 224 84.56 97.446 145.05 32.34 30.26
12324321 320 84.496 97.004 28.94 6.43 37.94
12321321 256 84.436 97.02 57.7 9.91 40.94
12326321 384 84.432 97.092 644.81 374.99 210.2
12327321 320 84.246 96.93 27.12 6.35 37.78
12328321 320 84.054 96.992 23.37 6.19 37.08
12329321 320 84.038 96.992 23.46 7.03 38.92
12330321 320 84.022 96.866 27.58 9.33 37.08
12331321 288 83.932 96.888 39.18 13.22 29.69
12332321 384 83.912 96.924 281.38 188.47 124.83
12333321 224 83.778 97.286 83.59 15.96 23.04
12324321 256 83.776 96.704 28.94 4.12 24.29
12335321 288 83.72 96.75 30.58 10.55 27.11
12336321 288 83.718 96.724 30.58 10.56 27.11
12337321 288 83.69 96.778 83.59 26.37 38.07
12327321 256 83.62 96.704 27.12 4.06 24.19
12328321 256 83.438 96.776 23.37 3.97 23.74
12330321 256 83.424 96.632 27.58 5.98 23.74
12329321 256 83.36 96.636 23.46 4.5 24.92
12342321 384 83.35 96.71 145.05 95.0 88.87
12343321 288 83.204 96.66 20.64 6.6 20.3
12344321 224 83.162 96.42 145.05 32.34 30.26
12331321 224 83.16 96.486 39.18 8.0 17.97
12335321 224 83.108 96.458 30.58 6.39 16.41
12347321 288 83.044 96.5 20.65 6.61 20.3
12336321 224 83.02 96.292 30.58 6.39 16.41
12337321 224 82.974 96.502 83.59 15.96 23.04
12350321 224 82.816 96.208 107.81 31.81 36.3
12351321 288 82.742 96.418 19.44 5.29 18.61
12352321 224 82.634 96.22 83.59 15.96 23.04
12353321 320 82.634 96.472 13.49 3.86 25.88
12354321 224 82.592 96.246 39.38 8.51 19.73
12355321 224 82.564 96.052 54.28 15.99 25.52
12356321 320 82.51 96.358 13.46 3.92 25.88
12343321 224 82.44 96.198 20.64 4.0 12.29
12347321 224 82.304 96.078 20.65 4.0 12.29
12356321 256 82.16 96.048 13.46 2.51 16.57
12353321 256 81.936 96.15 13.49 2.48 16.57
12351321 224 81.924 95.988 19.44 3.2 11.26
12362321 224 81.77 95.842 19.44 3.2 11.26
12363321 224 81.552 95.544 39.57 8.02 14.06
12364321 224 80.924 95.27 15.3 3.2 11.37
12365321 224 80.804 95.246 145.05 32.34 30.26
12366321 288 80.712 95.47 9.72 2.39 16.43
12367321 224 80.66 95.334 11.2 1.63 8.04
12368321 224 80.37 95.12 51.82 12.14 21.38
12369321 224 80.288 94.964 83.59 15.96 23.04
12370321 224 80.246 95.01 107.81 31.81 36.3
12371321 224 79.882 94.834 39.18 8.0 17.97
12366321 224 79.872 94.974 9.72 1.45 9.95
12373321 224 79.862 94.828 54.28 15.99 25.52
12374321 224 79.716 94.772 30.58 6.39 16.41
12375321 224 79.592 94.738 46.11 12.13 21.37
12376321 224 79.44 94.772 9.19 1.62 7.93
12377321 224 79.23 94.654 20.65 4.0 12.29
12378321 224 79.198 94.55 39.57 8.02 14.06
12379321 224 79.064 94.454 26.21 6.49 16.37
12380321 224 78.884 94.412 19.44 3.2 11.26
12381321 224 78.654 94.388 6.43 0.84 5.42
12382321 224 78.482 94.24 22.12 3.99 12.2
12383321 224 78.178 94.08 15.3 3.2 11.37
12384321 224 77.862 93.73 11.2 1.63 8.04
12385321 224 77.302 93.672 7.26 0.81 5.15
12386321 224 76.908 93.418 9.19 1.62 7.93
12387321 224 76.296 93.05 6.26 0.81 5.25
12388321 224 75.592 92.712 4.34 0.41 3.89
12389321 224 75.244 92.518 6.06 0.61 4.33
12390321 224 75.042 92.342 7.26 0.81 5.15
12391321 224 74.57 92.184 5.5 0.42 3.17
12392321 224 74.018 91.764 4.34 0.41 3.89
12393321 224 73.862 91.67 6.2 0.61 3.98
12394321 224 72.38 90.832 5.16 0.4 3.14
12395321 224 70.282 89.534 3.16 0.2 2.17
12396321 224 68.752 88.556 2.68 0.2 2.16
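
As noted above the table, the param_count column (millions of parameters) is straightforward to reproduce; a minimal sketch:

import timm

m = timm.create_model('regnety_1280.seer', pretrained=False)
n_params = sum(p.numel() for p in m.parameters())
print(f'{n_params / 1e6:.2f}M')  # ~644.81M, matching the regnety_1280 rows above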

Citation

@article{goyal2022vision,
  title={Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision}, 
  author={Priya Goyal and Quentin Duval and Isaac Seessel and Mathilde Caron and Ishan Misra and Levent Sagun and Armand Joulin and Piotr Bojanowski},
  year={2022},
  eprint={2202.08360},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}