Model card for seresnextaa101d_32x8d.sw_in12k

An SE-ResNeXt-D (Rect-2 anti-aliasing) image classification model with Squeeze-and-Excitation channel attention.

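This model features:

  • ReLU activations
  • 3-layer stem of three 3x3 convolutions with pooling
  • 2x2 average pool + 1x1 convolution shortcut downsample
  • Grouped 3x3 bottleneck convolutions
  • Squeeze-and-Excitation channel attention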
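
Two of the features listed above, Squeeze-and-Excitation channel attention and the average-pool shortcut downsample, can be sketched in a few lines of PyTorch. This is a minimal illustration, not timm's actual implementation: the names `SqueezeExcite` and `avg_down_shortcut`, the reduction ratio of 16, and the BatchNorm after the 1x1 convolution are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class SqueezeExcite(nn.Module):
    """Illustrative Squeeze-and-Excitation block (hypothetical, not timm's module)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        reduced = max(channels // reduction, 1)  # assumed reduction ratio
        self.fc1 = nn.Conv2d(channels, reduced, kernel_size=1)
        self.act = nn.ReLU(inplace=True)
        self.fc2 = nn.Conv2d(reduced, channels, kernel_size=1)
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squeeze: global average over the spatial dimensions -> (N, C, 1, 1)
        s = x.mean(dim=(2, 3), keepdim=True)
        # Excite: small bottleneck producing one gate in (0, 1) per channel
        s = self.gate(self.fc2(self.act(self.fc1(s))))
        # Rescale every channel of the input by its gate
        return x * s


def avg_down_shortcut(in_ch: int, out_ch: int) -> nn.Sequential:
    """Shortcut downsample: 2x2 average pool, then a 1x1 convolution."""
    return nn.Sequential(
        nn.AvgPool2d(kernel_size=2, stride=2),
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch),  # assumed normalization layer
    )


x = torch.randn(2, 256, 56, 56)
se_out = SqueezeExcite(256)(x)              # same shape as x
short_out = avg_down_shortcut(256, 512)(x)  # (2, 512, 28, 28)
print(se_out.shape, short_out.shape)
```

The attention block leaves the tensor shape unchanged, while the shortcut halves the spatial resolution and changes the channel count so it can be added to the output of a strided residual branch.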

Trained on ImageNet-12k in timm by Ross Wightman, using the recipe template described below.

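Recipe details:

  • Based on the Swin Transformer train / pretrain recipe with modifications (related to both the DeiT and ConvNeXt recipes)
  • AdamW optimizer, gradient clipping, EMA weight averaging
  • Cosine LR schedule with warmup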
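
The schedule and averaging steps named in the recipe can be illustrated with plain Python, using scalars in place of tensors. This is only a sketch of the general techniques: the function names are hypothetical, and every hyperparameter value (step counts, learning rate, decay, clipping norm) is an assumption for the example, since the card does not state the values used for this model.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Linear warmup followed by cosine decay (illustrative values only)."""
    if step < warmup_steps:
        # Ramp linearly up to base_lr over the warmup steps
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def clip_by_global_norm(grads, max_norm):
    """Scale all gradients down together if their L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (norm + 1e-6))
    return [g * scale for g in grads]


def ema_update(ema_weights, weights, decay=0.999):
    """One exponential-moving-average step over a list of weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


schedule = [lr_at_step(s, total_steps=100, warmup_steps=10, base_lr=1e-3) for s in range(100)]
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # norm 5 -> scaled to norm 1
ema = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.9)

print(max(schedule), schedule[-1])
print(clipped, ema)
```

In a real training loop the EMA copy of the weights, not the raw weights, is typically the one evaluated and published.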

Model Details

Model Usage

Image Classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('seresnextaa101d_32x8d.sw_in12k', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
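
To make the last line concrete, the following sketch applies the same softmax and top-k steps to a small hand-written logits tensor and prints the result. The ten-class tensor is a stand-in assumed for illustration; the real model emits one logit per class of its training label set.

```python
import torch

# Stand-in logits for a batch of 1 image and 10 classes (assumed values)
logits = torch.tensor([[0.1, 2.0, -1.0, 0.5, 3.0, 0.0, 1.0, -0.5, 0.2, 0.3]])

# Softmax turns logits into probabilities; * 100 expresses them as percentages
probs = logits.softmax(dim=1) * 100
top5_probabilities, top5_class_indices = torch.topk(probs, k=5)

# Both results have shape (batch_size, 5); index [0] selects the single image
for p, idx in zip(top5_probabilities[0].tolist(), top5_class_indices[0].tolist()):
    print(f"class {idx}: {p:.2f}%")
```

The class indices refer to positions in the model's label list, so mapping them to human-readable names requires the label file that matches the pretrained weights.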

Feature Map Extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'seresnextaa101d_32x8d.sw_in12k',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 64, 112, 112])
    #  torch.Size([1, 256, 56, 56])
    #  torch.Size([1, 512, 28, 28])
    #  torch.Size([1, 1024, 14, 14])
    #  torch.Size([1, 2048, 7, 7])

    print(o.shape)

Image Embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'seresnextaa101d_32x8d.sw_in12k',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)

output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 2048, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
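
The relationship between the two shapes noted above can be shown with a random stand-in tensor. This sketch assumes the model's head uses global average pooling, which is believed to be the default for ResNet-family models in timm but is not stated in this card.

```python
import torch

# Random stand-in for the unpooled output of forward_features
feature_map = torch.randn(1, 2048, 7, 7)

# Global average pooling over the 7x7 spatial grid (assumed pooling type)
embedding = feature_map.mean(dim=(2, 3))

print(embedding.shape)  # torch.Size([1, 2048])
```

Each of the 2048 channels collapses to a single value, which is why the embedding length equals the channel count of the final feature map.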

Model Comparison

Explore the dataset and runtime metrics of this model in the timm model results.

model img_size top1 top5 param_count gmacs macts img/sec
12316321 320 86.72 98.17 93.6 35.2 69.7 451
12316321 288 86.51 98.08 93.6 28.5 56.4 560
12318321 288 86.49 98.03 93.6 28.5 56.4 557
12318321 224 85.96 97.82 93.6 17.2 34.2 923
12320321 224 85.11 97.44 468.5 87.3 91.1 254
12321321 416 85.0 97.12 191.9 108.4 213.8 134
12322321 352 84.96 97.22 102.1 50.2 101.2 291
12322321 320 84.73 97.18 102.1 41.5 83.7 353
12324321 384 84.71 96.99 164.0 77.6 154.7 183
12325321 288 84.57 97.08 93.6 28.5 56.4 557
12326321 320 84.45 97.08 93.2 31.5 67.8 446
12327321 352 84.43 96.97 129.9 51.1 105.5 280
12328321 288 84.36 96.92 93.6 27.6 53.0 595
12329321 320 84.35 97.04 66.8 24.1 47.7 610
12324321 288 84.3 96.94 164.0 43.7 87.1 333
12331321 224 84.28 97.17 88.8 16.5 31.2 1100
12321321 320 84.24 96.86 191.9 64.2 126.6 228
12333321 288 84.19 96.87 93.6 27.2 51.6 613
12334321 224 84.18 97.19 194.0 36.3 51.2 581
12335321 288 84.11 97.11 44.6 15.1 29.0 1144
12336321 320 83.97 96.82 64.7 31.2 67.3 518
12326321 256 83.87 96.75 93.2 20.2 43.4 692
12325321 224 83.86 96.65 93.6 17.2 34.2 923
12339321 320 83.72 96.61 86.6 24.3 48.1 617
12329321 256 83.69 96.78 66.8 15.4 30.6 943
12328321 224 83.68 96.61 93.6 16.7 32.0 986
12342321 320 83.67 96.74 60.2 24.1 47.7 706
12327321 256 83.59 96.61 129.9 27.1 55.8 526
12333321 224 83.58 96.4 93.6 16.5 31.2 1013
12335321 224 83.54 96.83 44.6 9.1 17.6 1864
12346321 288 83.46 96.54 60.2 19.1 37.3 904
12347321 224 83.35 96.85 194.0 36.3 51.2 582
12336321 256 83.23 96.53 64.7 20.0 43.1 809
12349321 224 83.22 96.75 44.2 8.0 21.2 1814
12350321 288 83.16 96.38 83.5 25.7 51.6 590
12342321 256 83.14 96.38 60.2 15.4 30.5 1096
12352321 320 83.02 96.45 44.6 16.5 34.8 992
12353321 288 82.98 96.54 44.6 13.4 28.2 1077
12354321 224 82.98 96.25 83.5 15.5 31.2 989
12339321 256 82.86 96.28 86.6 15.6 30.8 951
12356321 224 82.83 96.22 88.8 16.5 31.2 1099
12346321 224 82.8 96.13 60.2 11.6 22.6 1486
12358321 288 82.8 96.32 44.6 13.0 26.8 1291
12359321 288 82.74 95.71 60.2 19.1 37.3 905
12360321 224 82.69 96.63 88.8 16.5 31.2 1100
12361321 288 82.62 95.75 60.2 19.1 37.3 904
12362321 288 82.61 96.49 25.6 8.9 20.6 1729
12363321 288 82.53 96.13 36.8 9.9 21.5 1773
12364321 224 82.5 96.02 126.9 22.8 21.2 1078
12350321 224 82.46 95.92 83.5 15.5 31.2 987
12366321 288 82.36 96.18 35.7 8.1 20.9 1964
12367321 320 82.35 96.14 25.6 8.8 24.1 1386
12368321 288 82.31 95.63 44.6 13.0 26.8 1291
12369321 288 82.29 96.01 63.6 13.6 28.5 1078
12370321 224 82.29 96.0 60.2 11.6 22.6 1484
12371321 288 82.27 96.06 68.9 18.9 23.8 1176
12352321 256 82.26 96.07 44.6 10.6 22.2 1542
12373321 288 82.24 95.73 44.6 13.0 26.8 1290
12374321 288 82.2 96.14 27.6 7.0 23.8 1547
12353321 224 82.18 96.05 44.6 8.1 17.1 1771
12376321 224 82.17 96.22 25.0 4.3 14.4 2943
12377321 288 82.12 95.65 25.6 7.1 19.6 1704
12378321 288 82.03 95.94 25.0 7.0 23.8 1745
12379321 288 82.0 96.15 24.9 5.8 12.7 1787
12363321 256 81.99 95.85 36.8 7.8 17.0 2230
12356321 176 81.98 95.72 88.8 10.3 19.4 1768
12359321 224 81.97 95.24 60.2 11.6 22.6 1486
12358321 224 81.93 95.75 44.6 7.8 16.2 2122
12384321 224 81.9 95.77 44.6 7.8 16.2 2118
12385321 224 81.84 96.1 194.0 36.3 51.2 583
12366321 256 81.78 95.94 35.7 6.4 16.6 2471
12361321 224 81.77 95.22 60.2 11.6 22.6 1485
12362321 224 81.74 96.06 25.6 5.4 12.4 2813
12389321 288 81.65 95.54 25.6 7.1 19.6 1703
12390321 288 81.64 95.88 25.6 7.2 19.7 1694
12391321 224 81.62 96.04 88.8 16.5 31.2 1101
12392321 224 81.61 95.76 68.9 11.4 14.4 1930
12393321 288 81.61 95.83 25.6 8.5 19.2 1868
12368321 224 81.5 95.16 44.6 7.8 16.2 2125
12395321 288 81.48 95.16 25.0 7.0 23.8 1745
12396321 288 81.47 95.71 25.9 6.9 18.6 2071
12371321 224 81.45 95.53 68.9 11.4 14.4 1929
12398321 288 81.44 95.22 25.6 7.2 19.7 1908
12367321 256 81.44 95.67 25.6 5.6 15.4 2168
123100321 288 81.4 95.82 30.2 6.8 13.9 2132
123101321 288 81.37 95.74 25.6 7.2 19.7 1910
12373321 224 81.32 95.19 44.6 7.8 16.2 2125
123103321 288 81.3 95.65 28.1 6.8 18.4 1803
123104321 288 81.3 95.11 25.0 7.0 23.8 1746
12374321 224 81.27 95.62 27.6 4.3 14.4 2591
12377321 224 81.26 95.16 25.6 4.3 11.8 2823
123107321 288 81.23 95.54 15.7 4.8 19.6 2117
123108321 224 81.23 95.35 115.1 20.8 38.7 545
123109321 288 81.22 95.11 25.6 6.8 18.4 2089
123110321 288 81.22 95.63 25.6 6.8 18.4 676
123111321 288 81.18 95.09 25.6 7.2 19.7 1908
123112321 224 81.18 95.98 25.6 4.1 11.1 3455
123113321 224 81.17 95.34 25.0 4.3 14.4 2933
12378321 224 81.1 95.33 25.0 4.3 14.4 2934
123115321 288 81.1 95.23 28.1 6.8 18.4 1801
123116321 288 81.1 95.12 28.1 6.8 18.4 1799
123117321 224 81.02 95.41 60.3 12.9 25.0 1347
123118321 288 80.97 95.44 25.6 6.8 18.4 2085
12396321 256 80.94 95.45 25.9 5.4 14.7 2571
123120321 224 80.93 95.73 44.2 8.0 21.2 1814
123121321 288 80.91 95.55 25.6 6.8 18.4 2084
123122321 224 80.9 95.31 49.0 8.0 21.3 1585
123123321 224 80.9 95.3 88.2 15.5 31.2 918
123124321 288 80.86 95.52 25.6 6.8 18.4 2085
123125321 224 80.85 95.43 25.6 4.1 11.1 3450
12389321 224 80.84 95.02 25.6 4.3 11.8 2821
12379321 224 80.79 95.62 24.9 3.5 7.7 2961
123128321 288 80.79 95.36 19.8 6.0 14.8 2506
123129321 288 80.79 95.58 19.9 4.2 10.6 2349
123130321 288 80.78 94.99 25.6 6.8 18.4 2088
123131321 288 80.71 95.43 25.6 6.8 18.4 2087
123132321 288 80.7 95.39 25.0 7.0 23.8 1749
12369321 192 80.69 95.24 63.6 6.0 12.7 2270
12398321 224 80.68 94.71 25.6 4.4 11.9 3162
123135321 288 80.68 95.36 19.7 6.0 14.8 2637
123136321 224 80.67 95.3 25.6 4.1 11.1 3452
123137321 288 80.67 95.42 25.0 7.4 25.1 1626
12393321 224 80.63 95.21 25.6 5.2 11.6 3034
12390321 224 80.61 95.32 25.6 4.4 11.9 2813
123140321 224 80.61 94.99 83.5 15.5 31.2 989
123141321 288 80.6 95.31 19.9 6.0 14.8 2578
123107321 256 80.57 95.17 15.7 3.8 15.5 2710
123143321 224 80.56 95.0 60.2 11.6 22.6 1483
123101321 224 80.53 95.16 25.6 4.4 11.9 3164
12395321 224 80.53 94.46 25.0 4.3 14.4 2930
12364321 176 80.48 94.98 126.9 14.3 13.2 1719
123147321 224 80.47 95.2 60.2 11.8 23.4 1428
123148321 288 80.45 95.32 25.6 6.8 18.4 2086
123100321 224 80.45 95.24 30.2 4.1 8.4 3530
123104321 224 80.45 94.63 25.0 4.3 14.4 2936
12392321 176 80.43 95.09 68.9 7.3 9.0 3015
123152321 224 80.42 95.01 44.6 8.1 17.0 2007
123109321 224 80.38 94.6 25.6 4.1 11.1 3461
123128321 256 80.36 95.1 19.8 4.8 11.7 3267
123155321 224 80.34 94.93 44.2 8.0 21.2 1814
123156321 224 80.32 95.4 25.0 4.3 14.4 2941
123157321 224 80.28 95.16 44.7 9.2 18.6 1851
123103321 224 80.26 95.08 28.1 4.1 11.1 2972
123159321 288 80.24 95.24 25.6 8.5 19.9 1523
123111321 224 80.22 94.63 25.6 4.4 11.9 3162
12370321 176 80.2 94.64 60.2 7.2 14.0 2346
123115321 224 80.08 94.74 28.1 4.1 11.1 2969
123135321 256 80.08 94.97 19.7 4.8 11.7 3284
123141321 256 80.06 94.99 19.9 4.8 11.7 3216
123110321 224 80.06 94.95 25.6 4.1 11.1 1109
123116321 224 80.02 94.71 28.1 4.1 11.1 2962
123167321 288 79.97 95.05 25.6 6.8 18.4 2086
123168321 224 79.92 94.84 60.2 11.8 23.4 1455
123169321 224 79.91 94.82 27.6 4.3 14.4 2591
123118321 224 79.91 94.67 25.6 4.1 11.1 3456
12384321 176 79.9 94.6 44.6 4.9 10.1 3341
123172321 224 79.89 94.97 35.7 4.5 12.1 2774
123124321 224 79.88 94.87 25.6 4.1 11.1 3455
123174321 320 79.86 95.07 16.0 5.2 16.4 2168
123130321 224 79.85 94.56 25.6 4.1 11.1 3460
123176321 288 79.83 94.97 25.6 6.8 18.4 2087
123177321 224 79.82 94.62 44.6 7.8 16.2 2114
123132321 224 79.76 94.6 25.0 4.3 14.4 2943
123121321 224 79.74 94.95 25.6 4.1 11.1 3455
123129321 224 79.74 94.87 19.9 2.5 6.4 3929
123181321 288 79.71 94.83 19.7 6.0 14.8 2710
123182321 224 79.68 94.74 60.2 11.6 22.6 1486
123137321 224 79.67 94.87 25.0 4.5 15.2 2729
123184321 288 79.63 94.91 25.6 6.8 18.4 2086
123185321 224 79.56 94.72 25.6 4.3 11.8 2805
123186321 224 79.53 94.58 44.6 8.1 17.0 2062
123131321 224 79.52 94.61 25.6 4.1 11.1 3459
123125321 176 79.42 94.64 25.6 2.6 6.9 5397
123189321 288 79.4 94.66 18.0 5.9 14.6 2752
123148321 224 79.38 94.57 25.6 4.1 11.1 3459
123113321 176 79.37 94.3 25.0 2.7 9.0 4577
123192321 224 79.36 94.43 25.0 4.3 14.4 2942
123193321 224 79.31 94.52 88.8 16.5 31.2 1100
123194321 224 79.31 94.53 44.6 7.8 16.2 2125
123159321 224 79.31 94.63 25.6 5.2 12.0 2524
123136321 176 79.27 94.49 25.6 2.6 6.9 5404
123197321 224 79.25 94.31 25.0 4.3 14.4 2931
123198321 224 79.22 94.84 25.6 4.1 11.1 3451
123181321 256 79.21 94.56 19.7 4.8 11.7 3392
123200321 224 79.07 94.48 25.6 4.4 11.9 3162
123167321 224 79.03 94.38 25.6 4.1 11.1 3453
123202321 224 79.01 94.39 25.6 4.1 11.1 3461
123189321 256 79.01 94.37 18.0 4.6 11.6 3440
123174321 256 78.9 94.54 16.0 3.4 10.5 3421
123143321 160 78.89 94.11 60.2 5.9 11.5 2745
123206321 224 78.84 94.28 126.9 22.8 21.2 1079
123207321 288 78.83 94.24 16.8 4.5 16.8 2251
123176321 224 78.81 94.32 25.6 4.1 11.1 3454
123209321 288 78.74 94.33 16.8 4.5 16.7 2264
123210321 224 78.72 94.23 25.7 5.5 13.5 2796
123211321 224 78.71 94.24 25.6 4.4 11.9 3154
123212321 224 78.47 94.09 68.9 11.4 14.4 1934
123184321 224 78.46 94.27 25.6 4.1 11.1 3454
123214321 288 78.43 94.35 21.8 6.5 7.5 3291
123215321 288 78.42 94.04 10.5 3.1 13.3 3226
123216321 320 78.33 94.13 16.0 5.2 16.4 2391
123217321 224 78.32 94.04 60.2 11.6 22.6 1487
123218321 288 78.28 94.1 10.4 3.1 13.3 3062
123219321 256 78.25 94.1 10.7 2.5 12.5 3393
123220321 224 78.06 93.78 25.6 4.1 11.1 3450
123221321 224 78.0 93.99 25.6 4.4 11.9 3286
123222321 288 78.0 93.91 10.3 3.1 13.3 3297
123209321 224 77.98 93.75 16.8 2.7 10.1 3841
123224321 288 77.92 93.77 21.8 6.1 6.2 3609
123177321 160 77.88 93.71 44.6 4.0 8.3 3926
123216321 256 77.87 93.84 16.0 3.4 10.5 3772
123218321 256 77.86 93.79 10.4 2.4 10.5 4263
123172321 160 77.82 93.81 35.7 2.3 6.2 5238
123215321 256 77.81 93.82 10.5 2.4 10.5 4183
123185321 160 77.79 93.6 25.6 2.2 6.0 5329
123197321 160 77.73 93.32 25.0 2.2 7.4 5576
123232321 224 77.61 93.7 25.0 4.3 14.4 2944
123207321 224 77.59 93.61 16.8 2.7 10.2 3807
123234321 224 77.58 93.72 25.6 4.1 11.1 3455
123222321 256 77.44 93.56 10.3 2.4 10.5 4284
123236321 288 77.41 93.63 16.0 4.3 13.5 2907
123237321 224 77.38 93.54 44.6 7.8 16.2 2125
123211321 160 77.22 93.27 25.6 2.2 6.1 5982
123239321 288 77.17 93.47 10.3 3.1 13.3 3392
123240321 288 77.15 93.27 21.8 6.1 6.2 3615
123214321 224 77.1 93.37 21.8 3.9 4.5 5436
123242321 224 77.02 93.07 28.1 4.1 11.1 2952
123239321 256 76.78 93.13 10.3 2.4 10.5 4410
123236321 224 76.7 93.17 16.0 2.6 8.2 4859
123245321 288 76.5 93.35 21.8 6.1 6.2 3617
123224321 224 76.42 92.87 21.8 3.7 3.7 5984
123247321 288 76.35 93.18 16.0 3.9 12.2 3331
123248321 224 76.13 92.86 25.6 4.1 11.1 3457
123220321 160 75.96 92.5 25.6 2.1 5.7 6490
123240321 224 75.52 92.44 21.8 3.7 3.7 5991
123247321 224 75.3 92.58 16.0 2.4 7.4 5583
123245321 224 75.16 92.18 21.8 3.7 3.7 5994
123242321 160 75.1 92.08 28.1 2.1 5.7 5513
123254321 224 74.57 91.98 21.8 3.7 3.7 5984
123255321 288 73.81 91.83 11.7 3.4 5.4 5196
123256321 224 73.32 91.42 21.8 3.7 3.7 5979
123257321 224 73.28 91.73 11.7 1.8 2.5 10213
123258321 288 73.16 91.03 11.7 3.0 4.1 6050
123259321 224 72.98 91.11 21.8 3.7 3.7 5967
123260321 224 72.6 91.42 11.7 1.8 2.5 10213
123261321 288 72.37 90.59 11.7 3.0 4.1 6051
123262321 224 72.26 90.31 10.1 1.7 5.8 7026
123255321 224 72.26 90.68 11.7 2.1 3.3 8707
123258321 224 71.49 90.07 11.7 1.8 2.5 10187
123262321 176 71.31 89.69 10.1 1.1 3.6 10970
123266321 224 70.84 89.76 11.7 1.8 2.5 10210
123261321 224 70.64 89.47 11.7 1.8 2.5 10194
123259321 160 70.56 89.52 21.8 1.9 1.9 10737
123269321 224 69.76 89.07 11.7 1.8 2.5 10205
123270321 224 68.34 88.03 5.4 1.1 2.4 13079
123271321 224 68.25 88.17 11.7 1.8 2.5 10167
123270321 176 66.71 86.96 5.4 0.7 1.5 20327
123271321 160 65.66 86.26 11.7 0.9 1.3 18229

Citation

@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
@article{Xie2016,
  title={Aggregated Residual Transformations for Deep Neural Networks},
  author={Saining Xie and Ross Girshick and Piotr Dollár and Zhuowen Tu and Kaiming He},
  journal={arXiv preprint arXiv:1611.05431},
  year={2016}
}
@inproceedings{zhang2019shiftinvar,
  title={Making Convolutional Networks Shift-Invariant Again},
  author={Zhang, Richard},
  booktitle={ICML},
  year={2019}
}
@article{He2015,
  author = {Kaiming He and Xiangyu Zhang and Shaoqing Ren and Jian Sun},
  title = {Deep Residual Learning for Image Recognition},
  journal = {arXiv preprint arXiv:1512.03385},
  year = {2015}
}
@inproceedings{hu2018senet,
  title={Squeeze-and-Excitation Networks},
  author={Jie Hu and Li Shen and Gang Sun},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}
@article{He2018BagOT,
  title={Bag of Tricks for Image Classification with Convolutional Neural Networks},
  author={Tong He and Zhi Zhang and Hang Zhang and Zhongyue Zhang and Junyuan Xie and Mu Li},
  journal={2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018},
  pages={558-567}
}