TorchSeg: Semantic Segmentation models for PyTorch
Project description
TorchSeg
TorchSeg is an actively maintained and up-to-date fork of the Segmentation Models PyTorch (torchseg).
Features
The main features of this library are:
- High level API (just two lines to create a neural network)
- 9 models architectures for binary and multi class segmentation (including legendary Unet)
- 124 available encoders (and 500+ encoders from timm)
- All encoders have pre-trained weights for faster and better convergence
- Popular losses for training routines
Example Usage
Segmentation model is just a PyTorch nn.Module, which can be created as easy as:
import torchseg
model = torchseg.Unet(
encoder_name="resnet34", # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
encoder_weights="imagenet", # use `imagenet` pre-trained weights for encoder initialization
in_channels=1, # model input channels (1 for gray-scale images, 3 for RGB, etc.)
classes=3, # model output channels (number of classes in your dataset)
)
- see table with available model architectures
- see table with available encoders and their corresponding weights
Models
Architectures
- Unet [paper]
- Unet++ [paper]
- MAnet [paper]
- Linknet [paper]
- FPN [paper]
- PSPNet [paper]
- PAN [paper]
- DeepLabV3 [paper]
- DeepLabV3+ [paper]
Encoders
The following is a list of supported encoders in TorchSeg. Select the appropriate family of encoders and click to expand the table and select a specific encoder and its pre-trained weights (encoder_name
and encoder_weights
parameters).
ResNet
Encoder | Weights | Params, M |
---|---|---|
resnet18 | imagenet / ssl / swsl | 11M |
resnet34 | imagenet | 21M |
resnet50 | imagenet / ssl / swsl | 23M |
resnet101 | imagenet | 42M |
resnet152 | imagenet | 58M |
ResNeXt
Encoder | Weights | Params, M |
---|---|---|
resnext50_32x4d | imagenet / ssl / swsl | 22M |
resnext101_32x4d | ssl / swsl | 42M |
resnext101_32x8d | imagenet / instagram / ssl / swsl | 86M |
resnext101_32x16d | instagram / ssl / swsl | 191M |
resnext101_32x32d | 466M | |
resnext101_32x48d | 826M |
ResNeSt
Encoder | Weights | Params, M |
---|---|---|
timm-resnest14d | imagenet | 8M |
timm-resnest26d | imagenet | 15M |
timm-resnest50d | imagenet | 25M |
timm-resnest101e | imagenet | 46M |
timm-resnest200e | imagenet | 68M |
timm-resnest269e | imagenet | 108M |
timm-resnest50d_4s2x40d | imagenet | 28M |
timm-resnest50d_1s4x24d | imagenet | 23M |
Res2Ne(X)t
Encoder | Weights | Params, M |
---|---|---|
timm-res2net50_26w_4s | imagenet | 23M |
timm-res2net101_26w_4s | imagenet | 43M |
timm-res2net50_26w_6s | imagenet | 35M |
timm-res2net50_26w_8s | imagenet | 46M |
timm-res2net50_48w_2s | imagenet | 23M |
timm-res2net50_14w_8s | imagenet | 23M |
timm-res2next50 | imagenet | 22M |
RegNet(x/y)
Encoder | Weights | Params, M |
---|---|---|
timm-regnetx_002 | imagenet | 2M |
timm-regnetx_004 | imagenet | 4M |
timm-regnetx_006 | imagenet | 5M |
timm-regnetx_008 | imagenet | 6M |
timm-regnetx_016 | imagenet | 8M |
timm-regnetx_032 | imagenet | 14M |
timm-regnetx_040 | imagenet | 20M |
timm-regnetx_064 | imagenet | 24M |
timm-regnetx_080 | imagenet | 37M |
timm-regnetx_120 | imagenet | 43M |
timm-regnetx_160 | imagenet | 52M |
timm-regnetx_320 | imagenet | 105M |
timm-regnety_002 | imagenet | 2M |
timm-regnety_004 | imagenet | 3M |
timm-regnety_006 | imagenet | 5M |
timm-regnety_008 | imagenet | 5M |
timm-regnety_016 | imagenet | 10M |
timm-regnety_032 | imagenet | 17M |
timm-regnety_040 | imagenet | 19M |
timm-regnety_064 | imagenet | 29M |
timm-regnety_080 | imagenet | 37M |
timm-regnety_120 | imagenet | 49M |
timm-regnety_160 | imagenet | 80M |
timm-regnety_320 | imagenet | 141M |
GERNet
Encoder | Weights | Params, M |
---|---|---|
timm-gernet_s | imagenet | 6M |
timm-gernet_m | imagenet | 18M |
timm-gernet_l | imagenet | 28M |
SE-Net
Encoder | Weights | Params, M |
---|---|---|
senet154 | imagenet | 113M |
se_resnet50 | imagenet | 26M |
se_resnet101 | imagenet | 47M |
se_resnet152 | imagenet | 64M |
se_resnext50_32x4d | imagenet | 25M |
se_resnext101_32x4d | imagenet | 46M |
SK-ResNe(X)t
Encoder | Weights | Params, M |
---|---|---|
timm-skresnet18 | imagenet | 11M |
timm-skresnet34 | imagenet | 21M |
timm-skresnext50_32x4d | imagenet | 25M |
DenseNet
Encoder | Weights | Params, M |
---|---|---|
densenet121 | imagenet | 6M |
densenet169 | imagenet | 12M |
densenet201 | imagenet | 18M |
densenet161 | imagenet | 26M |
Inception
Encoder | Weights | Params, M |
---|---|---|
inceptionresnetv2 | imagenet / imagenet+background | 54M |
inceptionv4 | imagenet / imagenet+background | 41M |
xception | imagenet | 22M |
EfficientNet
Encoder | Weights | Params, M |
---|---|---|
efficientnet-b0 | imagenet | 4M |
efficientnet-b1 | imagenet | 6M |
efficientnet-b2 | imagenet | 7M |
efficientnet-b3 | imagenet | 10M |
efficientnet-b4 | imagenet | 17M |
efficientnet-b5 | imagenet | 28M |
efficientnet-b6 | imagenet | 40M |
efficientnet-b7 | imagenet | 63M |
timm-efficientnet-b0 | imagenet / advprop / noisy-student | 4M |
timm-efficientnet-b1 | imagenet / advprop / noisy-student | 6M |
timm-efficientnet-b2 | imagenet / advprop / noisy-student | 7M |
timm-efficientnet-b3 | imagenet / advprop / noisy-student | 10M |
timm-efficientnet-b4 | imagenet / advprop / noisy-student | 17M |
timm-efficientnet-b5 | imagenet / advprop / noisy-student | 28M |
timm-efficientnet-b6 | imagenet / advprop / noisy-student | 40M |
timm-efficientnet-b7 | imagenet / advprop / noisy-student | 63M |
timm-efficientnet-b8 | imagenet / advprop | 84M |
timm-efficientnet-l2 | noisy-student | 474M |
timm-efficientnet-lite0 | imagenet | 4M |
timm-efficientnet-lite1 | imagenet | 5M |
timm-efficientnet-lite2 | imagenet | 6M |
timm-efficientnet-lite3 | imagenet | 8M |
timm-efficientnet-lite4 | imagenet | 13M |
MobileNet
Encoder | Weights | Params, M |
---|---|---|
mobilenet_v2 | imagenet | 2M |
timm-mobilenetv3_large_075 | imagenet | 1.78M |
timm-mobilenetv3_large_100 | imagenet | 2.97M |
timm-mobilenetv3_large_minimal_100 | imagenet | 1.41M |
timm-mobilenetv3_small_075 | imagenet | 0.57M |
timm-mobilenetv3_small_100 | imagenet | 0.93M |
timm-mobilenetv3_small_minimal_100 | imagenet | 0.43M |
DPN
Encoder | Weights | Params, M |
---|---|---|
dpn68 | imagenet | 11M |
dpn68b | imagenet+5k | 11M |
dpn92 | imagenet+5k | 34M |
dpn98 | imagenet | 58M |
dpn107 | imagenet+5k | 84M |
dpn131 | imagenet | 76M |
VGG
Encoder | Weights | Params, M |
---|---|---|
vgg11 | imagenet | 9M |
vgg11_bn | imagenet | 9M |
vgg13 | imagenet | 9M |
vgg13_bn | imagenet | 9M |
vgg16 | imagenet | 14M |
vgg16_bn | imagenet | 14M |
vgg19 | imagenet | 20M |
vgg19_bn | imagenet | 20M |
Mix Vision Transformer
Backbone from SegFormer pretrained on Imagenet! Can be used with other decoders from package, you can combine Mix Vision Transformer with Unet, FPN and others!
Limitations:
- encoder is not supported by Linknet, Unet++
- encoder is supported by FPN only for encoder depth = 5
Encoder | Weights | Params, M |
---|---|---|
mit_b0 | imagenet | 3M |
mit_b1 | imagenet | 13M |
mit_b2 | imagenet | 24M |
mit_b3 | imagenet | 44M |
mit_b4 | imagenet | 60M |
mit_b5 | imagenet | 81M |
MobileOne
Apple's "sub-one-ms" Backbone pretrained on Imagenet! Can be used with all decoders.
Note: In the official github repo the s0 variant has additional num_conv_branches, leading to more params than s1.
Encoder | Weights | Params, M |
---|---|---|
mobileone_s0 | imagenet | 4.6M |
mobileone_s1 | imagenet | 4.0M |
mobileone_s2 | imagenet | 6.5M |
mobileone_s3 | imagenet | 8.8M |
mobileone_s4 | imagenet | 13.6M |
* ssl
, swsl
- semi-supervised and weakly-supervised learning on ImageNet (repo).
Timm Encoders
Pytorch Image Models (a.k.a. timm) has a lot of pretrained models and interface which allows using these models as encoders in torchseg, however, not all models are supported
- not all transformer models have
features_only
functionality implemented that is required for encoder - some models have inappropriate strides
Total number of supported encoders: 549
Models API
model.encoder
- pretrained backbone to extract features of different spatial resolutionmodel.decoder
- depends on models architecture (Unet
/Linknet
/PSPNet
/FPN
)model.segmentation_head
- last block to produce required number of mask channels (include also optional upsampling and activation)model.classification_head
- optional block which create classification head on top of encodermodel.forward(x)
- sequentially passx
through model`s encoder, decoder and segmentation head (and classification head if specified)
Input channels
Input channels parameter allows you to create models, which process tensors with arbitrary number of channels.
If you use pretrained weights from imagenet - weights of first convolution will be reused. For
1-channel case it would be a sum of weights of first convolution layer, otherwise channels would be
populated with weights like new_weight[:, i] = pretrained_weight[:, i % 3]
and than scaled with new_weight * 3 / new_in_channels
.
model = torchseg.FPN('resnet34', in_channels=1)
mask = model(torch.ones([1, 1, 64, 64]))
Auxiliary classification output
All models support aux_params
parameters, which is default set to None
.
If aux_params = None
then classification auxiliary output is not created, else
model produce not only mask
, but also label
output with shape NC
.
Classification head consists of GlobalPooling->Dropout(optional)->Linear->Activation(optional) layers, which can be
configured by aux_params
as follows:
aux_params=dict(
pooling='avg', # one of 'avg', 'max'
dropout=0.5, # dropout ratio, default is None
activation='sigmoid', # activation function, default is None
classes=4, # define number of output labels
)
model = torchseg.Unet('resnet34', classes=4, aux_params=aux_params)
mask, label = model(x)
Depth
Depth parameter specify a number of downsampling operations in encoder, so you can make
your model lighter if specify smaller depth
.
model = torchseg.Unet('resnet34', encoder_depth=4)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file torchseg-0.0.1a1.tar.gz
.
File metadata
- Download URL: torchseg-0.0.1a1.tar.gz
- Upload date:
- Size: 57.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | c8b5ed11332b71128054d552b38784a6513b9241e97229c2fd92172530d793ac |
|
MD5 | e163cdc430ed009c376594b4bfa7fe21 |
|
BLAKE2b-256 | 337e44e0639447f81e66f973ae6ab3e1717d45ff11ca6a8665b24fd1577a0219 |
File details
Details for the file torchseg-0.0.1a1-py3-none-any.whl
.
File metadata
- Download URL: torchseg-0.0.1a1-py3-none-any.whl
- Upload date:
- Size: 84.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 29d637c4ea7b82bf22b6728ed88737e8fd0b5ad49a0d3d096c075a521026dca6 |
|
MD5 | bdfa469d34b4d48e3b24474df89c5f03 |
|
BLAKE2b-256 | 131a832e5fb8d1e9f5ff1e0086725b05417e49b8a379259a43160840d85b9c7d |