TorchSeg: Semantic Segmentation models for PyTorch

TorchSeg

TorchSeg is an actively maintained and up-to-date fork of the Segmentation Models PyTorch (smp) library.

Features

The main features of this library are:

  • High-level API (just two lines to create a neural network)
  • 9 model architectures for binary and multi-class segmentation (including the legendary Unet)
  • 124 available encoders (and 500+ encoders from timm)
  • All encoders have pre-trained weights for faster and better convergence
  • Popular losses for training routines

Example Usage

A segmentation model is just a PyTorch nn.Module, which can be created as easily as:

import torchseg

model = torchseg.Unet(
    encoder_name="resnet34",        # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
    encoder_weights="imagenet",     # use `imagenet` pre-trained weights for encoder initialization
    in_channels=1,                  # model input channels (1 for gray-scale images, 3 for RGB, etc.)
    classes=3,                      # model output channels (number of classes in your dataset)
)
  • see table with available model architectures
  • see table with available encoders and their corresponding weights

Models

Architectures

Encoders

The following is a list of supported encoders in TorchSeg. Find the appropriate encoder family below, then pick a specific encoder and its pre-trained weights (the encoder_name and encoder_weights parameters).

ResNet
Encoder Weights Params, M
resnet18 imagenet / ssl / swsl 11M
resnet34 imagenet 21M
resnet50 imagenet / ssl / swsl 23M
resnet101 imagenet 42M
resnet152 imagenet 58M
ResNeXt
Encoder Weights Params, M
resnext50_32x4d imagenet / ssl / swsl 22M
resnext101_32x4d ssl / swsl 42M
resnext101_32x8d imagenet / instagram / ssl / swsl 86M
resnext101_32x16d instagram / ssl / swsl 191M
resnext101_32x32d instagram 466M
resnext101_32x48d instagram 826M
ResNeSt
Encoder Weights Params, M
timm-resnest14d imagenet 8M
timm-resnest26d imagenet 15M
timm-resnest50d imagenet 25M
timm-resnest101e imagenet 46M
timm-resnest200e imagenet 68M
timm-resnest269e imagenet 108M
timm-resnest50d_4s2x40d imagenet 28M
timm-resnest50d_1s4x24d imagenet 23M
Res2Ne(X)t
Encoder Weights Params, M
timm-res2net50_26w_4s imagenet 23M
timm-res2net101_26w_4s imagenet 43M
timm-res2net50_26w_6s imagenet 35M
timm-res2net50_26w_8s imagenet 46M
timm-res2net50_48w_2s imagenet 23M
timm-res2net50_14w_8s imagenet 23M
timm-res2next50 imagenet 22M
RegNet(x/y)
Encoder Weights Params, M
timm-regnetx_002 imagenet 2M
timm-regnetx_004 imagenet 4M
timm-regnetx_006 imagenet 5M
timm-regnetx_008 imagenet 6M
timm-regnetx_016 imagenet 8M
timm-regnetx_032 imagenet 14M
timm-regnetx_040 imagenet 20M
timm-regnetx_064 imagenet 24M
timm-regnetx_080 imagenet 37M
timm-regnetx_120 imagenet 43M
timm-regnetx_160 imagenet 52M
timm-regnetx_320 imagenet 105M
timm-regnety_002 imagenet 2M
timm-regnety_004 imagenet 3M
timm-regnety_006 imagenet 5M
timm-regnety_008 imagenet 5M
timm-regnety_016 imagenet 10M
timm-regnety_032 imagenet 17M
timm-regnety_040 imagenet 19M
timm-regnety_064 imagenet 29M
timm-regnety_080 imagenet 37M
timm-regnety_120 imagenet 49M
timm-regnety_160 imagenet 80M
timm-regnety_320 imagenet 141M
GERNet
Encoder Weights Params, M
timm-gernet_s imagenet 6M
timm-gernet_m imagenet 18M
timm-gernet_l imagenet 28M
SE-Net
Encoder Weights Params, M
senet154 imagenet 113M
se_resnet50 imagenet 26M
se_resnet101 imagenet 47M
se_resnet152 imagenet 64M
se_resnext50_32x4d imagenet 25M
se_resnext101_32x4d imagenet 46M
SK-ResNe(X)t
Encoder Weights Params, M
timm-skresnet18 imagenet 11M
timm-skresnet34 imagenet 21M
timm-skresnext50_32x4d imagenet 25M
DenseNet
Encoder Weights Params, M
densenet121 imagenet 6M
densenet169 imagenet 12M
densenet201 imagenet 18M
densenet161 imagenet 26M
Inception
Encoder Weights Params, M
inceptionresnetv2 imagenet / imagenet+background 54M
inceptionv4 imagenet / imagenet+background 41M
xception imagenet 22M
EfficientNet
Encoder Weights Params, M
efficientnet-b0 imagenet 4M
efficientnet-b1 imagenet 6M
efficientnet-b2 imagenet 7M
efficientnet-b3 imagenet 10M
efficientnet-b4 imagenet 17M
efficientnet-b5 imagenet 28M
efficientnet-b6 imagenet 40M
efficientnet-b7 imagenet 63M
timm-efficientnet-b0 imagenet / advprop / noisy-student 4M
timm-efficientnet-b1 imagenet / advprop / noisy-student 6M
timm-efficientnet-b2 imagenet / advprop / noisy-student 7M
timm-efficientnet-b3 imagenet / advprop / noisy-student 10M
timm-efficientnet-b4 imagenet / advprop / noisy-student 17M
timm-efficientnet-b5 imagenet / advprop / noisy-student 28M
timm-efficientnet-b6 imagenet / advprop / noisy-student 40M
timm-efficientnet-b7 imagenet / advprop / noisy-student 63M
timm-efficientnet-b8 imagenet / advprop 84M
timm-efficientnet-l2 noisy-student 474M
timm-efficientnet-lite0 imagenet 4M
timm-efficientnet-lite1 imagenet 5M
timm-efficientnet-lite2 imagenet 6M
timm-efficientnet-lite3 imagenet 8M
timm-efficientnet-lite4 imagenet 13M
MobileNet
Encoder Weights Params, M
mobilenet_v2 imagenet 2M
timm-mobilenetv3_large_075 imagenet 1.78M
timm-mobilenetv3_large_100 imagenet 2.97M
timm-mobilenetv3_large_minimal_100 imagenet 1.41M
timm-mobilenetv3_small_075 imagenet 0.57M
timm-mobilenetv3_small_100 imagenet 0.93M
timm-mobilenetv3_small_minimal_100 imagenet 0.43M
DPN
Encoder Weights Params, M
dpn68 imagenet 11M
dpn68b imagenet+5k 11M
dpn92 imagenet+5k 34M
dpn98 imagenet 58M
dpn107 imagenet+5k 84M
dpn131 imagenet 76M
VGG
Encoder Weights Params, M
vgg11 imagenet 9M
vgg11_bn imagenet 9M
vgg13 imagenet 9M
vgg13_bn imagenet 9M
vgg16 imagenet 14M
vgg16_bn imagenet 14M
vgg19 imagenet 20M
vgg19_bn imagenet 20M
Mix Vision Transformer

A backbone from SegFormer, pretrained on ImageNet. It can be combined with other decoders from the package, such as Unet and FPN, subject to the limitations below.

Limitations:

  • encoder is not supported by Linknet, Unet++
  • encoder is supported by FPN only for encoder depth = 5
Encoder Weights Params, M
mit_b0 imagenet 3M
mit_b1 imagenet 13M
mit_b2 imagenet 24M
mit_b3 imagenet 44M
mit_b4 imagenet 60M
mit_b5 imagenet 81M
MobileOne

Apple's "sub-one-ms" backbone, pretrained on ImageNet. It can be used with all decoders.

Note: in the official GitHub repo, the s0 variant has additional num_conv_branches, which gives it more parameters than s1.

Encoder Weights Params, M
mobileone_s0 imagenet 4.6M
mobileone_s1 imagenet 4.0M
mobileone_s2 imagenet 6.5M
mobileone_s3 imagenet 8.8M
mobileone_s4 imagenet 13.6M

* ssl, swsl - semi-supervised and weakly-supervised learning on ImageNet (repo).

Timm Encoders

PyTorch Image Models (a.k.a. timm) provides many pretrained models and an interface that allows them to be used as encoders in torchseg. However, not all models are supported:

  • not all transformer models have features_only functionality implemented that is required for encoder
  • some models have inappropriate strides

Total number of supported encoders: 549

Models API

  • model.encoder - pretrained backbone that extracts features at different spatial resolutions
  • model.decoder - depends on the model architecture (Unet/Linknet/PSPNet/FPN)
  • model.segmentation_head - last block, which produces the required number of mask channels (also includes optional upsampling and activation)
  • model.classification_head - optional block that creates a classification head on top of the encoder
  • model.forward(x) - sequentially passes x through the model's encoder, decoder and segmentation head (and classification head if specified)
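
The composition described in the list above can be sketched in plain Python. This is only an illustration of the call order, not the library's class: `SegmentationModelSketch` and the toy callables are hypothetical names, strings stand in for tensors, and feeding the classification head the deepest encoder feature is an assumption.

```python
class SegmentationModelSketch:
    """Hypothetical stand-in for the encoder/decoder/head composition."""

    def __init__(self, encoder, decoder, segmentation_head, classification_head=None):
        self.encoder = encoder
        self.decoder = decoder
        self.segmentation_head = segmentation_head
        self.classification_head = classification_head

    def forward(self, x):
        # The encoder returns a list of feature maps, shallowest first.
        features = self.encoder(x)
        # The decoder fuses the feature maps into a single map.
        decoder_output = self.decoder(features)
        mask = self.segmentation_head(decoder_output)
        if self.classification_head is None:
            return mask
        # Assumption: the classification head reads the deepest encoder feature.
        label = self.classification_head(features[-1])
        return mask, label


# Toy callables that only record the order in which they are called.
calls = []


def encoder(x):
    calls.append("encoder")
    return [x, x + "/2", x + "/4"]


def decoder(features):
    calls.append("decoder")
    return "decoded(" + features[-1] + ")"


def segmentation_head(decoded):
    calls.append("segmentation_head")
    return "mask(" + decoded + ")"


def classification_head(feature):
    calls.append("classification_head")
    return "label(" + feature + ")"


model = SegmentationModelSketch(encoder, decoder, segmentation_head, classification_head)
mask, label = model.forward("x")
```

Without a classification head, `forward` returns only the mask, which matches the single-output usage shown elsewhere in this document.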
Input channels

The in_channels parameter lets you create models that process tensors with an arbitrary number of channels. If you use weights pretrained on ImageNet, the weights of the first convolution are reused. For the 1-channel case, they are summed over the input channels of the first convolution layer; otherwise, the channels are populated as new_weight[:, i] = pretrained_weight[:, i % 3] and then scaled with new_weight * 3 / new_in_channels.

import torch
import torchseg

model = torchseg.FPN('resnet34', in_channels=1)
mask = model(torch.ones([1, 1, 64, 64]))
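
The weight-reuse rule for the first convolution can be illustrated without PyTorch. The sketch below is a simplification, not the library's code: `adapt_input_weights` is a hypothetical helper, and each pretrained input channel is represented by a single number where a real convolution would have a kernel slice `pretrained_weight[:, i]`.

```python
def adapt_input_weights(pretrained, new_in_channels):
    """Map per-channel pretrained weights onto `new_in_channels` inputs.

    `pretrained` holds one number per pretrained input channel (3 for RGB).
    """
    n_pretrained = len(pretrained)
    if new_in_channels == 1:
        # Single-channel input: sum the pretrained channels.
        return [sum(pretrained)]
    # Rescale so the overall response magnitude stays comparable.
    scale = n_pretrained / new_in_channels
    # Cycle through the pretrained channels (i % 3 for RGB weights).
    return [pretrained[i % n_pretrained] * scale for i in range(new_in_channels)]


rgb = [1.0, 2.0, 4.0]
gray = adapt_input_weights(rgb, 1)  # [7.0]
six = adapt_input_weights(rgb, 6)   # [0.5, 1.0, 2.0, 0.5, 1.0, 2.0]
```

With six input channels the three pretrained channels are repeated twice and each copy is halved, so the summed weight over all inputs is unchanged.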
Auxiliary classification output

All models support the aux_params parameter, which defaults to None. If aux_params is None, no auxiliary classification output is created; otherwise the model produces not only a mask but also a label output with shape (N, C). The classification head consists of GlobalPooling->Dropout(optional)->Linear->Activation(optional) layers, which can be configured through aux_params as follows:

import torch
import torchseg

aux_params = dict(
    pooling='avg',             # one of 'avg', 'max'
    dropout=0.5,               # dropout ratio, default is None
    activation='sigmoid',      # activation function, default is None
    classes=4,                 # define number of output labels
)
model = torchseg.Unet('resnet34', classes=4, aux_params=aux_params)
x = torch.ones([1, 3, 64, 64])  # example input batch with 3 channels
mask, label = model(x)
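
To make the GlobalPooling->Linear->Activation pipeline and the (N, C) label shape concrete, here is a rough sketch that uses nested Python lists in place of tensors. It is not the library's implementation: `classification_head` and its arguments are hypothetical, the weights are made up for the example, and dropout is left out because it acts as the identity at inference time.

```python
import math


def classification_head(features, weight, bias, pooling="avg", activation=None):
    """features: [N][C_in][H][W], weight: [classes][C_in], bias: [classes]."""
    labels = []
    for sample in features:
        # Global pooling: collapse each H x W map to a single number.
        pooled = []
        for channel in sample:
            values = [v for row in channel for v in row]
            pooled.append(max(values) if pooling == "max" else sum(values) / len(values))
        # Linear layer: one logit per output class.
        logits = [sum(w * p for w, p in zip(row, pooled)) + b
                  for row, b in zip(weight, bias)]
        # Optional activation.
        if activation == "sigmoid":
            logits = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        labels.append(logits)
    return labels


# Two samples (N=2), two encoder channels, 2x2 feature maps, four classes (C=4).
features = [
    [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]],
    [[[0.0, 0.0], [0.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]],
]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
bias = [0.0, 0.0, 0.0, 0.0]

logits = classification_head(features, weight, bias)
probs = classification_head(features, weight, bias, activation="sigmoid")
```

The result has one row per sample and one entry per class, which is the (N, C) shape of the label output.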
Depth

The encoder_depth parameter specifies the number of downsampling operations in the encoder, so you can make your model lighter by specifying a smaller depth.

model = torchseg.Unet('resnet34', encoder_depth=4)
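
As a rough illustration of what a smaller depth means for feature-map sizes, the helper below lists the spatial size after each downsampling step. It assumes every step halves height and width (stride 2), which the text above does not state explicitly; `feature_map_sizes` and its divisibility check are hypothetical and not part of torchseg.

```python
def feature_map_sizes(height, width, encoder_depth):
    """Spatial size of the input and of each of `encoder_depth` encoder stages.

    Assumption: each downsampling step halves both height and width.
    """
    factor = 2 ** encoder_depth
    if height % factor or width % factor:
        # Sizes that do not divide evenly would be rounded along the way.
        raise ValueError(f"input size must be divisible by {factor}")
    return [(height // 2 ** i, width // 2 ** i) for i in range(encoder_depth + 1)]


sizes_depth_4 = feature_map_sizes(64, 64, 4)
sizes_depth_5 = feature_map_sizes(64, 64, 5)
```

Under this assumption, a 64x64 input reaches 4x4 at depth 4 and 2x2 at depth 5, so reducing the depth removes the deepest and typically widest encoder stage.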
