Backbone
A backbone is a model used to extract features for higher-level computer vision tasks such as object detection and image classification. Transformers provides an AutoBackbone class for initializing a Transformers backbone from pretrained model weights, and two utility classes:
- BackboneMixin enables initializing a backbone from Transformers or timm and includes functions for returning the output features and indices.
- BackboneConfigMixin sets the output features and indices of the backbone configuration.
timm models are loaded with the TimmBackbone and TimmBackboneConfig classes.
Backbones are supported for the following models:
- BEiT
- BiT
- ConvNext
- ConvNextV2
- DiNAT
- DINOV2
- FocalNet
- MaskFormer
- NAT
- ResNet
- Swin Transformer
- Swin Transformer v2
- ViTDet
AutoBackbone
BackboneMixin
BackboneConfigMixin
A Mixin to support handling the out_features and out_indices attributes for the backbone configurations.
set_output_features_output_indices
( out_features: list | None, out_indices: list | None )
Sets output indices and features to new values and aligns them with the given stage_names.
If one of the inputs is not given, the corresponding out_features or out_indices is derived
from the given stage_names.
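The alignment described above can be illustrated with a small, self-contained sketch. It does not call the library: align_outputs is a hypothetical helper written for this illustration, the stage names are made up, and the fallback to the last stage when neither argument is given is an assumption based on the out_indices description for TimmBackboneConfig on this page.

```python
from typing import Optional


def align_outputs(
    out_features: Optional[list[str]],
    out_indices: Optional[list[int]],
    stage_names: list[str],
) -> tuple[list[str], list[int]]:
    """Hypothetical helper: fill in whichever of out_features / out_indices is missing."""
    if out_features is None and out_indices is None:
        # Assumption: with nothing specified, fall back to the last stage.
        return [stage_names[-1]], [len(stage_names) - 1]
    if out_features is None:
        # Only indices given: look up the matching stage names.
        return [stage_names[i] for i in out_indices], list(out_indices)
    if out_indices is None:
        # Only names given: look up their positions in stage_names.
        return list(out_features), [stage_names.index(name) for name in out_features]
    # Both given: return them unchanged (validation is a separate step).
    return list(out_features), list(out_indices)


# Illustrative stage names, not taken from any specific model.
stage_names = ["stem", "stage1", "stage2", "stage3", "stage4"]

features_from_indices = align_outputs(None, [1, 3], stage_names)
indices_from_features = align_outputs(["stage2", "stage4"], None, stage_names)
default_outputs = align_outputs(None, None, stage_names)
```

Here features_from_indices pairs indices 1 and 3 with "stage1" and "stage3", and indices_from_features pairs "stage2" and "stage4" with indices 2 and 4.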
Serializes this instance to a Python dictionary. Overrides the default to_dict() from PreTrainedConfig to
include the out_features and out_indices attributes.
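As a rough illustration of this kind of override, the sketch below uses plain Python classes rather than the library's own. BaseConfig, OutputSettingsMixin and ToyBackboneConfig are hypothetical names, and storing the settings in private attributes (_out_features, _out_indices) is an assumption made for the example, not a statement about how PreTrainedConfig or BackboneConfigMixin is implemented.

```python
import copy


class BaseConfig:
    """Hypothetical stand-in for a configuration base class."""

    def to_dict(self) -> dict:
        # Default behaviour: dump every instance attribute.
        return copy.deepcopy(self.__dict__)


class OutputSettingsMixin:
    """Hypothetical mixin that exposes privately stored output settings when serializing."""

    def to_dict(self) -> dict:
        output = super().to_dict()
        # Assumption: settings live in private attributes; publish them under public keys.
        output["out_features"] = output.pop("_out_features")
        output["out_indices"] = output.pop("_out_indices")
        return output


class ToyBackboneConfig(OutputSettingsMixin, BaseConfig):
    def __init__(self, num_channels=3, out_features=None, out_indices=None):
        self.num_channels = num_channels
        self._out_features = out_features
        self._out_indices = out_indices


serialized = ToyBackboneConfig(out_features=["stage2"], out_indices=[2]).to_dict()
```

The mixin is listed before the base class so that its to_dict() runs first and delegates to the base implementation through super().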
Verifies that out_indices and out_features are valid for the given stage_names.
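The exact checks performed are not listed here. The following is a minimal sketch of what such a validation could look like, under the assumption that names must exist in stage_names, indices must be in range, and the two lists must refer to the same stages in the same order. check_outputs and is_valid are hypothetical helpers, not library functions.

```python
from typing import Optional


def check_outputs(
    out_features: Optional[list[str]],
    out_indices: Optional[list[int]],
    stage_names: list[str],
) -> None:
    """Hypothetical validator: raise ValueError if the settings disagree with stage_names."""
    num_stages = len(stage_names)
    if out_features is not None:
        unknown = [name for name in out_features if name not in stage_names]
        if unknown:
            raise ValueError(f"Unknown stage names: {unknown}")
        if len(set(out_features)) != len(out_features):
            raise ValueError("out_features contains duplicates")
    if out_indices is not None:
        # Negative indices are allowed and count from the end, as in Python lists.
        bad = [i for i in out_indices if not -num_stages <= i < num_stages]
        if bad:
            raise ValueError(f"Indices out of range: {bad}")
    if out_features is not None and out_indices is not None:
        # Both given: each index must point at the feature listed in the same position.
        expected = [stage_names[i] for i in out_indices]
        if expected != list(out_features):
            raise ValueError("out_features and out_indices refer to different stages")


def is_valid(out_features, out_indices, stage_names) -> bool:
    """Convenience wrapper that reports validity as a boolean."""
    try:
        check_outputs(out_features, out_indices, stage_names)
        return True
    except ValueError:
        return False


# Illustrative stage names, not taken from any specific model.
stage_names = ["stem", "stage1", "stage2", "stage3", "stage4"]
```

For instance, is_valid(["stage1"], [1], stage_names) passes, while is_valid(["stage1"], [2], stage_names) fails because index 2 points at "stage2".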
TimmBackbone
Wrapper class for timm models to be used as backbones. This allows timm models to be used interchangeably with the other models in the library while keeping the same API.
TimmBackboneConfig
class transformers.TimmBackboneConfig
( backbone = None, num_channels = 3, features_only = True, out_indices = None, freeze_batch_norm_2d = False, output_stride = None, **kwargs )
Parameters
- backbone (str, optional) — The timm checkpoint to load.
- num_channels (int, optional, defaults to 3) — The number of input channels.
- features_only (bool, optional, defaults to True) — Whether to output only the features or also the logits.
- out_indices (list[int], optional) — If used as backbone, list of indices of features to output. Can be any of 0, 1, 2, etc. (depending on how many stages the model has). Will default to the last stage if unset.
- freeze_batch_norm_2d (bool, optional, defaults to False) — Converts all BatchNorm2d and SyncBatchNorm layers of provided module into FrozenBatchNorm2d.
This is the configuration class to store the configuration of a TimmBackbone.
It is used to instantiate a timm backbone model according to the specified arguments, defining the model.
Configuration objects inherit from PreTrainedConfig and can be used to control the model outputs. Read the documentation from PreTrainedConfig for more information.
Example:
>>> from transformers import TimmBackboneConfig, TimmBackbone
>>> # Initializing a timm backbone
>>> configuration = TimmBackboneConfig("resnet50")
>>> # Initializing a model from the configuration
>>> model = TimmBackbone(configuration)
>>> # Accessing the model configuration
>>> configuration = model.config