imgutils.preprocess.transformers

Overview:

Convert Hugging Face transformers image processors into equivalent PillowCompose transform pipelines.

Supported Processors:

Name                           | Supported Repos | Function
-------------------------------+-----------------+---------------------------------------------
ViTImageProcessor              | 5906 (33.24%)   | create_transforms_from_vit_processor()
DonutImageProcessor            | 1901 (10.70%)   | N/A
DetrImageProcessor             | 1575 (8.86%)    | N/A
CLIPImageProcessor             | 1374 (7.73%)    | create_transforms_from_clip_processor()
VideoMAEImageProcessor         | 1093 (6.15%)    | N/A
ConvNextImageProcessor         | 648 (3.65%)     | create_transforms_from_convnext_processor()
SegformerImageProcessor        | 533 (3.00%)     | N/A
BeitImageProcessor             | 468 (2.63%)     | N/A
SiglipImageProcessor           | 440 (2.48%)     | create_transforms_from_siglip_processor()
LayoutLMv3ImageProcessor       | 403 (2.27%)     | N/A
LayoutLMv2ImageProcessor       | 332 (1.87%)     | N/A
MllamaImageProcessor           | 332 (1.87%)     | N/A
Qwen2VLImageProcessor          | 314 (1.77%)     | N/A
BlipImageProcessor             | 276 (1.55%)     | N/A
Idefics2ImageProcessor         | 226 (1.27%)     | N/A
LlavaNextImageProcessor        | 215 (1.21%)     | N/A
BitImageProcessor              | 210 (1.18%)     | N/A
Pix2StructImageProcessor       | 113 (0.64%)     | N/A
ConditionalDetrImageProcessor  | 95 (0.53%)      | N/A
SamImageProcessor              | 92 (0.52%)      | N/A
DeiTImageProcessor             | 91 (0.51%)      | N/A
Mask2FormerImageProcessor      | 89 (0.50%)      | N/A
VivitImageProcessor            | 88 (0.50%)      | N/A
YolosImageProcessor            | 84 (0.47%)      | N/A
ViltImageProcessor             | 73 (0.41%)      | N/A
DetaImageProcessor             | 68 (0.38%)      | N/A
PixtralImageProcessor          | 68 (0.38%)      | N/A
MobileNetV2ImageProcessor      | 63 (0.35%)      | N/A
MobileViTImageProcessor        | 61 (0.34%)      | N/A
DPTImageProcessor              | 51 (0.29%)      | N/A
MaskFormerImageProcessor       | 49 (0.28%)      | N/A
NougatImageProcessor           | 48 (0.27%)      | N/A
IdeficsImageProcessor          | 47 (0.26%)      | N/A
RTDetrImageProcessor           | 45 (0.25%)      | N/A
EfficientNetImageProcessor     | 40 (0.23%)      | N/A
DeformableDetrImageProcessor   | 36 (0.20%)      | N/A
Idefics3ImageProcessor         | 32 (0.18%)      | N/A
FuyuImageProcessor             | 22 (0.12%)      | N/A
VideoLlavaImageProcessor       | 17 (0.10%)      | N/A
PvtImageProcessor              | 16 (0.09%)      | N/A
OneFormerImageProcessor        | 14 (0.08%)      | N/A
MobileNetV1ImageProcessor      | 12 (0.07%)      | N/A
Owlv2ImageProcessor            | 12 (0.07%)      | N/A
ChineseCLIPImageProcessor      | 9 (0.05%)       | N/A
EfficientFormerImageProcessor  | 8 (0.05%)       | N/A
LlavaOnevisionImageProcessor   | 8 (0.05%)       | N/A
Swin2SRImageProcessor          | 8 (0.05%)       | N/A
ViTHybridImageProcessor        | 8 (0.05%)       | N/A
OwlViTImageProcessor           | 7 (0.04%)       | N/A
GroundingDinoImageProcessor    | 6 (0.03%)       | N/A
PerceiverImageProcessor        | 6 (0.03%)       | N/A
ChameleonImageProcessor        | 5 (0.03%)       | N/A
LevitImageProcessor            | 5 (0.03%)       | N/A
VitMatteImageProcessor         | 5 (0.03%)       | N/A

register_creators_for_transformers

imgutils.preprocess.transformers.register_creators_for_transformers()[source]

Decorator for registering transform creator functions.

This decorator adds the decorated function to the list of available transform creators that will be tried when creating transforms from a transformers processor.

Returns:

Decorator function

Return type:

callable

Example:
>>> @register_creators_for_transformers()
... def my_transform_creator(processor):
...     # Create and return transforms
...     pass

NotProcessorTypeError

class imgutils.preprocess.transformers.NotProcessorTypeError[source]

Exception raised when a processor type is not recognized or supported.

This error occurs when attempting to create transforms from an unsupported or unknown transformers processor type.

create_transforms_from_transformers

imgutils.preprocess.transformers.create_transforms_from_transformers(processor)[source]

Create image transforms from a transformers processor.

This function attempts to create appropriate image transforms by trying each registered creator function until one succeeds.

Parameters:

processor (object) – A transformers processor object

Returns:

Image transforms appropriate for the given processor

Return type:

object

Raises:

NotProcessorTypeError – If no suitable creator is found for the processor

Example:
>>> from transformers import AutoImageProcessor
>>> from imgutils.preprocess.transformers import create_transforms_from_transformers
>>>
>>> processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
>>> transforms = create_transforms_from_transformers(processor)
>>> transforms
PillowCompose(
    PillowConvertRGB(force_background='white')
    PillowResize(size=224, interpolation=bicubic, max_size=None, antialias=True)
    PillowCenterCrop(size=(224, 224))
    PillowToTensor()
    PillowNormalize(mean=[0.48145467 0.4578275  0.40821072], std=[0.26862955 0.2613026  0.2757771 ])
)

create_clip_transforms

imgutils.preprocess.transformers.create_clip_transforms(do_resize: bool = True, size=<object object>, resample=3, do_center_crop=True, crop_size=<object object>, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>, do_convert_rgb: bool = True)[source]

Creates a composition of image transforms typically used for CLIP models.

Parameters:
  • do_resize (bool) – Whether to resize the image.

  • size (dict) – Target size for resizing. Can be {"shortest_edge": int} or {"height": int, "width": int}.

  • resample (int) – PIL resampling filter to use for resizing.

  • do_center_crop (bool) – Whether to center crop the image.

  • crop_size (dict) – Size for center cropping in {"height": int, "width": int} format.

  • do_rescale (bool) – Whether to rescale pixel values.

  • rescale_factor (float) – Factor to use for rescaling pixels.

  • do_normalize (bool) – Whether to normalize the image.

  • image_mean (list or tuple) – Mean values for normalization.

  • image_std (list or tuple) – Standard deviation values for normalization.

  • do_convert_rgb (bool) – Whether to convert image to RGB.

Returns:

A composed transformation pipeline.

Return type:

PillowCompose

create_transforms_from_clip_processor

imgutils.preprocess.transformers.create_transforms_from_clip_processor(processor)[source]

Creates image transforms from a CLIP processor configuration.

Parameters:

processor (Union[CLIPProcessor, CLIPImageProcessor]) – A CLIP processor or image processor instance from transformers library.

Returns:

A composed transformation pipeline matching the processor’s configuration.

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the provided processor is not a CLIP processor.

create_convnext_transforms

imgutils.preprocess.transformers.create_convnext_transforms(do_resize: bool = True, size=<object object>, crop_pct: float = <object object>, resample=2, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>)[source]

Create a composition of image transforms specifically tailored for ConvNext models.

This function creates a transformation pipeline that can include resizing, rescaling, and normalization operations. The transforms are applied in the following order:

  1. Resize (optional)

  2. Convert to tensor

  3. Rescale (optional)

  4. Normalize (optional)

Parameters:
  • do_resize (bool) – Whether to resize the image

  • size (dict) – Target size dictionary with a 'shortest_edge' key

  • crop_pct (float) – Center crop percentage, used to compute resize size

  • resample (int) – PIL resampling filter to use for resizing

  • do_rescale (bool) – Whether to rescale pixel values

  • rescale_factor (float) – Factor to use for rescaling pixels

  • do_normalize (bool) – Whether to normalize the image

  • image_mean (tuple or list) – Mean values for normalization

  • image_std (tuple or list) – Standard deviation values for normalization

Returns:

A composed transformation pipeline

Return type:

PillowCompose

create_transforms_from_convnext_processor

imgutils.preprocess.transformers.create_transforms_from_convnext_processor(processor)[source]

Create image transforms from a ConvNext processor configuration.

This function takes a Hugging Face ConvNextImageProcessor and creates a corresponding transformation pipeline that matches its configuration settings.

Parameters:

processor (ConvNextImageProcessor) – The ConvNext image processor to create transforms from

Returns:

A composed transformation pipeline matching the processor’s configuration

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the provided processor is not a ConvNextImageProcessor

create_vit_transforms

imgutils.preprocess.transformers.create_vit_transforms(do_resize: bool = True, size=<object object>, resample: int = 2, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>)[source]

Create a composition of image transforms typically used for ViT models.

This function creates a transform pipeline that can include resizing, tensor conversion, rescaling, and normalization operations. The transforms are applied in sequence to prepare images for ViT model input.

Parameters:
  • do_resize (bool) – Whether to resize the input images

  • size (dict) – Target size for resizing, should be a dict with 'height' and 'width' keys

  • resample (int) – PIL resampling filter to use for resizing

  • do_rescale (bool) – Whether to rescale pixel values

  • rescale_factor (float) – Factor to use for rescaling pixel values

  • do_normalize (bool) – Whether to normalize the image

  • image_mean (tuple or list) – Mean values for normalization

  • image_std (tuple or list) – Standard deviation values for normalization

Returns:

A composition of image transforms

Return type:

PillowCompose

create_transforms_from_vit_processor

imgutils.preprocess.transformers.create_transforms_from_vit_processor(processor)[source]

Create image transforms from a Hugging Face ViT processor configuration.

This function takes a ViT image processor from the transformers library and creates a matching transform pipeline that replicates the processor’s preprocessing steps.

Parameters:

processor (ViTImageProcessor) – A ViT image processor from Hugging Face transformers

Returns:

A composition of image transforms matching the processor’s configuration

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the provided processor is not a ViTImageProcessor

create_siglip_transforms

imgutils.preprocess.transformers.create_siglip_transforms(do_resize: bool = True, size=<object object>, resample: int = 3, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>, do_convert_rgb: bool = True)[source]

Creates a composition of image transformations for SigLIP model input processing.

This function builds a pipeline of image transformations that can include:

  • RGB conversion

  • Image resizing

  • Tensor conversion

  • Image rescaling

  • Normalization

Parameters:
  • do_resize (bool) – Whether to resize the image

  • size (dict) – Target size dictionary with 'height' and 'width' keys

  • resample (int) – PIL image resampling filter to use for resizing

  • do_rescale (bool) – Whether to rescale pixel values

  • rescale_factor (float) – Factor to use for pixel value rescaling

  • do_normalize (bool) – Whether to normalize the image

  • image_mean (tuple or list) – Mean values for normalization

  • image_std (tuple or list) – Standard deviation values for normalization

  • do_convert_rgb (bool) – Whether to convert image to RGB

Returns:

A composed transformation pipeline

Return type:

PillowCompose

create_transforms_from_siglip_processor

imgutils.preprocess.transformers.create_transforms_from_siglip_processor(processor)[source]

Creates image transformations from a SigLIP processor configuration.

This function extracts transformation parameters from a Hugging Face SigLIP image processor and creates a corresponding transformation pipeline.

Parameters:

processor (SiglipImageProcessor) – A HuggingFace SigLIP image processor instance

Returns:

A composed transformation pipeline

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the processor is not a SiglipImageProcessor