imgutils.preprocess.transformers

Overview:

Convert Hugging Face transformers image processors into equivalent PillowCompose transform pipelines.

Supported Processors:

Name                           | Supported Repos | Function
-------------------------------+-----------------+---------------------------------------------
ViTImageProcessor              | 5906 (33.24%)   | create_transforms_from_vit_processor()
DonutImageProcessor            | 1901 (10.70%)   | N/A
DetrImageProcessor             | 1575 (8.86%)    | N/A
CLIPImageProcessor             | 1374 (7.73%)    | create_transforms_from_clip_processor()
VideoMAEImageProcessor         | 1093 (6.15%)    | N/A
ConvNextImageProcessor         | 648 (3.65%)     | create_transforms_from_convnext_processor()
SegformerImageProcessor        | 533 (3.00%)     | N/A
BeitImageProcessor             | 468 (2.63%)     | N/A
SiglipImageProcessor           | 440 (2.48%)     | create_transforms_from_siglip_processor()
LayoutLMv3ImageProcessor       | 403 (2.27%)     | N/A
LayoutLMv2ImageProcessor       | 332 (1.87%)     | N/A
MllamaImageProcessor           | 332 (1.87%)     | N/A
Qwen2VLImageProcessor          | 314 (1.77%)     | N/A
BlipImageProcessor             | 276 (1.55%)     | N/A
Idefics2ImageProcessor         | 226 (1.27%)     | N/A
LlavaNextImageProcessor        | 215 (1.21%)     | N/A
BitImageProcessor              | 210 (1.18%)     | N/A
Pix2StructImageProcessor       | 113 (0.64%)     | N/A
ConditionalDetrImageProcessor  | 95 (0.53%)      | N/A
SamImageProcessor              | 92 (0.52%)      | N/A
DeiTImageProcessor             | 91 (0.51%)      | N/A
Mask2FormerImageProcessor      | 89 (0.50%)      | N/A
VivitImageProcessor            | 88 (0.50%)      | N/A
YolosImageProcessor            | 84 (0.47%)      | N/A
ViltImageProcessor             | 73 (0.41%)      | N/A
DetaImageProcessor             | 68 (0.38%)      | N/A
PixtralImageProcessor          | 68 (0.38%)      | N/A
MobileNetV2ImageProcessor      | 63 (0.35%)      | N/A
MobileViTImageProcessor        | 61 (0.34%)      | N/A
DPTImageProcessor              | 51 (0.29%)      | N/A
MaskFormerImageProcessor       | 49 (0.28%)      | N/A
NougatImageProcessor           | 48 (0.27%)      | N/A
IdeficsImageProcessor          | 47 (0.26%)      | N/A
RTDetrImageProcessor           | 45 (0.25%)      | N/A
EfficientNetImageProcessor     | 40 (0.23%)      | N/A
DeformableDetrImageProcessor   | 36 (0.20%)      | N/A
Idefics3ImageProcessor         | 32 (0.18%)      | N/A
FuyuImageProcessor             | 22 (0.12%)      | N/A
VideoLlavaImageProcessor       | 17 (0.10%)      | N/A
PvtImageProcessor              | 16 (0.09%)      | N/A
OneFormerImageProcessor        | 14 (0.08%)      | N/A
MobileNetV1ImageProcessor      | 12 (0.07%)      | N/A
Owlv2ImageProcessor            | 12 (0.07%)      | N/A
ChineseCLIPImageProcessor      | 9 (0.05%)       | N/A
EfficientFormerImageProcessor  | 8 (0.05%)       | N/A
LlavaOnevisionImageProcessor   | 8 (0.05%)       | N/A
Swin2SRImageProcessor          | 8 (0.05%)       | N/A
ViTHybridImageProcessor        | 8 (0.05%)       | N/A
OwlViTImageProcessor           | 7 (0.04%)       | N/A
GroundingDinoImageProcessor    | 6 (0.03%)       | N/A
PerceiverImageProcessor        | 6 (0.03%)       | N/A
ChameleonImageProcessor        | 5 (0.03%)       | N/A
LevitImageProcessor            | 5 (0.03%)       | N/A
VitMatteImageProcessor         | 5 (0.03%)       | N/A

register_creators_for_transformers

imgutils.preprocess.transformers.register_creators_for_transformers()[source]

Decorator for registering transform creator functions.

This decorator adds the decorated function to the list of available transform creators that will be tried when creating transforms from a transformers processor.

Returns:

Decorator function

Return type:

callable

Example:
>>> @register_creators_for_transformers()
... def my_transform_creator(processor):
...     # Create and return transforms
...     pass

NotProcessorTypeError

class imgutils.preprocess.transformers.NotProcessorTypeError[source]

Exception raised when a processor type is not recognized or supported.

This error occurs when attempting to create transforms from an unsupported or unknown transformers processor type.

create_transforms_from_transformers

imgutils.preprocess.transformers.create_transforms_from_transformers(processor)[source]

Create image transforms from a transformers processor.

This function attempts to create appropriate image transforms by trying each registered creator function until one succeeds.

Parameters:

processor (object) – A transformers processor object

Returns:

Image transforms appropriate for the given processor

Return type:

object

Raises:

NotProcessorTypeError – If no suitable creator is found for the processor

Example:
>>> from transformers import AutoImageProcessor
>>> from imgutils.preprocess.transformers import create_transforms_from_transformers
>>>
>>> processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
>>> transforms = create_transforms_from_transformers(processor)
>>> transforms
PillowCompose(
    PillowConvertRGB(force_background='white')
    PillowResize(size=224, interpolation=bicubic, max_size=None, antialias=True)
    PillowCenterCrop(size=(224, 224))
    PillowToTensor()
    PillowNormalize(mean=[0.48145467 0.4578275  0.40821072], std=[0.26862955 0.2613026  0.2757771 ])
)

create_clip_transforms

imgutils.preprocess.transformers.create_clip_transforms(do_resize: bool = True, size=<object object>, resample=3, do_center_crop=True, crop_size=<object object>, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>, do_convert_rgb: bool = True)[source]

Creates a composition of image transforms typically used for CLIP models.

Parameters:
  • do_resize (bool) – Whether to resize the image.

  • size (dict) – Target size for resizing. Can be {"shortest_edge": int} or {"height": int, "width": int}.

  • resample (int) – PIL resampling filter to use for resizing.

  • do_center_crop (bool) – Whether to center crop the image.

  • crop_size (dict) – Size for center cropping in {"height": int, "width": int} format.

  • do_rescale (bool) – Whether to rescale pixel values.

  • rescale_factor (float) – Factor to use for rescaling pixels.

  • do_normalize (bool) – Whether to normalize the image.

  • image_mean (list or tuple) – Mean values for normalization.

  • image_std (list or tuple) – Standard deviation values for normalization.

  • do_convert_rgb (bool) – Whether to convert image to RGB.

Returns:

A composed transformation pipeline.

Return type:

PillowCompose

create_transforms_from_clip_processor

imgutils.preprocess.transformers.create_transforms_from_clip_processor(processor)[source]

Creates image transforms from a CLIP processor configuration.

Parameters:

processor (Union[CLIPProcessor, CLIPImageProcessor]) – A CLIP processor or image processor instance from transformers library.

Returns:

A composed transformation pipeline matching the processor’s configuration.

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the provided processor is not a CLIP processor.

create_convnext_transforms

imgutils.preprocess.transformers.create_convnext_transforms(do_resize: bool = True, size=<object object>, crop_pct: float = <object object>, resample=2, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>)[source]

Create a composition of image transforms specifically tailored for ConvNext models.

This function creates a transformation pipeline that can include resizing, rescaling, and normalization operations. The transforms are applied in the following order:

  1. Resize (optional)

  2. Convert to tensor

  3. Rescale (optional)

  4. Normalize (optional)

Parameters:
  • do_resize (bool) – Whether to resize the image

  • size (dict) – Target size dictionary with a 'shortest_edge' key

  • crop_pct (float) – Center crop percentage, used to compute resize size

  • resample (int) – PIL resampling filter to use for resizing

  • do_rescale (bool) – Whether to rescale pixel values

  • rescale_factor (float) – Factor to use for rescaling pixels

  • do_normalize (bool) – Whether to normalize the image

  • image_mean (tuple or list) – Mean values for normalization

  • image_std (tuple or list) – Standard deviation values for normalization

Returns:

A composed transformation pipeline

Return type:

PillowCompose

create_transforms_from_convnext_processor

imgutils.preprocess.transformers.create_transforms_from_convnext_processor(processor)[source]

Create image transforms from a ConvNext processor configuration.

This function takes a Hugging Face ConvNextImageProcessor and creates a corresponding transformation pipeline that matches its configuration settings.

Parameters:

processor (ConvNextImageProcessor) – The ConvNext image processor to create transforms from

Returns:

A composed transformation pipeline matching the processor’s configuration

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the provided processor is not a ConvNextImageProcessor

create_vit_transforms

imgutils.preprocess.transformers.create_vit_transforms(do_resize: bool = True, size=<object object>, resample: int = 2, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>)[source]

Create a composition of image transforms typically used for ViT models.

This function creates a transform pipeline that can include resizing, tensor conversion, rescaling, and normalization operations. The transforms are applied in sequence to prepare images for ViT model input.

Parameters:
  • do_resize (bool) – Whether to resize the input images

  • size (dict) – Target size for resizing, should be a dict with 'height' and 'width' keys

  • resample (int) – PIL resampling filter to use for resizing

  • do_rescale (bool) – Whether to rescale pixel values

  • rescale_factor (float) – Factor to use for rescaling pixel values

  • do_normalize (bool) – Whether to normalize the image

  • image_mean (tuple or list) – Mean values for normalization

  • image_std (tuple or list) – Standard deviation values for normalization

Returns:

A composition of image transforms

Return type:

PillowCompose

create_transforms_from_vit_processor

imgutils.preprocess.transformers.create_transforms_from_vit_processor(processor)[source]

Create image transforms from a Hugging Face ViT processor configuration.

This function takes a ViT image processor from the transformers library and creates a matching transform pipeline that replicates the processor’s preprocessing steps.

Parameters:

processor (ViTImageProcessor) – A ViT image processor from Hugging Face transformers

Returns:

A composition of image transforms matching the processor’s configuration

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the provided processor is not a ViTImageProcessor

create_siglip_transforms

imgutils.preprocess.transformers.create_siglip_transforms(do_resize: bool = True, size=<object object>, resample: int = 3, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>, do_convert_rgb: bool = True)[source]

Creates a composition of image transformations for SigLIP model input processing.

This function builds a pipeline of image transformations that can include:

  • RGB conversion

  • Image resizing

  • Tensor conversion

  • Image rescaling

  • Normalization

Parameters:
  • do_resize (bool) – Whether to resize the image

  • size (dict) – Target size dictionary with 'height' and 'width' keys

  • resample (int) – PIL image resampling filter to use for resizing

  • do_rescale (bool) – Whether to rescale pixel values

  • rescale_factor (float) – Factor to use for pixel value rescaling

  • do_normalize (bool) – Whether to normalize the image

  • image_mean (tuple or list) – Mean values for normalization

  • image_std (tuple or list) – Standard deviation values for normalization

  • do_convert_rgb (bool) – Whether to convert image to RGB

Returns:

A composed transformation pipeline

Return type:

PillowCompose

create_transforms_from_siglip_processor

imgutils.preprocess.transformers.create_transforms_from_siglip_processor(processor)[source]

Creates image transformations from a SigLIP processor configuration.

This function extracts transformation parameters from a Hugging Face SigLIP image processor and creates a corresponding transformation pipeline.

Parameters:

processor (SiglipImageProcessor) – A HuggingFace SigLIP image processor instance

Returns:

A composed transformation pipeline

Return type:

PillowCompose

Raises:

NotProcessorTypeError – If the processor is not a SiglipImageProcessor