imgutils.preprocess.transformers
- Overview:
Convert transformers image processors to PillowCompose objects.
Supported Processors:
Name                           Supported  Repos           Function
-----------------------------  ---------  --------------  -----------------------------------------
ViTImageProcessor              ✅         5906 (33.24%)   create_transforms_from_vit_processor
DonutImageProcessor            ❌         1901 (10.70%)   N/A
DetrImageProcessor             ❌         1575 (8.86%)    N/A
CLIPImageProcessor             ✅         1374 (7.73%)    create_transforms_from_clip_processor
VideoMAEImageProcessor         ❌         1093 (6.15%)    N/A
ConvNextImageProcessor         ✅         648 (3.65%)     create_transforms_from_convnext_processor
SegformerImageProcessor        ❌         533 (3.00%)     N/A
BeitImageProcessor             ❌         468 (2.63%)     N/A
SiglipImageProcessor           ✅         440 (2.48%)     create_transforms_from_siglip_processor
LayoutLMv3ImageProcessor       ❌         403 (2.27%)     N/A
LayoutLMv2ImageProcessor       ❌         332 (1.87%)     N/A
MllamaImageProcessor           ❌         332 (1.87%)     N/A
Qwen2VLImageProcessor          ❌         314 (1.77%)     N/A
BlipImageProcessor             ❌         276 (1.55%)     N/A
Idefics2ImageProcessor         ❌         226 (1.27%)     N/A
LlavaNextImageProcessor        ❌         215 (1.21%)     N/A
BitImageProcessor              ❌         210 (1.18%)     N/A
Pix2StructImageProcessor       ❌         113 (0.64%)     N/A
ConditionalDetrImageProcessor  ❌         95 (0.53%)      N/A
SamImageProcessor              ❌         92 (0.52%)      N/A
DeiTImageProcessor             ❌         91 (0.51%)      N/A
Mask2FormerImageProcessor      ❌         89 (0.50%)      N/A
VivitImageProcessor            ❌         88 (0.50%)      N/A
YolosImageProcessor            ❌         84 (0.47%)      N/A
ViltImageProcessor             ❌         73 (0.41%)      N/A
DetaImageProcessor             ❌         68 (0.38%)      N/A
PixtralImageProcessor          ❌         68 (0.38%)      N/A
MobileNetV2ImageProcessor      ❌         63 (0.35%)      N/A
MobileViTImageProcessor        ❌         61 (0.34%)      N/A
DPTImageProcessor              ❌         51 (0.29%)      N/A
MaskFormerImageProcessor       ❌         49 (0.28%)      N/A
NougatImageProcessor           ❌         48 (0.27%)      N/A
IdeficsImageProcessor          ❌         47 (0.26%)      N/A
RTDetrImageProcessor           ❌         45 (0.25%)      N/A
EfficientNetImageProcessor     ❌         40 (0.23%)      N/A
DeformableDetrImageProcessor   ❌         36 (0.20%)      N/A
Idefics3ImageProcessor         ❌         32 (0.18%)      N/A
FuyuImageProcessor             ❌         22 (0.12%)      N/A
VideoLlavaImageProcessor       ❌         17 (0.10%)      N/A
PvtImageProcessor              ❌         16 (0.09%)      N/A
OneFormerImageProcessor        ❌         14 (0.08%)      N/A
MobileNetV1ImageProcessor      ❌         12 (0.07%)      N/A
Owlv2ImageProcessor            ❌         12 (0.07%)      N/A
ChineseCLIPImageProcessor      ❌         9 (0.05%)       N/A
EfficientFormerImageProcessor  ❌         8 (0.05%)       N/A
LlavaOnevisionImageProcessor   ❌         8 (0.05%)       N/A
Swin2SRImageProcessor          ❌         8 (0.05%)       N/A
ViTHybridImageProcessor        ❌         8 (0.05%)       N/A
OwlViTImageProcessor           ❌         7 (0.04%)       N/A
GroundingDinoImageProcessor    ❌         6 (0.03%)       N/A
PerceiverImageProcessor        ❌         6 (0.03%)       N/A
ChameleonImageProcessor        ❌         5 (0.03%)       N/A
LevitImageProcessor            ❌         5 (0.03%)       N/A
VitMatteImageProcessor         ❌         5 (0.03%)       N/A
register_creators_for_transformers
- imgutils.preprocess.transformers.register_creators_for_transformers()
Decorator for registering transform creator functions.
This decorator adds the decorated function to the list of available transform creators that will be tried when creating transforms from a transformers processor.
- Returns:
Decorator function
- Return type:
callable
- Example:
>>> @register_creators_for_transformers()
... def my_transform_creator(processor):
...     # Create and return transforms
...     pass
NotProcessorTypeError
create_transforms_from_transformers
- imgutils.preprocess.transformers.create_transforms_from_transformers(processor)
Create image transforms from a transformers processor.
This function attempts to create appropriate image transforms by trying each registered creator function until one succeeds.
- Parameters:
processor (object) – A transformers processor object
- Returns:
Image transforms appropriate for the given processor
- Return type:
object
- Raises:
NotProcessorTypeError – If no suitable creator is found for the processor
- Example:
>>> from transformers import AutoImageProcessor
>>> from imgutils.preprocess.transformers import create_transforms_from_transformers
>>>
>>> processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
>>> transforms = create_transforms_from_transformers(processor)
>>> transforms
PillowCompose(
    PillowConvertRGB(force_background='white')
    PillowResize(size=224, interpolation=bicubic, max_size=None, antialias=True)
    PillowCenterCrop(size=(224, 224))
    PillowToTensor()
    PillowNormalize(mean=[0.48145467 0.4578275 0.40821072], std=[0.26862955 0.2613026 0.2757771 ])
)
create_clip_transforms
- imgutils.preprocess.transformers.create_clip_transforms(do_resize: bool = True, size=<object object>, resample=3, do_center_crop=True, crop_size=<object object>, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>, do_convert_rgb: bool = True)
Creates a composition of image transforms typically used for CLIP models.
- Parameters:
do_resize (bool) – Whether to resize the image.
size (dict) – Target size for resizing. Can be {"shortest_edge": int} or {"height": int, "width": int}.
resample (int) – PIL resampling filter to use for resizing. The default, 3, is PIL.Image.BICUBIC.
do_center_crop (bool) – Whether to center crop the image.
crop_size (dict) – Size for center cropping, in {"height": int, "width": int} format.
do_rescale (bool) – Whether to rescale pixel values.
rescale_factor (float) – Factor by which pixel values are multiplied. The default, 0.00392156862745098, is 1/255.
do_normalize (bool) – Whether to normalize the image.
image_mean (list or tuple) – Mean values for normalization.
image_std (list or tuple) – Standard deviation values for normalization.
do_convert_rgb (bool) – Whether to convert image to RGB.
- Returns:
A composed transformation pipeline.
- Return type:
PillowCompose
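The geometry behind a shortest-edge resize followed by a center crop can be worked through with plain arithmetic. This is a minimal sketch under stated assumptions: the helper names `resize_shortest_edge` and `center_crop_box` are hypothetical, and the truncating `int()` rounding is an assumption that may differ from what the library does.

```python
def resize_shortest_edge(width, height, shortest_edge):
    """Scale (width, height) so the shorter side equals shortest_edge, keeping aspect ratio."""
    if width <= height:
        return shortest_edge, int(shortest_edge * height / width)
    return int(shortest_edge * width / height), shortest_edge


def center_crop_box(width, height, crop_width, crop_height):
    """Return the (left, top, right, bottom) box of a crop centered in the image."""
    left = (width - crop_width) // 2
    top = (height - crop_height) // 2
    return left, top, left + crop_width, top + crop_height


# A 640x480 image with size={"shortest_edge": 224} and a 224x224 crop.
resized = resize_shortest_edge(640, 480, 224)   # (298, 224)
box = center_crop_box(*resized, 224, 224)       # (37, 0, 261, 224)
```

The resize preserves the aspect ratio, so the crop is what brings the image to a fixed square size.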
create_transforms_from_clip_processor
- imgutils.preprocess.transformers.create_transforms_from_clip_processor(processor)
Creates image transforms from a CLIP processor configuration.
- Parameters:
processor (Union[CLIPProcessor, CLIPImageProcessor]) – A CLIP processor or image processor instance from transformers library.
- Returns:
A composed transformation pipeline matching the processor’s configuration.
- Return type:
PillowCompose
- Raises:
NotProcessorTypeError – If the provided processor is not a CLIP processor.
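Because the function accepts either a combined processor or a bare image processor, one plausible approach is to unwrap the former before reading its settings. The sketch below illustrates that idea only. The classes `FakeCLIPProcessor` and `FakeCLIPImageProcessor`, the helper `extract_clip_config`, and the use of an `image_processor` attribute for unwrapping are all assumptions made for illustration, not the library's confirmed behaviour.

```python
class NotProcessorTypeError(TypeError):
    """Stand-in for the exception named in the reference above."""


class FakeCLIPImageProcessor:
    """Hypothetical stand-in carrying a few configuration attributes."""

    def __init__(self, size, crop_size, do_center_crop=True):
        self.size = size
        self.crop_size = crop_size
        self.do_center_crop = do_center_crop


class FakeCLIPProcessor:
    """Hypothetical stand-in for a combined processor wrapping an image processor."""

    def __init__(self, image_processor):
        self.image_processor = image_processor


def extract_clip_config(processor):
    """Unwrap a combined processor if needed, then collect the image settings."""
    if isinstance(processor, FakeCLIPProcessor):
        processor = processor.image_processor
    if not isinstance(processor, FakeCLIPImageProcessor):
        raise NotProcessorTypeError(f"unsupported processor: {type(processor).__name__}")
    # The collected values could then be passed on as keyword arguments.
    return {
        "size": processor.size,
        "crop_size": processor.crop_size,
        "do_center_crop": processor.do_center_crop,
    }
```

Both input forms yield the same settings dictionary, and anything else is rejected with the stand-in exception.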
create_convnext_transforms
- imgutils.preprocess.transformers.create_convnext_transforms(do_resize: bool = True, size=<object object>, crop_pct: float = <object object>, resample=2, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>)
Create a composition of image transforms specifically tailored for ConvNext models.
This function creates a transformation pipeline that can include resizing, rescaling, and normalization operations. The transforms are applied in the following order:
1. Resize (optional)
2. Convert to tensor
3. Rescale (optional)
4. Normalize (optional)
- Parameters:
do_resize (bool) – Whether to resize the image
size (dict) – Target size dictionary with a "shortest_edge" key
crop_pct (float) – Center crop percentage, used to compute resize size
resample (int) – PIL resampling filter to use for resizing. The default, 2, is PIL.Image.BILINEAR
do_rescale (bool) – Whether to rescale pixel values
rescale_factor (float) – Factor by which pixel values are multiplied. The default, 0.00392156862745098, is 1/255
do_normalize (bool) – Whether to normalize the image
image_mean (tuple or list) – Mean values for normalization
image_std (tuple or list) – Standard deviation values for normalization
- Returns:
A composed transformation pipeline
- Return type:
PillowCompose
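To make the role of crop_pct concrete: one common convention, used by the ConvNext image processor in transformers for small target sizes, is to resize the shortest edge to shortest_edge / crop_pct and then center crop back to shortest_edge. The sketch below shows only that arithmetic. The helper name `convnext_resize_edge` is hypothetical, and whether create_convnext_transforms applies exactly this rule and this truncating rounding is an assumption.

```python
def convnext_resize_edge(shortest_edge, crop_pct):
    """Return the intermediate shortest edge to resize to before center cropping."""
    if not 0 < crop_pct <= 1:
        raise ValueError("crop_pct must be in (0, 1]")
    return int(shortest_edge / crop_pct)


# With a 224-pixel target and crop_pct = 0.875 (224 / 256), the image is
# first resized so that its shortest edge is 256 pixels.
resize_edge = convnext_resize_edge(224, 0.875)  # 256
```

A smaller crop_pct therefore means a larger intermediate image, of which a smaller central portion is kept.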
create_transforms_from_convnext_processor
- imgutils.preprocess.transformers.create_transforms_from_convnext_processor(processor)
Create image transforms from a ConvNext processor configuration.
This function takes a Hugging Face ConvNextImageProcessor and creates a corresponding transformation pipeline that matches its configuration settings.
- Parameters:
processor (ConvNextImageProcessor) – The ConvNext image processor to create transforms from
- Returns:
A composed transformation pipeline matching the processor’s configuration
- Return type:
PillowCompose
- Raises:
NotProcessorTypeError – If the provided processor is not a ConvNextImageProcessor
create_vit_transforms
- imgutils.preprocess.transformers.create_vit_transforms(do_resize: bool = True, size=<object object>, resample: int = 2, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>)
Create a composition of image transforms typically used for ViT models.
This function creates a transform pipeline that can include resizing, tensor conversion, rescaling, and normalization operations. The transforms are applied in sequence to prepare images for ViT model input.
- Parameters:
do_resize (bool) – Whether to resize the input images
size (dict) – Target size for resizing; should be a dict with "height" and "width" keys
resample (int) – PIL resampling filter to use for resizing. The default, 2, is PIL.Image.BILINEAR
do_rescale (bool) – Whether to rescale pixel values
rescale_factor (float) – Factor by which pixel values are multiplied. The default, 0.00392156862745098, is 1/255
do_normalize (bool) – Whether to normalize the image
image_mean (tuple or list) – Mean values for normalization
image_std (tuple or list) – Standard deviation values for normalization
- Returns:
A composition of image transforms
- Return type:
PillowCompose
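The rescale and normalize steps reduce to simple per-channel arithmetic, sketched here in plain Python. The helper name `rescale_and_normalize` is hypothetical, and the mean and standard deviation of 0.5 per channel are illustrative values assumed for this example rather than defaults confirmed by this page.

```python
def rescale_and_normalize(pixel, rescale_factor=1 / 255,
                          mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """Apply rescaling, then normalization, to one (R, G, B) pixel of 8-bit values."""
    return tuple(
        (channel * rescale_factor - m) / s
        for channel, m, s in zip(pixel, mean, std)
    )


# Rescaling maps 0..255 to roughly 0..1; normalizing with mean 0.5 and
# std 0.5 then maps that range to roughly -1..1.
black = rescale_and_normalize((0, 0, 0))
white = rescale_and_normalize((255, 255, 255))
```

Rescaling comes first, so the mean and standard deviation are expressed on the 0..1 scale rather than the 0..255 scale.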
create_transforms_from_vit_processor
- imgutils.preprocess.transformers.create_transforms_from_vit_processor(processor)
Create image transforms from a Hugging Face ViT processor configuration.
This function takes a ViT image processor from the transformers library and creates a matching transform pipeline that replicates the processor’s preprocessing steps.
- Parameters:
processor (ViTImageProcessor) – A ViT image processor from Hugging Face transformers
- Returns:
A composition of image transforms matching the processor’s configuration
- Return type:
PillowCompose
- Raises:
NotProcessorTypeError – If the provided processor is not a ViTImageProcessor
create_siglip_transforms
- imgutils.preprocess.transformers.create_siglip_transforms(do_resize: bool = True, size=<object object>, resample: int = 3, do_rescale: bool = True, rescale_factor: float = 0.00392156862745098, do_normalize: bool = True, image_mean=<object object>, image_std=<object object>, do_convert_rgb: bool = True)
Creates a composition of image transformations for SigLIP model input processing.
This function builds a pipeline of image transformations that can include:
RGB conversion
Image resizing
Tensor conversion
Image rescaling
Normalization
- Parameters:
do_resize (bool) – Whether to resize the image
size (dict) – Target size dictionary with "height" and "width" keys
resample (int) – PIL image resampling filter to use for resizing. The default, 3, is PIL.Image.BICUBIC
do_rescale (bool) – Whether to rescale pixel values
rescale_factor (float) – Factor by which pixel values are multiplied. The default, 0.00392156862745098, is 1/255
do_normalize (bool) – Whether to normalize the image
image_mean (tuple or list) – Mean values for normalization
image_std (tuple or list) – Standard deviation values for normalization
do_convert_rgb (bool) – Whether to convert image to RGB
- Returns:
A composed transformation pipeline
- Return type:
PillowCompose
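How the boolean flags shape the resulting pipeline can be sketched by assembling a list of step names. This is an illustration only: the helper `plan_siglip_steps` and the step labels are hypothetical, and treating tensor conversion as unconditional is an assumption based on the signature having no flag for it.

```python
def plan_siglip_steps(do_convert_rgb=True, do_resize=True,
                      do_rescale=True, do_normalize=True):
    """Return the ordered names of the steps that the given flags would enable."""
    steps = []
    if do_convert_rgb:
        steps.append("convert_rgb")
    if do_resize:
        steps.append("resize")
    steps.append("to_tensor")  # assumed to always be present
    if do_rescale:
        steps.append("rescale")
    if do_normalize:
        steps.append("normalize")
    return steps


full = plan_siglip_steps()
minimal = plan_siglip_steps(do_convert_rgb=False, do_resize=False,
                            do_rescale=False, do_normalize=False)
```

Disabling a flag drops its step without changing the relative order of the remaining ones.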
create_transforms_from_siglip_processor
- imgutils.preprocess.transformers.create_transforms_from_siglip_processor(processor)
Creates image transformations from a SigLIP processor configuration.
This function extracts transformation parameters from a HuggingFace SigLIP image processor and creates a corresponding transformation pipeline.
- Parameters:
processor (SiglipImageProcessor) – A HuggingFace SigLIP image processor instance
- Returns:
A composed transformation pipeline
- Return type:
PillowCompose
- Raises:
NotProcessorTypeError – If the processor is not a SiglipImageProcessor