imgutils.generic.yolo

This module provides functionality for YOLO object detection using ONNX models from Hugging Face.

It includes utilities for preprocessing images, performing object detection, and post-processing the results. The main components are:

  1. YOLOModel class: Manages YOLO models from a Hugging Face repository.

  2. Helper functions for coordinate conversion, non-maximum suppression, and image processing.

  3. A high-level function ‘yolo_predict’ for easy object detection on images.

The module supports various image input types and allows customization of confidence and IoU thresholds.
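
To make the roles of the coordinate-conversion helpers and the IoU threshold concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression. The names `xywh_to_xyxy`, `iou`, and `nms` are chosen for this illustration and are assumptions; they are not necessarily the helper names or the exact algorithm this module uses.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # corner form: (x0, y0, x1, y1)


def xywh_to_xyxy(cx: float, cy: float, w: float, h: float) -> Box:
    """Convert a center-based box (cx, cy, w, h) to corner form."""
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-form boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.7) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In this sketch a lower `iou_threshold` suppresses more overlapping boxes, while a higher one lets near-duplicates survive.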

YOLOModel

class imgutils.generic.yolo.YOLOModel(repo_id: str, hf_token: str | None = None)[source]

A class to manage YOLO models from a Hugging Face repository.

This class handles the loading, caching, and inference of YOLO models.

Parameters:
  • repo_id (str) – The Hugging Face repository ID containing the YOLO models.

  • hf_token (Optional[str]) – Optional Hugging Face authentication token.

Example:

>>> from PIL import Image
>>> from imgutils.generic.yolo import YOLOModel
>>> model = YOLOModel("username/repo_name")
>>> image = Image.open("path/to/image.jpg")
>>> detections = model.predict(image, "model_name")

__init__(repo_id: str, hf_token: str | None = None)[source]

Initialize the YOLOModel.

Parameters:
  • repo_id (str) – The Hugging Face repository ID containing the YOLO models.

  • hf_token (Optional[str]) – Optional Hugging Face authentication token.

clear()[source]

Clear cached models and metadata.

This method removes all cached models and their associated metadata from memory. It’s useful for freeing up memory or ensuring that the latest versions of models are loaded.

launch_demo(default_model_name: str | None = None, default_conf_threshold: float = 0.25, default_iou_threshold: float = 0.7, server_name: str | None = None, server_port: int | None = None, **kwargs)[source]

Launch a Gradio demo for object detection.

This method creates and launches a Gradio demo that allows users to interactively perform object detection on uploaded images using the YOLO model.

Parameters:
  • default_model_name (Optional[str]) – The name of the default model to use. If None, the most recently updated model is selected.

  • default_conf_threshold (float) – Default confidence threshold for the demo. Default is 0.25.

  • default_iou_threshold (float) – Default IoU threshold for the demo. Default is 0.7.

  • server_name (Optional[str]) – The hostname or IP address to bind the demo server to, for example "0.0.0.0" to listen on all interfaces. Default is None.

  • server_port (Optional[int]) – The port to run the demo on. Default is None.

  • kwargs – Additional keyword arguments to pass to gr.Blocks.launch().

Raises:

EnvironmentError – If Gradio is not installed in the environment.

Example:

>>> from imgutils.generic.yolo import YOLOModel
>>> model = YOLOModel("username/repo_name")
>>> model.launch_demo(default_model_name="yolov5s", server_name="0.0.0.0", server_port=7860)

make_ui(default_model_name: str | None = None, default_conf_threshold: float = 0.25, default_iou_threshold: float = 0.7)[source]

Create a Gradio-based user interface for object detection.

This method sets up an interactive UI that allows users to upload images, select models, and adjust detection parameters. It uses the Gradio library to create the interface.

Parameters:
  • default_model_name (Optional[str]) – The name of the default model to use. If None, the most recently updated model is selected.

  • default_conf_threshold (float) – Default confidence threshold for the UI. Default is 0.25.

  • default_iou_threshold (float) – Default IoU threshold for the UI. Default is 0.7.

Raises:

ImportError – If Gradio is not installed in the environment.

Example:

>>> from imgutils.generic.yolo import YOLOModel
>>> model = YOLOModel("username/repo_name")
>>> model.make_ui(default_model_name="yolov5s")

predict(image: str | PathLike | bytes | bytearray | BinaryIO | Image, model_name: str, conf_threshold: float = 0.25, iou_threshold: float = 0.7, allow_dynamic: bool = False) → List[Tuple[Tuple[int, int, int, int], str, float]][source]

Perform object detection on an image using the specified YOLO model.

Parameters:
  • image (ImageTyping) – Input image for object detection.

  • model_name (str) – Name of the YOLO model to use.

  • conf_threshold (float) – Confidence threshold for filtering detections. Default is 0.25.

  • iou_threshold (float) – IoU threshold for non-maximum suppression. Default is 0.7.

  • allow_dynamic (bool) – If True, allows dynamic resizing of the image while maintaining aspect ratio. Default is False.

Returns:

List of detections, each in the format ((x0, y0, x1, y1), label, confidence).

Return type:

List[Tuple[Tuple[int, int, int, int], str, float]]

Example:

>>> from PIL import Image
>>> from imgutils.generic.yolo import YOLOModel
>>> model = YOLOModel("username/repo_name")
>>> image = Image.open("path/to/image.jpg")
>>> detections = model.predict(image, "model_name")
>>> print(detections[0])  # First detection
((100, 200, 300, 400), 'person', 0.95)
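
Because each detection is a plain tuple of the form ((x0, y0, x1, y1), label, confidence), results can be post-processed with ordinary Python. The sketch below is illustrative only: `filter_detections` and `box_area` are hypothetical helpers, not part of this module, and `sample` is hand-written data in the documented format rather than real model output.

```python
from typing import List, Tuple

Detection = Tuple[Tuple[int, int, int, int], str, float]


def filter_detections(detections: List[Detection], label: str,
                      min_confidence: float = 0.5) -> List[Detection]:
    """Keep detections with the given label and confidence, best first."""
    selected = [d for d in detections if d[1] == label and d[2] >= min_confidence]
    return sorted(selected, key=lambda d: d[2], reverse=True)


def box_area(box: Tuple[int, int, int, int]) -> int:
    """Area in pixels of a corner-form box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


# Hand-written sample in the documented return format (not real output).
sample: List[Detection] = [
    ((100, 200, 300, 400), "person", 0.95),
    ((10, 20, 50, 80), "person", 0.40),
    ((0, 0, 64, 64), "dog", 0.88),
]

people = filter_detections(sample, "person")
for box, label, confidence in people:
    print(label, confidence, box_area(box))
```

A stricter cut like this can also be applied at inference time by raising `conf_threshold`; filtering afterwards keeps the raw detections available for other uses.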

yolo_predict

imgutils.generic.yolo.yolo_predict(image: str | PathLike | bytes | bytearray | BinaryIO | Image, repo_id: str, model_name: str, conf_threshold: float = 0.25, iou_threshold: float = 0.7, hf_token: str | None = None, **kwargs) → List[Tuple[Tuple[int, int, int, int], str, float]][source]

Perform object detection on an image using a YOLO model from a Hugging Face repository.

This function is a high-level wrapper around the YOLOModel class, providing a simple interface for object detection without needing to explicitly manage model instances.

Parameters:
  • image (ImageTyping) – Input image for object detection.

  • repo_id (str) – The Hugging Face repository ID containing the YOLO models.

  • model_name (str) – Name of the YOLO model to use.

  • conf_threshold (float) – Confidence threshold for filtering detections. Default is 0.25.

  • iou_threshold (float) – IoU threshold for non-maximum suppression. Default is 0.7.

  • hf_token (Optional[str]) – Optional Hugging Face authentication token.

Returns:

List of detections, each in the format ((x0, y0, x1, y1), label, confidence).

Return type:

List[Tuple[Tuple[int, int, int, int], str, float]]

Example:

>>> from PIL import Image
>>> from imgutils.generic.yolo import yolo_predict
>>> image = Image.open("path/to/image.jpg")
>>> detections = yolo_predict(image, "username/repo_name", "model_name")
>>> print(detections[0])  # First detection
((100, 200, 300, 400), 'person', 0.95)