imgutils.detect.person

Overview:

This module detects human bodies, covering the entire body, in anime images. It uses YOLOv8 models trained on the AniDet3 dataset from Roboflow.

[Figure: person detection demo]

The main entry point is detect_person(), which delegates the actual prediction to the yolo_predict function from the generic module.

The module supports different model levels and versions, allowing users to choose between speed and accuracy based on their requirements.

This is an overall benchmark of all the person detection models:

[Figure: person detection benchmark]

detect_person

Detect human bodies (including the entire body) in anime images.

This function uses YOLOv8 models to detect human bodies in anime-style images. It supports different model levels and versions, allowing users to balance between detection speed and accuracy.

Parameters:

image (ImageTyping) – The input image for detection. Can be various types as defined by ImageTyping.
level (str) – The model level to use. Options are ‘n’, ‘s’, ‘m’, or ‘x’. ‘n’ is the fastest but least accurate; ‘x’ is the most accurate but slowest.
version (str) – The version of the model to use. Available versions are ‘v0’, ‘v1’, and ‘v1.1’.
model_name (Optional[str]) – Optional custom model name. If provided, overrides the auto-generated model name.
conf_threshold (float) – Confidence threshold for detections. Only detections with confidence above this value are returned.
iou_threshold (float) – Intersection over Union (IoU) threshold for non-maximum suppression.
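
To make the iou_threshold parameter more concrete, here is a minimal sketch of how IoU is computed for two boxes in the (x0, y0, x1, y1) format used by this module. The helper name box_iou is hypothetical and is not part of imgutils. In standard non-maximum suppression, a box whose IoU with a higher-scoring box exceeds the threshold is discarded as a duplicate; the library's internal implementation may differ in detail.

```python
from typing import Tuple

# (x0, y0, x1, y1), matching the box format in the detection results
Box = Tuple[int, int, int, int]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes (hypothetical helper)."""
    # Corners of the overlapping rectangle, if there is one
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Identical boxes give an IoU of 1.0 and disjoint boxes give 0.0, so a lower iou_threshold suppresses overlapping detections more aggressively.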

Returns:

A list of detection results. Each result is a tuple ((x0, y0, x1, y1), ‘person’, confidence_score), where (x0, y0, x1, y1) is the bounding box in pixel coordinates.

Return type:

List[Tuple[Tuple[int, int, int, int], str, float]]
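
As an illustration of working with this return type, the sketch below post-processes a list of detections without calling the library. The sample data is copied from the example output in this document; the helpers box_area and filter_detections are hypothetical names, not imgutils functions.

```python
from typing import List, Tuple

Detection = Tuple[Tuple[int, int, int, int], str, float]

# Sample results in the documented format (taken from the example output)
detections: List[Detection] = [
    ((371, 232, 564, 690), 'person', 0.7533698678016663),
    ((30, 135, 451, 716), 'person', 0.6788613796234131),
    ((614, 393, 830, 686), 'person', 0.5612757205963135),
    ((614, 3, 1275, 654), 'person', 0.4047100841999054),
]


def box_area(box: Tuple[int, int, int, int]) -> int:
    """Area in pixels of an (x0, y0, x1, y1) box (hypothetical helper)."""
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)


def filter_detections(dets: List[Detection], min_conf: float) -> List[Detection]:
    """Keep only detections whose confidence is at least min_conf."""
    return [d for d in dets if d[2] >= min_conf]


# Keep the more confident detections, and find the largest detected box
confident = filter_detections(detections, 0.6)
largest = max(detections, key=lambda d: box_area(d[0]))
```

Filtering after the call like this is an alternative to raising conf_threshold, and it lets the same detection run be reused with several cut-offs.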

Raises:

ValueError – If an invalid level or version is provided.
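
If level or version comes from user input, it can be checked up front rather than relying on the exception. The following is only a sketch: the allowed values are the ones listed in the parameter descriptions above, and check_model_choice is a hypothetical helper that mirrors the documented behaviour rather than the library's own validation code.

```python
# Allowed values as listed in this document's parameter descriptions
VALID_LEVELS = ('n', 's', 'm', 'x')
VALID_VERSIONS = ('v0', 'v1', 'v1.1')


def check_model_choice(level: str, version: str) -> None:
    """Raise ValueError for an unsupported level or version (hypothetical helper)."""
    if level not in VALID_LEVELS:
        raise ValueError(
            f'Invalid level {level!r}, expected one of {VALID_LEVELS}.'
        )
    if version not in VALID_VERSIONS:
        raise ValueError(
            f'Invalid version {version!r}, expected one of {VALID_VERSIONS}.'
        )
```

A caller could run this check before passing the values on to detect_person(), so that a bad configuration fails before any model is loaded.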

Example:

>>> from imgutils.detect import detect_person, detection_visualize
>>> image = 'genshin_post.jpg'
>>> result = detect_person(image)
>>> print(result)
[
    ((371, 232, 564, 690), 'person', 0.7533698678016663),
    ((30, 135, 451, 716), 'person', 0.6788613796234131),
    ((614, 393, 830, 686), 'person', 0.5612757205963135),
    ((614, 3, 1275, 654), 'person', 0.4047100841999054)
]

Note

For visualization of results, you can use the imgutils.detect.visual.detection_visualize() function.