imgutils.tagging.camie

Overview:

This module provides functionality for image tagging using the Camie Tagger model, which can identify over 70,000 tags in images. The implementation is based on the Camais03/camie-tagger project, with ONNX optimizations available at deepghs/camie_tagger_onnx.

Note

The tagger categorizes tags into multiple types, including rating, general, character, year, meta, artist, and copyright. While rating, general, and character tags tend to be accurate, the other tag types (year, meta, artist, copyright) have shown limited accuracy in testing and are not included in the default output.

get_camie_tags

imgutils.tagging.camie.get_camie_tags(image: str | PathLike | bytes | bytearray | BinaryIO | Image, model_name: str = 'initial', mode: Literal['balanced', 'high_precision', 'high_recall', 'micro_opt', 'macro_opt'] = 'balanced', thresholds: float | Dict[str, float] | None = None, no_underline: bool = False, drop_overlap: bool = False, fmt: Any = ('rating', 'general', 'character'))[source]

Extract tags from an image using the Camie model.

Parameters:
  • image (ImageTyping) – Input image (can be path, URL, or image data).

  • model_name (str) – Name of the Camie model to use.

  • mode (CamieModeTyping) – Prediction mode affecting threshold values.

  • thresholds (Optional[Union[float, Dict[str, float]]]) – Custom thresholds for tag selection.

  • no_underline (bool) – Whether to remove underscores from tag names. Default is False.

  • drop_overlap (bool) – Whether to remove overlapping tags. Default is False.

  • fmt (Any) – Format specification for output. Default is ('rating', 'general', 'character').

Returns:

Extracted tags and/or embeddings, formatted according to fmt.

Return type:

Any

Note

Available prediction modes (see the example sketch after this list):

  • balanced: Balanced precision/recall

  • high_precision: Higher precision thresholds

  • high_recall: Higher recall thresholds

  • micro_opt: Micro-optimized thresholds

  • macro_opt: Macro-optimized thresholds
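
For example, a stricter mode or a manual threshold can be selected per call. The sketch below only uses parameters documented above; 'image.jpg' is a placeholder file name:

>>> from imgutils.tagging import get_camie_tags
>>>
>>> # stricter thresholds generally yield fewer, more confident tags
>>> rating, general, character = get_camie_tags('image.jpg', mode='high_precision')
>>>
>>> # or supply a custom threshold for tag selection as a single float
>>> rating, general, character = get_camie_tags('image.jpg', thresholds=0.5)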

Note

The fmt argument can include the following keys (an example follows the list):

  • rating: a dict containing ratings and their confidences

  • general: a dict containing general tags and their confidences

  • character: a dict containing character tags and their confidences

  • copyright: a dict containing copyright tags and their confidences

  • artist: a dict containing artist tags and their confidences

  • meta: a dict containing meta tags and their confidences

  • year: a dict containing year tags and their confidences

  • tag: a dict containing all tags (general, character, copyright, artist, meta, and year; ratings are not included) and their confidences

  • embedding: a 1-dim embedding of image, recommended for index building after L2 normalization

  • prediction: a 1-dim prediction result of image
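
For instance, extra categories can be requested simply by listing their keys in fmt. This is a sketch with a placeholder file name; as the warning below notes, the year/meta/artist/copyright categories are less reliable:

>>> from imgutils.tagging import get_camie_tags
>>>
>>> # request extra categories beyond the default ('rating', 'general', 'character')
>>> rating, general, character, meta, year = get_camie_tags(
...     'image.jpg',
...     fmt=('rating', 'general', 'character', 'meta', 'year'),
... )
>>>
>>> # or collect all non-rating tags in a single dict
>>> all_tags = get_camie_tags('image.jpg', fmt='tag')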

You can extract the embedding of a given image with the following code:

>>> from imgutils.tagging import get_camie_tags
>>>
>>> embedding = get_camie_tags('skadi.jpg', fmt='embedding')
>>> embedding.shape
(1280,)

This embedding is valuable for constructing indices that enable rapid querying of images based on visual features within large-scale datasets.
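
A minimal sketch of that workflow with plain numpy follows; the file names are placeholders, and a production index would typically use a dedicated library such as faiss:

>>> import numpy as np
>>> from imgutils.tagging import get_camie_tags
>>>
>>> # build a small index of L2-normalized embeddings
>>> files = ['img1.jpg', 'img2.jpg', 'img3.jpg']
>>> embs = np.stack([get_camie_tags(f, fmt='embedding') for f in files])
>>> embs = embs / np.linalg.norm(embs, axis=-1, keepdims=True)
>>>
>>> # query with another image; after normalization, cosine similarity is a dot product
>>> query = get_camie_tags('query.jpg', fmt='embedding')
>>> query = query / np.linalg.norm(query)
>>> scores = embs @ query
>>> best_match = files[int(np.argmax(scores))]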

Warning

In our testing, the other tag types (year, meta, artist, copyright) have shown limited accuracy. Artist tags in particular perform only marginally better than random guessing (np.random.randn), so these tag types are not included in the default fmt.

Example:

Here are some example images:

(figure: ../../_images/tagging_demo.plot.py.svg)
>>> rating, features, chars = get_camie_tags('skadi.jpg')
>>> rating
{'general': 0.04246556758880615, 'sensitive': 0.6936423778533936, 'questionable': 0.23721203207969666, 'explicit': 0.033293724060058594}
>>> features
{'1girl': 0.8412569165229797, 'blush': 0.38029077649116516, 'breasts': 0.618192195892334, 'cowboy_shot': 0.37446439266204834, 'large_breasts': 0.5698797702789307, 'long_hair': 0.7119565010070801, 'looking_at_viewer': 0.5252856612205505, 'shirt': 0.46417444944381714, 'solo': 0.5428758859634399, 'standing': 0.34731733798980713, 'tail': 0.3911612927913666, 'thigh_gap': 0.2932726740837097, 'thighs': 0.4544200003147125, 'very_long_hair': 0.44711941480636597, 'ass': 0.2854885458946228, 'outdoors': 0.6344638466835022, 'red_eyes': 0.611354410648346, 'day': 0.564970850944519, 'hair_between_eyes': 0.4444340467453003, 'holding': 0.35846662521362305, 'parted_lips': 0.3867686092853546, 'blue_sky': 0.3723931908607483, 'cloud': 0.31086698174476624, 'short_sleeves': 0.43279752135276794, 'sky': 0.3896197974681854, 'gloves': 0.6638736724853516, 'grey_hair': 0.5094802975654602, 'sweat': 0.4867050349712372, 'navel': 0.6593714952468872, 'crop_top': 0.5243107676506042, 'shorts': 0.4374789893627167, 'artist_name': 0.3754707872867584, 'midriff': 0.6238733530044556, 'ass_visible_through_thighs': 0.31088054180145264, 'gym_uniform': 0.37657681107521057, 'black_shirt': 0.3012588620185852, 'watermark': 0.5147127509117126, 'web_address': 0.6296812295913696, 'short_shorts': 0.29214906692504883, 'black_shorts': 0.37801358103752136, 'buruma': 0.536261260509491, 'bike_shorts': 0.35828399658203125, 'black_gloves': 0.4156728982925415, 'sportswear': 0.44427722692489624, 'baseball_bat': 0.2838006019592285, 'crop_top_overhang': 0.49192047119140625, 'stomach': 0.36012423038482666, 'black_buruma': 0.3422132134437561, 'official_alternate_costume': 0.2783987522125244, 'baseball': 0.38377970457077026, 'baseball_mitt': 0.32592540979385376, 'cropped_shirt': 0.35402947664260864, 'holding_baseball_bat': 0.2758416533470154, 'black_sports_bra': 0.3463800549507141, 'sports_bra': 0.28466159105300903, 'exercising': 0.2603980302810669, 'bike_jersey': 0.2661605477333069, 'patreon_username': 0.7087235450744629, 'patreon_logo': 0.560276210308075}
>>> chars
{'skadi_(arknights)': 0.5921452641487122}
>>>
>>> rating, features, chars = get_camie_tags('hutao.jpg')
>>> rating
{'general': 0.41121846437454224, 'sensitive': 0.4002530574798584, 'questionable': 0.03438958525657654, 'explicit': 0.04617959260940552}
>>> features
{'1girl': 0.8312125205993652, 'blush': 0.3996567726135254, 'cowboy_shot': 0.28660568594932556, 'long_hair': 0.7184156775474548, 'long_sleeves': 0.4706878066062927, 'looking_at_viewer': 0.5503140687942505, 'school_uniform': 0.365602970123291, 'shirt': 0.41183334589004517, 'sidelocks': 0.28638553619384766, 'smile': 0.3707748055458069, 'solo': 0.520854115486145, 'standing': 0.2960333526134491, 'tongue': 0.6556028127670288, 'tongue_out': 0.6966925859451294, 'very_long_hair': 0.5526134371757507, 'skirt': 0.6872812509536743, 'brown_hair': 0.5945607423782349, 'hair_ornament': 0.4464661478996277, 'hair_ribbon': 0.3646523952484131, 'outdoors': 0.37938451766967773, 'red_eyes': 0.5426545143127441, 'ribbon': 0.3027467727661133, 'bag': 0.8986430168151855, 'hair_between_eyes': 0.337802529335022, 'holding': 0.38589367270469666, 'pleated_skirt': 0.6475872993469238, 'school_bag': 0.666648805141449, 'ahoge': 0.4749193489551544, 'white_shirt': 0.27104783058166504, 'closed_mouth': 0.28101325035095215, 'collared_shirt': 0.37030768394470215, 'miniskirt': 0.32576680183410645, ':p': 0.4337637424468994, 'alternate_costume': 0.42441293597221375, 'black_skirt': 0.34694597125053406, 'twintails': 0.5711237192153931, 'open_clothes': 0.31017544865608215, 'nail_polish': 0.534726083278656, 'jacket': 0.4544385075569153, 'open_jacket': 0.27831193804740906, 'flower': 0.45064714550971985, 'plaid_clothes': 0.5494365096092224, 'plaid_skirt': 0.610480546951294, 'red_flower': 0.35928308963775635, 'contemporary': 0.37732189893722534, 'backpack': 0.5575172305107117, 'fingernails': 0.27776333689689636, 'cardigan': 0.3264558017253876, 'blue_jacket': 0.31882336735725403, 'ghost': 0.5534622073173523, 'red_nails': 0.38771501183509827, ':q': 0.3758758008480072, 'hair_flower': 0.39574217796325684, 'charm_(object)': 0.5394986271858215, 'handbag': 0.37014907598495483, 'black_bag': 0.44918346405029297, 'shoulder_bag': 0.5881174802780151, 'symbol-shaped_pupils': 0.5163478255271912, 'blue_cardigan': 0.28089386224746704, 'black_nails': 0.42480990290641785, 'bag_charm': 0.5010414123535156, 'plum_blossoms': 0.27618563175201416, 'flower-shaped_pupils': 0.5317837595939636}
>>> chars
{'hu_tao_(genshin_impact)': 0.8859397172927856, 'boo_tao_(genshin_impact)': 0.7348971366882324}

convert_camie_emb_to_prediction

imgutils.tagging.camie.convert_camie_emb_to_prediction(emb: ndarray, model_name: str = 'initial', is_refined: bool = True, mode: Literal['balanced', 'high_precision', 'high_recall', 'micro_opt', 'macro_opt'] = 'balanced', thresholds: float | Dict[str, float] | None = None, no_underline: bool = False, drop_overlap: bool = False, fmt: Any = ('rating', 'general', 'character'))[source]

Convert stored embeddings back to tag predictions.

Useful for reprocessing existing embeddings with new thresholds or formats.

Parameters:
  • emb (np.ndarray) – Embedding vector(s) from previous inference

  • model_name (str) – Original model variant name

  • is_refined (bool) – Whether the embeddings come from the refined stage; otherwise they are treated as coming from the initial stage

  • mode (CamieModeTyping) – Threshold selection strategy

  • thresholds (Optional[Union[float, Dict[str, float]]]) – Custom threshold values

  • no_underline (bool) – Remove underscores from tag names

  • drop_overlap (bool) – Remove overlapping tags in general category

  • fmt (Any) – Output format specification

Returns:

Formatted results matching the original prediction format

Return type:

Any

Note

Available prediction modes:

  • balanced: Balanced precision/recall

  • high_precision: Higher precision thresholds

  • high_recall: Higher recall thresholds

  • micro_opt: Micro-optimized thresholds

  • macro_opt: Macro-optimized thresholds

For batch processing (2-dim input), a list is returned where each element corresponds to one embedding's predictions, in the same format as the single-embedding output.

Example:
>>> import numpy as np
>>> from imgutils.tagging import get_camie_tags, convert_camie_emb_to_prediction
>>>
>>> # extract the feature embedding, shape: (W, )
>>> embedding = get_camie_tags('skadi.jpg', fmt='embedding')
>>>
>>> # convert to understandable result
>>> rating, general, character = convert_camie_emb_to_prediction(embedding)
>>> # these 3 dicts will be the same as that returned by `get_camie_tags('skadi.jpg')`
>>>
>>> # Batch processing, shape: (B, W)
>>> embeddings = np.stack([
...     get_camie_tags('img1.jpg', fmt='embedding'),
...     get_camie_tags('img2.jpg', fmt='embedding'),
... ])
>>> # results will be a list of (rating, general, character) tuples
>>> results = convert_camie_emb_to_prediction(embeddings)
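
Since thresholds are applied at conversion time, stored embeddings can also be re-evaluated with a different mode or custom thresholds without rerunning the model. A sketch, continuing from the embedding computed above:

>>> # reprocess the same embedding with stricter thresholds
>>> rating, general, character = convert_camie_emb_to_prediction(
...     embedding, mode='high_precision',
... )
>>>
>>> # or with a flat custom threshold and underscores stripped from tag names
>>> rating, general, character = convert_camie_emb_to_prediction(
...     embedding, thresholds=0.5, no_underline=True,
... )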