Module vision

Vision model captioning for images.

This module provides infrastructure for generating image captions with vision models. When the vision feature is enabled, a BLIP-based captioner (BlipCaptioner) is available. Without it, only a placeholder implementation (PlaceholderCaptioner) exists, and it returns no captions.
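The feature-gated behaviour described above could be wired up roughly as follows. This is a minimal sketch, not the crate's code: the trait `Captioner`, the types `Placeholder` and `ModelBacked`, the `caption` method signature, and the `default_captioner` helper are all hypothetical names invented for illustration. Only the feature name `vision` comes from the description.

```rust
// Hypothetical sketch: none of these names or signatures are taken from the crate.

/// Stand-in for a captioner trait.
pub trait Captioner {
    /// Returns a caption for encoded image bytes, or `None` if none is available.
    fn caption(&self, image_bytes: &[u8]) -> Option<String>;
}

/// Fallback that is always compiled in and never produces a caption.
pub struct Placeholder;

impl Captioner for Placeholder {
    fn caption(&self, _image_bytes: &[u8]) -> Option<String> {
        None
    }
}

/// Model-backed captioner, compiled only when the `vision` feature is on.
#[cfg(feature = "vision")]
pub struct ModelBacked;

#[cfg(feature = "vision")]
impl Captioner for ModelBacked {
    fn caption(&self, _image_bytes: &[u8]) -> Option<String> {
        // A real implementation would preprocess the image and run inference here.
        Some(String::from("caption produced by a model"))
    }
}

/// Picks the captioner at compile time, so callers never need their own `cfg`.
#[cfg(feature = "vision")]
pub fn default_captioner() -> Box<dyn Captioner> {
    Box::new(ModelBacked)
}

#[cfg(not(feature = "vision"))]
pub fn default_captioner() -> Box<dyn Captioner> {
    Box::new(Placeholder)
}
```

Keeping the `cfg` switches inside one constructor means downstream code can call `default_captioner()` unconditionally and simply treat a missing caption as the no-model case.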

Structs

BlipCaptioner
BLIP-based image captioner using Candle.
CaptionConfig
Configuration for vision captioning.
PlaceholderCaptioner
Placeholder vision captioner that returns no captions.

Enums

CaptionError
Error type for vision captioning operations.

Constants

BLIP_IMAGE_SIZE 🔒
Image size for BLIP preprocessing.
BLIP_MODEL_ID 🔒
BLIP model identifier on HuggingFace Hub.

Traits

ImageCaptioner
Trait for vision-based image captioning.
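A captioning trait paired with a dedicated error enum lets calling code stay independent of the concrete model. The sketch below illustrates that pattern under stated assumptions: `DemoCaptioner`, `DemoError` and its variants, `FixedCaptioner`, and `alt_text` are hypothetical and are not the signatures of ImageCaptioner or CaptionError, which this page does not show.

```rust
use std::fmt;

// Hypothetical error enum; the real CaptionError variants are not documented here.
#[derive(Debug, PartialEq)]
pub enum DemoError {
    /// The caller passed no image data.
    EmptyImage,
    /// The underlying model reported a failure.
    Model(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::EmptyImage => write!(f, "image data is empty"),
            DemoError::Model(msg) => write!(f, "model error: {msg}"),
        }
    }
}

impl std::error::Error for DemoError {}

/// Hypothetical captioning trait: `Ok(None)` means "no caption available".
pub trait DemoCaptioner {
    fn caption(&self, image_bytes: &[u8]) -> Result<Option<String>, DemoError>;
}

/// Test double that always answers with the same caption.
pub struct FixedCaptioner(pub &'static str);

impl DemoCaptioner for FixedCaptioner {
    fn caption(&self, image_bytes: &[u8]) -> Result<Option<String>, DemoError> {
        if image_bytes.is_empty() {
            return Err(DemoError::EmptyImage);
        }
        Ok(Some(self.0.to_string()))
    }
}

/// Caller written against the trait only, so any captioner can be plugged in.
pub fn alt_text(captioner: &dyn DemoCaptioner, image_bytes: &[u8]) -> String {
    match captioner.caption(image_bytes) {
        Ok(Some(text)) => text,
        Ok(None) => String::from("image"),
        Err(err) => format!("image (caption failed: {err})"),
    }
}
```

Separating "no caption" (`Ok(None)`) from "captioning failed" (`Err`) is one possible design; it lets a placeholder captioner succeed silently while real failures still surface to the caller.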