Version: 2025.1

Hugging Face Image To Text

This action runs at the asset level. It sends the selected assets to a configurable Hugging Face endpoint, which extracts text from the images, and then stores the extracted text in each asset's metadata.

Available Context

For more information on context limitations, refer to the Available Context section.

  • Multiselect: Yes
  • Elements: Images

Configuration Options

# Name of the metadata field in which the extracted text will be stored.
meta_data_field_name: 'extracted_text'

# Language of the metadata field
meta_data_field_language: 'en'

# Name of a thumbnail configuration to be used.
asset_thumbnail_configuration_name: 'content'

# Endpoint URL of the Hugging Face model to use.
model_endpoint: 'https://api-inference.huggingface.co/models/Salesforce/blip-image-captioning-base'
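
To see what kind of response the configured endpoint returns, it can help to call it directly outside the action. The following is a minimal sketch, not the action's actual implementation: it assumes the endpoint accepts the raw image bytes in a POST request with a bearer token, and that it answers with a JSON list whose first item has a `generated_text` key, which is the usual shape for hosted image-to-text models. The function names, the token parameter, and the error handling are illustrative assumptions.

```python
import json
import urllib.request

# Endpoint taken from the example configuration above.
MODEL_ENDPOINT = (
    "https://api-inference.huggingface.co/models/"
    "Salesforce/blip-image-captioning-base"
)


def build_request(image_bytes: bytes, token: str,
                  endpoint: str = MODEL_ENDPOINT) -> urllib.request.Request:
    """Build a POST request carrying the raw image bytes (hypothetical helper)."""
    return urllib.request.Request(
        endpoint,
        data=image_bytes,
        headers={
            # Assumption: the endpoint expects a Hugging Face access token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


def parse_generated_text(response_body: bytes) -> str:
    """Extract the text from an assumed [{"generated_text": "..."}] response."""
    payload = json.loads(response_body)
    # Assumption: failures are reported as a JSON object with an "error" key.
    if isinstance(payload, dict) and "error" in payload:
        raise RuntimeError(f"endpoint returned an error: {payload['error']}")
    if not isinstance(payload, list) or not payload:
        raise ValueError("unexpected response shape")
    return payload[0]["generated_text"].strip()


def extract_text(image_path: str, token: str) -> str:
    """Send one image file to the endpoint and return the extracted text."""
    with open(image_path, "rb") as handle:
        image_bytes = handle.read()
    request = build_request(image_bytes, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_generated_text(response.read())
```

Calling `extract_text("thumbnail.jpg", token)` with a valid token would return the string that the action would then write to the field named in `meta_data_field_name`. Building the request and parsing the response are kept separate so that each can be checked without network access.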

Detailed Configuration Options

  • meta_data_field_name: Required string. Name of the metadata field in which the extracted text will be stored. If the field does not exist, it will be created.
  • meta_data_field_language: Optional string. Language of the metadata field in which the extracted text will be stored.
  • asset_thumbnail_configuration_name: Optional string. Name of a thumbnail configuration to be used. Instead of the original image, the thumbnail will be sent to the endpoint.
  • model_endpoint: Required string. Endpoint URL of the Hugging Face model to use.
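
The required and optional rules listed above can be checked before a configuration is saved. The sketch below assumes the configuration has already been loaded into a Python dictionary. The `validate_config` helper, the URL check, and the rejection of unknown keys are illustrative assumptions rather than behavior the action is documented to have.

```python
from urllib.parse import urlparse

# Option names and their required/optional status as listed above.
REQUIRED_OPTIONS = ("meta_data_field_name", "model_endpoint")
OPTIONAL_OPTIONS = ("meta_data_field_language",
                    "asset_thumbnail_configuration_name")


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    errors = []

    # Required options must be present and be non-empty strings.
    for key in REQUIRED_OPTIONS:
        value = config.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{key}: required non-empty string")

    # Optional options may be omitted, but must be strings when set.
    for key in OPTIONAL_OPTIONS:
        value = config.get(key)
        if value is not None and not isinstance(value, str):
            errors.append(f"{key}: must be a string when set")

    # Assumption: the endpoint should be an absolute http(s) URL.
    endpoint = config.get("model_endpoint")
    if isinstance(endpoint, str) and endpoint.strip():
        parsed = urlparse(endpoint)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append("model_endpoint: must be an absolute http(s) URL")

    # Assumption: keys outside the documented set are treated as mistakes.
    known = set(REQUIRED_OPTIONS) | set(OPTIONAL_OPTIONS)
    for key in sorted(set(config) - known):
        errors.append(f"{key}: unknown option")

    return errors


example = {
    "meta_data_field_name": "extracted_text",
    "meta_data_field_language": "en",
    "asset_thumbnail_configuration_name": "content",
    "model_endpoint": "https://api-inference.huggingface.co/models/"
                      "Salesforce/blip-image-captioning-base",
}
problems = validate_config(example)
```

Returning a list of messages instead of raising on the first problem lets all issues in a configuration be reported at once.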