Caption Anything: Interactive Image Description with Diverse M | Machinelearning

Caption Anything: Interactive Image Description with Diverse Multimodal Controls

Caption-Anything is a versatile tool that combines image segmentation, visual captioning, and ChatGPT to generate captions tailored to user preferences through diverse controls.

A universal tool for working with images that combines Visual Captioning, SAM, and ChatGPT. The model generates descriptive captions for any object in an image.

Github: https://github.com/ttengwang/caption-anything

Paper: https://arxiv.org/abs/2305.02677v1

Dataset: https://paperswithcode.com/dataset/cityscapes-3d

Colab: https://colab.research.google.com/github/ttengwang/Caption-Anything/blob/main/notebooks/tutorial.ipynb

ai_machinelearning_big_data

