[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
Updated Sep 28, 2024 - Python
Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions (NeurIPS 2024)
A Stable Diffusion model fine-tuned on a self-translated 10k DiffusionDB Chinese corpus and an "extended" version of it
A lightweight neural network for controlling spatial information in Stable Diffusion, tuned on Chinese data
Use CLIP to create matching texts + embeddings for given images; useful for XAI, adversarial training
lmmtoolkit is a toolkit for Multi-Modal Learning
A small script for CLIP attn entropy plots
A PyTorch implementation of "TextFuseNet: Scene Text Detection with Richer Fused Features".
To Fuse Semantic and Positional Clues with Cross-Attention for Scene Text Recognition
Multi-Modal Image Generation for News Stories
Text-Image-Text is a bidirectional system that enables seamless retrieval of images based on text descriptions, and vice versa. It leverages state-of-the-art language and vision models to bridge the gap between textual and visual representations.