SwiftAnnotate is a comprehensive auto-labeling tool designed for Text, Image, and Video data. It leverages state-of-the-art (SOTA) Vision Language Models (VLMs) and Large Language Models (LLMs) through a robust annotator-validator pipeline, ensuring high-quality, grounded annotations while minimizing hallucinations. SwiftAnnotate also supports annotation tasks like Object Detection and Segmentation through SOTA CV models like SAM2, YOLOWorld, and OWL-ViT.
- Text Processing 📝: Perform classification, summarization, and text generation with state-of-the-art NLP models. Solve real-world problems like spam detection, sentiment analysis, and content creation.
- Image Analysis 🖼️: Generate captions for images to provide meaningful descriptions. Classify images into predefined categories with high precision. Detect objects in images using models like YOLOWorld. Achieve pixel-perfect segmentation with SAM2 and OWL-ViT.
- Video Processing 🎥: Generate captions for videos with frame-level analysis and temporal understanding. Understand video content by detecting scenes and actions effortlessly.
- Quality Assurance ✅: Use a two-stage pipeline for annotation and validation to ensure high data quality. Validate outputs rigorously to maintain reliability before deployment.
- Multi-modal Support 🌐: Seamlessly process text, images, and videos within a unified framework. Combine data types for powerful multi-modal insights and applications.
- Customization 🛠️: Easily extend and adapt the framework to suit specific project needs. Integrate new models and tasks seamlessly with modular architecture.
- Developer-Friendly 👩💻👨💻: Easy-to-use package and detailed documentation to get started quickly.
To install SwiftAnnotate from PyPI and set up the project environment, follow these steps:
- Install from PyPI: Run the following command to install the package directly:
pip install swiftannotate
- For Development (Using Poetry): If you want to contribute or explore the project codebase, ensure you have Poetry installed. Follow the steps given below:
git clone https://github.com/yasho191/SwiftAnnotate
cd SwiftAnnotate
poetry install
You're now ready to explore and develop SwiftAnnotate!
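As a quick sanity check after installing, you can confirm the package is importable and query its installed version through Python's standard packaging metadata (a generic check, not a SwiftAnnotate-specific command):
# Verify the installation: import the package and print the installed distribution version
import importlib.metadata
import swiftannotate  # should import without errors
print(importlib.metadata.version("swiftannotate"))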
The annotator-validator pipeline ensures high-quality annotations through a two-stage process:
Stage 1: Annotation
- Primary LLM/VLM generates initial annotations
- Configurable model selection (OpenAI, Google Gemini, Anthropic, Mistral, Qwen-VL)
Stage 2: Validation
- Secondary model validates initial annotations
- Cross-checks for hallucinations and factual accuracy
- Provides confidence scores and correction suggestions
- Option to regenerate annotations if validation fails
- Structured output format for consistency
Benefits
- Reduced hallucinations through two-stage verification
- Higher annotation quality and consistency
- Automated quality control
- Traceable annotation process
The pipeline can be customized with different model combinations and validation thresholds based on specific use cases.
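For illustration, the control flow of such a pipeline can be sketched as below. This is a minimal, hypothetical sketch and not SwiftAnnotate's internal API: the annotate and validate callables, the 0.8 confidence threshold, and the retry count are assumptions chosen for clarity.
# Hypothetical sketch of an annotator-validator loop (not the library's actual API)
def run_annotation_pipeline(sample, annotate, validate, threshold=0.8, max_retries=2):
    """Generate an annotation, validate it, and regenerate if validation fails."""
    feedback = None
    for attempt in range(1, max_retries + 2):
        annotation = annotate(sample, feedback=feedback)     # Stage 1: primary LLM/VLM annotates
        confidence, feedback = validate(sample, annotation)  # Stage 2: secondary model cross-checks
        if confidence >= threshold:                          # accept once the validator is confident
            return {"annotation": annotation, "confidence": confidence, "attempts": attempt}
    # Validation never passed: keep the last annotation but flag it for manual review
    return {"annotation": annotation, "confidence": confidence, "attempts": attempt, "needs_review": True}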
Currently, we support OpenAI, Google Gemini, Ollama, and Qwen2-VL for image captioning. As Qwen2-VL is not yet available on Ollama, it is supported through Hugging Face. To get started quickly, refer to the code snippets shown below.
OpenAI
import os
from swiftannotate.image import OpenAIForImageCaptioning
caption_model = "gpt-4o"
validation_model = "gpt-4o-mini"
api_key = "<YOUR_OPENAI_API_KEY>"
BASE_DIR = "<IMAGE_DIR>"
image_paths = [os.path.join(BASE_DIR, image) for image in os.listdir(BASE_DIR)]
image_captioning_pipeline = OpenAIForImageCaptioning(
    caption_model=caption_model,
    validation_model=validation_model,
    api_key=api_key,
    output_file="image_captioning_output.json"
)
results = image_captioning_pipeline.generate(image_paths=image_paths)
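The annotations are also written to the output_file given above; assuming that file is standard JSON (as its extension suggests), you can reload and inspect the records like this (no particular field names are assumed):
# Reload the saved annotations and print each record for a quick manual review
import json

with open("image_captioning_output.json") as f:
    saved_annotations = json.load(f)

for record in saved_annotations:
    print(record)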
Qwen2-VL
You can use any variant of Qwen2-VL (7B, 72B) depending on the available resources. vLLM inference is not currently supported, but it will be available soon.
import os
from transformers import AutoProcessor, AutoModelForImageTextToText
from transformers import BitsAndBytesConfig
from swiftannotate.image import Qwen2VLForImageCaptioning
# Load the images
BASE_DIR = "<IMAGE_DIR>"
image_paths = [os.path.join(BASE_DIR, image) for image in os.listdir(BASE_DIR)]
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=True
)
model = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype="auto",
    quantization_config=quantization_config
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
# Load the Caption Model
captioning_pipeline = Qwen2VLForImageCaptioning(
    model=model,
    processor=processor,
    output_file="image_captioning_output.json"
)
results = captioning_pipeline.generate(image_paths)
We welcome contributions to SwiftAnnotate! There are several ways you can help improve the project:
- Enhanced Prompts: Contribute better validation and annotation prompts for improved accuracy
- File Support: Add support for additional input/output file formats
- Cloud Integration: Implement AWS S3 storage support and other cloud services
- Validation Strategies: Develop new validation approaches for different annotation tasks
- Model Support: Integrate additional LLMs and VLMs
- Documentation: Improve guides and examples
Please submit a pull request with your contributions or open an issue to discuss new features.