  • BUPT
  • Beijing

Highlights

  • Pro


EVE Series: Encoder-Free Vision-Language Models from BAAI

Python 285 5 Updated Feb 11, 2025
Python 17 Updated Oct 18, 2024

Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding

Python 142 7 Updated Jan 24, 2025

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Python 549 35 Updated Feb 8, 2025
Python 15 4 Updated Aug 9, 2024

[NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

Python 40 4 Updated Mar 25, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,423 358 Updated Feb 8, 2025

PyTorch Implementation of "SMITE: Segment Me In TimE"

201 10 Updated Oct 25, 2024

Code for the AAAI 2025 paper "VIoTGPT: Learning to Schedule Vision Tools in LLMs towards Intelligent Video Internet of Things"

Python 11 3 Updated Jan 16, 2025

An official implementation of "Hulk: A Universal Knowledge Translator for Human-Centric Tasks"

Python 116 4 Updated Dec 4, 2024

DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding

Python 849 36 Updated Jan 21, 2025

Next-Token Prediction is All You Need

Python 1,987 78 Updated Oct 24, 2024

High-resolution models for human tasks.

Python 4,802 283 Updated Nov 18, 2024

Free, simple, and intuitive online database diagram editor and SQL generator.

JavaScript 24,035 1,701 Updated Feb 5, 2025

[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild

Python 4,127 456 Updated Nov 18, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 14,001 1,418 Updated Dec 25, 2024

A programming language exclusively designed for cybersecurity

Go 427 50 Updated Feb 12, 2025

Cyber Security ALL-IN-ONE Platform

TypeScript 6,153 719 Updated Feb 12, 2025

RTMPose series (RTMPose, DWPose, RTMO, RTMW) without mmcv, mmpose, mmdet etc.

Python 285 35 Updated Jan 25, 2025
Python 297 34 Updated Dec 19, 2024
Python 470 54 Updated Aug 22, 2024

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 4,561 402 Updated Jan 22, 2025

Robust Speech Recognition via Large-Scale Weak Supervision

Python 76,055 9,088 Updated Jan 4, 2025

one-click face swap

Python 29,276 6,606 Updated Aug 19, 2024

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,293 1,310 Updated Feb 11, 2025

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,552 70 Updated Aug 15, 2024
Python 93 9 Updated Sep 5, 2023

[NeurIPS 2024] VideoTetris: Towards Compositional Text-To-Video Generation

Python 214 6 Updated Nov 4, 2024

A generative speech model for daily dialogue.

Python 34,261 3,706 Updated Jan 25, 2025