Repos
95
Stars
3.1k
Forks
261
Top Language
Python
Top Repositories
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
A Survey on multimodal learning research.
A Survey on Transformer in CV.
[FG 2021🎈] A small-scale face image dataset with large-scale facial attributes for text-to-face generation and manipulation.
A Survey on AI in the beauty industry.
[ECCV 2024 Workshop🎈] The first agriculture benchmark to evaluate MM-LLMs.
Repositories
95
[ECCV 2024 Workshop🎈] The first agriculture benchmark to evaluate MM-LLMs.
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
A Survey on multimodal learning research.
A Survey on Transformer in CV.
A curated list of Survey Papers on Deep Learning.
[FG 2021🎈] A small-scale face image dataset with large-scale facial attributes for text-to-face generation and manipulation.
Advice for Paper Writing.
[ECAI 2025 Workshop] From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations
📺 A place to discover the latest machine learning courses on YouTube.
A Survey on AI in the beauty industry.
No description provided.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
This is an investigation and collection of the Segment Anything Model (SAM) on real-world applications.
[ACMMM 2022🎈] The first manually annotated synthetic bento dataset for novel aesthetic box lunch presentation design.
DeepEP: an efficient expert-parallel communication library
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
No description provided.
A generative world for general-purpose robotics & embodied AI learning.
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
✨✨ Latest Advances on Multimodal Large Language Models
Awesome Dataset Distillation Papers
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral].
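Agricultural Knowledge Graph (AgriKG): information retrieval, named entity recognition, relation extraction, intelligent question answering, and decision support for the agriculture domain
Firefly: a large-model training tool that supports training Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models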
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive architecture."
A collection of resources and papers on Diffusion Models and Score-based Models, a dark horse in the field of Generative Models