1,108 results for “topic:diffusion”
Stable Diffusion web UI
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
SGLang is a high-performance serving framework for large language models and multimodal models.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
An easy 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and see the generated image.
Using Low-rank adaptation to quickly fine-tune diffusion models.
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.
"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Your Friendly Open-Source Gen-AI Platform
Stable diffusion for real-time music generation
🪩 Create Disco Diffusion artworks in one line
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
A framework for efficient model inference with omni-modality models
Core Engine of Singing Voice Conversion & Singing Voice Clone
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Kandinsky 2 — multilingual text2image latent diffusion model
ComfyUI Plugin of Nunchaku
Stable diffusion for real-time music generation (web app)
🔥 [ICCV 2025 Highlight] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
[CSUR] A Survey on Video Diffusion Models
Lumina-T2X is a unified framework for Text to Any Modality Generation
A collection of diffusion model papers categorized by their subareas
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!
Diffusion Models in Medical Imaging (Published in Medical Image Analysis Journal)