Repositories (30)

alpamayo

Eagle
Eagle: Frontier Vision-Language Models with Data-Centric Strategies

Fast-FoundationStereo
[CVPR 2026] Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching

gbrl
Gradient Boosting Reinforcement Learning (GBRL)

FoundationStereo
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching

describe-anything
[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning

GTRS

Long-RL
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)

neuralangelo
Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)

ToolOrchestra
ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.

cvdp_benchmark

GatedDeltaNet
[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule

LongLive
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation

LoRWeB
We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enabling flexible and generalizable image edits based on example transformations.

GR00T-WholeBodyControl
Welcome to GR00T Whole-Body Control (WBC)! This is a unified platform for developing and deploying advanced humanoid controllers. This includes: Decoupled WBC models used in NVIDIA Isaac-Gr00t, Gr00t N1.5 and N1.6 and GEAR-SONIC

FoundationPose
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more

CARI4D
[CVPR 2026] CARI4D: Category Agnostic 4D Reconstruction of Human-Object Interaction

rcm
[ICLR 2026] rCM: SOTA JVP-Based Diffusion Distillation & Few-Step Video Generation & Scaling Up sCM/MeanFlow & Real-Time Autoregressive Video Diffusion

Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

WarpConvNet
Make your wildest 3D ConvNet dream architectures come true

MobilityGen
Data Generation Pipeline for Mobility

DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

cosmos-policy
Cosmos Policy

stylegan
StyleGAN - Official TensorFlow Implementation

FastGen
NVIDIA FastGen: Fast Generation from Diffusion Models

GDPO
Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

curobo
CUDA Accelerated Robot Library

neural-robot-dynamics
[CoRL 2025] Neural Robot Dynamics