Advanced Intelligent Machines (AIM)

aim-uofa

A research team at Zhejiang University, focusing on Computer Vision and broad AI research ...

China

Languages

Python 92%, Jupyter Notebook 4%, JavaScript 4%

Top Repositories

AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

3.5k · Python

AdelaiDepth

This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape', which address monocular depth estimation and 3D scene reconstruction from a single image.

1.1k · Python

Matcher

[ICLR'24 & IJCV'25] Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching

551 · Python

Framer

[ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation".

502 · Python

MovieDreamer

[ICLR'25] MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences

322

Diception

[NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception

297 · Python

Repositories

aim-uofa/GVM

[SIGGRAPH2025] Generative Video Matting

Python · 614 · Updated 1 day ago

generative-model, video-matting

aim-uofa/Poseur

[ECCV 2022] The official repo for the paper "Poseur: Direct Human Pose Regression with Transformers".

Python · 18415 · Updated 1 day ago

coco-wholebody, human-pose-estimation, human36m, vision-transformers

aim-uofa/Tinker

One-shot and Few-shot 3D Editing without Per-Scene Optimization

16611 · Updated 2 days ago

aim-uofa/DiffewS

[NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews)

Python · 495 · Updated 3 days ago

aim-uofa/AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

Python · 3.5k · 655 · Updated 4 days ago

abcnet, adelaidet, blendmask, boxinst, condinst, densecl, fcos, instance-segmentation, meinst, object-detection, ocr, solo, solov2, text-detection, text-recognition

aim-uofa/dLLM-MidTruth

[ICLR'26] Official PyTorch implementation of "Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models".

Python · 623 · Updated 5 days ago

aim-uofa/AutoStory

[IJCV'24] AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort

Jupyter Notebook · 1495 · Updated 5 days ago

aim-uofa/GenDeF

No description provided.

Python · 390 · Updated 5 days ago

aim-uofa/Diception

[NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception

Python · 297 · 11 · Updated 5 days ago

depth-estimaiton, diffusion-model, generative-model, multi-task-learning, normal-estimation, segmentation

aim-uofa/PM-Loss

[3DV 2026] Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting

Python · 1587 · Updated 5 days ago

aim-uofa/BA-DDG

[ICLR 2025 Spotlight] Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Python · 441 · Updated 5 days ago

aim-uofa/Omni-R1

[NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Python · 1165 · Updated 6 days ago

grpo, mllms, neurips-2025, omnimodal, rl

aim-uofa/AdelaiDepth

This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape', which address monocular depth estimation and 3D scene reconstruction from a single image.

Python · 1.1k · 149 · Updated 1 week ago

3d-scene-shape, depth-prediction

aim-uofa/StyleDrop-PyTorch

This is an unofficial PyTorch implementation of StyleDrop: Text-to-Image Generation in Any Style.

Python · 22615 · Updated 1 week ago

aim-uofa/MovieDreamer

[ICLR'25] MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences

322 · 11 · Updated 1 week ago

aim-uofa/DyCo3D

No description provided.

Python · 12821 · Updated 1 week ago

aim-uofa/Active-o3

ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

791 · Updated 1 week ago

active-perception, active-vision, grpo, mllms, o3, rl, thinking-with-image

aim-uofa/LoRAPrune

No description provided.

Python · 637 · Updated 2 weeks ago

aim-uofa/Matcher

[ICLR'24 & IJCV'25] Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching

Python · 551 · 41 · Updated 2 weeks ago

dinov2, generalist-model, in-context-segmentation, matcher, sam

aim-uofa/SegAgent

[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Python · 923 · Updated 2 weeks ago

agent, mllms, segment-anything, vlms

aim-uofa/Framer

[ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation".

Python · 502 · 31 · Updated 2 weeks ago

aim-uofa/FrozenRecon

[ICCV2023] 🧊FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models

Python · 1316 · Updated 2 weeks ago

3d-reconstruction, 3d-scene-reconstruction, monocular-depth-estimation, pose-estimation

aim-uofa/aim-uofa.github.io

code for aim-uofa.github.io

JavaScript · 74 · Updated 2 weeks ago

aim-uofa/EvoTokenDLM

EvoToken-DLM (Beyond Hard Masks: Progressive Token Evolution for Diffusion Language)

Python · 270 · Updated 2 weeks ago

aim-uofa/SurfaceSplat

SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting

560 · Updated 3 weeks ago

aim-uofa/StaMo

Unsupervised Learning of Generalizable Robot Motion from Compact State Representation

Python · 350 · Updated 1 month ago

diffusion-models, embodied-ai, robotics

aim-uofa/PerturboLLaVA

No description provided.

Python · 170 · Updated 1 month ago

aim-uofa/model-quantization (Fork)

Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)

458 · Updated 1 month ago

aim-uofa/ConvNova

No description provided.

Python · 131 · Updated 1 month ago

aim-uofa/GenPercept

[ICLR2025] GenPercept: Diffusion Models Trained with Large Data Are Transferable Visual Models

Python · 2198 · Updated 1 month ago

depth-estimation, dichotomous-image-segmentation, human-pose-estimation, iclr2025, image-matting, monocular-depth-estimation, one-step, semantic-segmentation, surface-normals
