Top Repositories
Code release for Hu et al., Learning to Segment Every Thing. in CVPR, 2018.
Code release for Hu et al. Learning to Reason: End-to-End Module Networks for Visual Question Answering. in ICCV, 2017
Compact Bilinear Pooling in TensorFlow
Code release for Fried et al., Speaker-Follower Models for Vision-and-Language Navigation. in NeurIPS, 2018.
Code release for Hu et al. Natural Language Object Retrieval, in CVPR, 2016
Code release for Hu et al., Language-Conditioned Graph Networks for Relational Reasoning. in ICCV, 2019
Repositories
Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016
Code release for Fried et al., Speaker-Follower Models for Vision-and-Language Navigation. in NeurIPS, 2018.
Unofficial Python API for Google NotebookLM
verl: Volcano Engine Reinforcement Learning for LLMs
No description provided.
No description provided.
🚀 Efficient implementations of state-of-the-art linear attention models
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
Code release for Hu et al., Learning to Segment Every Thing. in CVPR, 2018.
Code release for Hu et al. Learning to Reason: End-to-End Module Networks for Visual Question Answering. in ICCV, 2017
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
Caffe: a fast open framework for deep learning.
Compact Bilinear Pooling in TensorFlow
See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md
Serve, optimize and scale PyTorch models in production
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
HOTA (and other) evaluation metrics for Multi-Object Tracking (MOT).
Code release for Hu et al., Language-Conditioned Graph Networks for Relational Reasoning. in ICCV, 2019
Code release for Hu et al. Modeling Relationships in Referential Expressions with Compositional Modular Networks. in CVPR, 2017
No description provided.
Code release for Hu et al. Natural Language Object Retrieval, in CVPR, 2016
Code release for Hu et al., Explainable Neural Computation via Stack Neural Module Networks. in ECCV, 2018
Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_feature.py
Sanguosha EX: An Open Source PC Game Based on Popular Desktop Game "Sanguosha"
Profiling analyses and comparisons between PyTorch/XLA and JAX
Enabling PyTorch on Google TPU
A simple but well-performing "single-hop" visual attention model for the GQA dataset
A list of examples for model scaling in PyTorch/XLA