2,090 results for “topic:inference”
A high-throughput and memory-efficient inference and serving engine for LLMs
Port of OpenAI's Whisper model in C/C++
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Making large AI models cheaper, faster and more accessible
Cross-platform, customizable ML solutions for live and streaming media.
SGLang is a high-performance serving framework for large language models and multimodal models.
ncnn is a high-performance neural network inference framework optimized for mobile platforms
Faster Whisper transcription with CTranslate2
Machine Learning Engineering Open Book
🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Nano vLLM
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
Large Language Model Text Generation Inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Production ready toolkit to run AI locally
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
Supercharge Your LLM with the Fastest KV Cache Layer
💎 1MB lightweight face detection model
Runtime type system for IO decoding/encoding
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
On-device Speech Recognition for Apple Silicon
CSGHub is an open-source platform for managing LLMs, developed by the OpenCSG team. It offers both open-source and on-premise/SaaS deployments, with features comparable to Hugging Face. Gain full control over the lifecycle of LLMs, datasets, and agents, with a Python SDK compatible with Hugging Face. Join us! ⭐️
Superduper: End-to-end framework for building custom AI applications and agents.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
cube studio: an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform. Covers the full MLOps pipeline: compute rental, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, an annotation platform with automated labeling, SFT fine-tuning / reward modeling / reinforcement learning for large models such as DeepSeek, multi-node large-model inference via vllm/ollama/mindie, private knowledge bases, and an AI model marketplace. Supports domestic CPUs/GPUs/NPUs (Ascend ecosystem), RDMA, and distributed frameworks including pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/ray/volcano.