45 results for “topic:mlsys”
AISystem mainly refers to AI systems, covering the full low-level AI stack: AI chips, AI compilers, and AI inference and training frameworks
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
ComfyUI Plugin of Nunchaku
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
A model compilation solution for various hardware
FedScale is a scalable and extensible open-source federated learning (FL) platform.
Measure and optimize the energy consumption of your AI applications!
A curated collection of noteworthy MLSys bloggers (algorithms/systems)
[Survey] Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Machine Learning Framework for Operating Systems - Brings ML to Linux kernel
🤖 FFPA: Extends FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
An acceleration library supporting quantization operations with arbitrary bit-width combinations
A scalable & efficient active learning/data selection system for everyone.
Optimal Sparse Decision Trees
Materials for my 2021 NYU class on NLP and ML Systems (Master of Engineering).
Federated Learning Systems Paper List
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
Accelerating AI Training and Inference from Storage Perspective (Must-read Papers on Storage for AI)
[ICLR 2025] TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
GraphSnapShot: Caching Local Structure for Fast Graph Learning [Efficient ML System]
An Open-Source RAG Workload Trace to Optimize RAG Serving Systems
Efficient Foundation Model Design: A Perspective From Model and System Co-Design [Efficient ML System & Model]
A Serving System for Distributed and Parallel LLM Quantization [Efficient ML System]
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]