Repos
19
Stars
0
Forks
8
Top Language
Python
Repositories
19
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
No description provided.
A high-throughput and memory-efficient inference and serving engine for LLMs
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
oneAPI Technical Advisory Board (TAB) Meeting Notes
No description provided.
Enabling PyTorch on Google TPU
Tengine is a lite, high-performance, modular inference engine for embedded devices
A performant and modular runtime for TensorFlow
OpenVINO™ Toolkit repository
Stores documents used by the TensorFlow developer community
No description provided.
a test
NumPy-like API accelerated with CUDA
Computation using data flow graphs for scalable machine learning
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Implement algorithms (sort, query) and benchmark them
A flexible framework of neural networks for deep learning
Easy benchmarking of all publicly accessible implementations of convnets