172 results for “topic:sparsity”
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
PyTorch native quantization and sparsity for training and inference
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
PaddleSlim is an open-source library for deep model compression and architecture search.
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Neural Network Compression Framework for enhanced OpenVINO™ inference
Network Slimming (PyTorch) (ICCV 2017)
A more readable and flexible YOLOv5 with additional backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.), add-on modules (CBAM, DCN, and so on), and TensorRT support
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
Caffe for Sparse and Low-rank Deep Neural Networks
An innovative library for efficient LLM inference via low-bit quantization
Reference ImageNet implementation of the SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also includes code for pruning the model based on the implicit sparsity emerging from adaptive gradient descent methods, as detailed in the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks".
Sparse Optimisation Research Code
Always sparse. Never dense. But never say never. A sparse-training repository for the Adaptive Sparse Connectivity concept and its algorithmic instantiation, Sparse Evolutionary Training, to boost Deep Learning scalability in several respects (e.g., memory and computational-time efficiency, representation and generalization power).
FasterAI: Prune and Distill your models with FastAI and PyTorch
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
Sparse and structured neural attention mechanisms
Sparse inference for transformer-based LLMs
A Python library for gene–environment interaction analysis via deep learning
Learning both Weights and Connections for Efficient Neural Networks, https://arxiv.org/abs/1506.02626 (see the magnitude-pruning sketch after this list)
A research library for PyTorch-based neural network pruning, compression, and more.
Zero-label image classification via OpenCLIP knowledge distillation
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× vs cuBLAS
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Soft Threshold Weight Reparameterization for Learnable Sparsity (see the soft-thresholding sketch after this list)
Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
Sparse Recurrent Neural Networks -- Pruning Connections and Hidden Sizes (TensorFlow)
Fast operator-overloading Jacobian & Hessian sparsity pattern detection.
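
Two of the entries above name techniques concrete enough that a short sketch clarifies them. First, for "Learning both Weights and Connections for Efficient Neural Networks" (arXiv:1506.02626), here is a minimal one-shot magnitude-pruning sketch in PyTorch. The single global threshold, the restriction to Linear/Conv2d layers, and the omission of the paper's retrain-and-iterate loop are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> nn.Module:
    """One-shot global magnitude pruning (sketch, not the full Han et al. pipeline)."""
    # Collect |w| from all prunable layers to pick one global threshold.
    prunable = [m for m in model.modules() if isinstance(m, (nn.Linear, nn.Conv2d))]
    all_w = torch.cat([m.weight.detach().abs().flatten() for m in prunable])
    # Threshold = k-th smallest |w|, so roughly `sparsity` of weights fall below it.
    k = max(1, int(sparsity * all_w.numel()))
    threshold = all_w.kthvalue(k).values
    with torch.no_grad():
        for m in prunable:
            mask = (m.weight.abs() > threshold).to(m.weight.dtype)
            m.weight.mul_(mask)  # zero out weights at or below the threshold
    return model
```

The paper follows pruning with retraining (and repeats the cycle); in practice the zeroed entries would also need a fixed mask during fine-tuning so they stay zero.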
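Second, for "Soft Threshold Weight Reparameterization for Learnable Sparsity" (STR), a minimal sketch of the core idea: the effective weight is sign(W) · relu(|W| − sigmoid(s)) with a learnable threshold s, so sparsity emerges during ordinary training rather than being imposed afterward. The per-layer scalar threshold, its initialization, and the plain linear host layer are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdLinear(nn.Module):
    """Linear layer whose effective weight is soft-thresholded (STR-style sketch)."""

    def __init__(self, in_features: int, out_features: int, s_init: float = -5.0):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Learnable threshold parameter; s_init is an illustrative assumption.
        self.s = nn.Parameter(torch.tensor(s_init))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def effective_weight(self) -> torch.Tensor:
        # Entries with |W| below sigmoid(s) become exactly zero;
        # the rest shrink toward zero by the same amount.
        return torch.sign(self.weight) * F.relu(self.weight.abs() - torch.sigmoid(self.s))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.effective_weight(), self.bias)
```

Because the threshold is a parameter, weight decay on s pushes the layer toward higher sparsity during training, which is the mechanism that makes the sparsity level learnable rather than hand-tuned.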