68 results for “topic:numa”
LvLLM is a special NUMA extension of vLLM that makes full use of CPU and memory resources, reduces GPU memory requirements, and features an efficient GPU-parallel and NUMA-parallel architecture, supporting hybrid inference for large MoE models.
Python Multi-Process Execution Pool: a concurrent asynchronous execution pool with custom resource constraints (memory, timeouts, affinity, CPU cores, and caching), load balancing, and profiling of external apps on NUMA architectures
A community-oriented list of useful NUMA-related libraries, tools, and other resources
Multi-core Window-Based Stream Processing Engine
RAM Coffers: Conditional Memory via NUMA-Distributed Weight Banking - O(1) lookup routing for LLM inference (Dec 16, 2025 - predates DeepSeek Engram by 27 days)
NUMAPROF is a NUMA memory profiler based on Pintool to track your remote memory accesses.
Texel chess engine
Rust bindings to Open MPI Portable Hardware Locality "hwloc" library, covering version 2.0 and above.
AltiVec/VSX optimized llama.cpp for IBM POWER8
Lsglang is a special extension of SGLang that fully utilizes CPU and GPU computing resources with an efficient GPU-parallel + NUMA-parallel architecture, suitable for hybrid inference of MoE models.
Data Plane Development Kit (DPDK) integration into OpenWrt
Go package providing information about the number of CPUs in the system
NumaMMA is a lightweight memory profiler for parallel applications
Local-affinity-first NUMA-aware allocator with optional fallback.
NUMA-aware multi-CPU multi-GPU data transfer benchmarks
cgroups-based cpuset isolator and resource estimator modules for Mesos
Non-unix, custom-API hybrid OS kernel written in C++ which can be thought of as an emulated microkernel. The native API is almost fully asynchronous and the kernel is aimed at high-scaling, high-throughput-requiring multiprocessor workloads, with working support for SMP and NUMA already implemented. Join the IRC channel, #zbz-dev on freenode!
numad for Debian/Ubuntu
A repo to allow validation of performance results in the knor paper and provide a fast, scalable k-means implementation.
A LLaMA2-7b chatbot with memory, running on CPU and optimized using smooth quantization, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.
Pterodactyl Docker Images
HPC Affinity Tracker (HPCAT) is designed to showcase NUMA, CPU core, NIC and GPU affinities in the context of High Performance Computing (HPC) applications.
NUMA-aware GPU provisioning and orchestration for MoE workloads of all sizes - *Claude Code native*
A benchmark framework for POWER and x86_64
Multi-processor extensions for .NET
An anisotropic mesh adaptation library designed for non-uniform memory access multicore and manycore nodes.
Black-box Concurrent Data Structures for NUMA Architectures
A low-level Rust binding for libnuma, giving Rust access to NUMA functionality on Linux supercomputers
Lightweight Trace Profiler for Multi/Manycore Systems