"topic:numa" — Search

68 results for “topic:numa”

LvLLM is a NUMA extension of vLLM that makes full use of CPU and memory resources, reduces GPU memory requirements, and combines GPU parallelism with NUMA parallelism to support hybrid inference for large MoE models.

Python · 252 · 23 · Updated 12 hours ago

cpu decode gpu hybrid inference model moe numa parallelism prefill vllm

eXascaleInfolab/PyExPool

Python Multi-Process Execution Pool: a concurrent asynchronous execution pool with custom resource constraints (memory, timeouts, affinity, CPU cores and caching), load balancing, and profiling of external apps on NUMA architectures

Python · 168 · 12 · Updated 11 months ago

application-framework benchmarking-framework cache-control execution-pool in-memory-computations load-balancing monitoring-server multiprocessing numa parallel-computing parallel-processing task-queue

domargan/awesome-numa

A community-oriented list of useful NUMA-related libraries, tools, and other resources

77 · 14 · Updated 5 days ago

awesome-list multiprocessing multithreading non-uniform-memory-access numa numa-aware numa-benchmarks numa-systems shared-memory

lsds/LightSaber

Multi-core Window-Based Stream Processing Engine

C++ · 73 · 19 · Updated 5 months ago

aggregation compression cpp incremental-computation libaio llvm multi-core numa rdma sliding-windows ssd stream-processing

Scottcjn/ram-coffers

RAM Coffers: Conditional Memory via NUMA-Distributed Weight Banking - O(1) lookup routing for LLM inference (Dec 16, 2025 - predates DeepSeek Engram by 27 days)

C · 57 · 16 · Updated 1 hour ago

ai cognitive-computing cpu-inference first-timers-only good-first-issue hacktoberfest llama-cpp llm memory-management neuromorphic numa power8 ppc64le ram

memtt/numaprof

NUMAPROF is a NUMA memory profiler based on Pintool to track your remote memory accesses.

C++ · 52 · 8 · Updated 4 weeks ago

instrumentation memory numa profiler

peterosterlund2/texel

Texel chess engine

C++ · 49 · 4 · Updated 1 week ago

android chess-engine cluster cmake cpp14 linux mpi numa smp windows

HadrienG2/hwlocality

Rust bindings to the Open MPI Portable Hardware Locality "hwloc" library, covering version 2.0 and above.

Rust · 45 · 7 · Updated 8 hours ago

cache ffi-bindings hardware-support hwloc locality memory-management numa os

Scottcjn/llama-cpp-power8

AltiVec/VSX optimized llama.cpp for IBM POWER8

C · 40 · 12 · Updated 1 hour ago

ai altivec cpu-inference first-timers-only ggml good-first-issue hacktoberfest ibm llama-cpp llm machine-learning numa power8 powerpc ppc64le vsx

guqiong96/Lsglang

Lsglang is an extension of SGLang that fully utilizes CPU and GPU computing resources with an efficient GPU-parallel plus NUMA-parallel architecture, suitable for hybrid inference of MoE models.

Python · 38 · 4 · Updated 16 hours ago

cpu decode gpu hybird inference model moe numa parallelism prefill sglang

k13132/openwrt-dpdk

Data Plane Development Kit (DPDK) integration into OpenWrt

Makefile · 33 · 14 · Updated 6 days ago

dpdk dpdk-driver kernel-module numa openwrt openwrt-feed openwrt-package

tklauser/numcpus

Go package providing information about the number of CPUs in the system

Go · 30 · 8 · Updated 1 week ago

bsd cpu cputopology go golang linux numa offline online unix

numamma/numamma

NumaMMA is a lightweight memory profiler for parallel applications

C · 30 · 12 · Updated 7 months ago

memory numa pebs profile

bastion-rs/numanji

Local-affinity-first NUMA-aware allocator with optional fallback.

Rust · 29 · 1 · Updated 4 months ago

allocator globalallocator mmap numa numa-aware rust

c3sr/comm_scope

NUMA-aware multi-CPU multi-GPU data transfer benchmarks

C++ · 28 · 3 · Updated 2 weeks ago

bandwidth benchmark-suite cuda gpu hip numa nvlink performance

numap-library/numap

No description provided.

C · 21 · 14 · Updated 1 year ago

memory numa pebs profile

ct-clmsn/mesos-cpusets

cgroups-based cpuset isolator and resource estimator modules for Mesos

C++ · 19 · 3 · Updated 1 year ago

cloud cloud-computing hardware-topology mesos numa

latentPrion/zambesii

Non-Unix, custom-API hybrid OS kernel written in C++ that can be thought of as an emulated microkernel. The native API is almost fully asynchronous, and the kernel targets highly scalable, high-throughput multiprocessor workloads, with working support for SMP and NUMA already implemented. Join the IRC channel, #zbz-dev on freenode!

C++ · 19 · 3 · Updated 2 weeks ago

c-plus-plus hybrid-kernel kernel non-uniform-memory-access numa operating-system operating-system-kernel os smp symmetric-multiprocessing udi uniform-driver-interface

yhaenggi/numad

numad for Debian/Ubuntu

C · 17 · 8 · Updated 2 months ago

daemon numa numad

flashxio/knor

A repo to allow validation of performance results in the knor paper and provide a fast, scalable k-means implementation.

C++ · 15 · 4 · Updated 1 month ago

algorithm cluster distributed-computing external-memory kmeans-clustering numa streaming

aahouzi/llama2-chatbot-cpu

A LLaMA2-7b chatbot with memory that runs on CPU and is optimized using smooth quantization, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.

Python · 15 · 0 · Updated 1 month ago

4-bit-cpu bfloat16 chatbot chatbot-memory chatgpt cpu huggingface int8 intel ipex langchain llama llama2 meta meta-ai neural-compression numa optimization smooth-quantization streamlit

vanes430/java

Pterodactyl Docker Images

Dockerfile · 11 · 1 · Updated 2 days ago

docker docker-image folia folia-supported minecraft-server numa numa-aware papermc pterodactyl pterodactyl-docker pterodactyl-docker-images pterodactyl-egg

HewlettPackard/hpcat

HPC Affinity Tracker (HPCAT) is designed to showcase NUMA, CPU core, NIC and GPU affinities in the context of High Performance Computing (HPC) applications.

C · 10 · 2 · Updated 5 months ago

affinity binding gpu hpc locality numa performance

theoddden/Terradev

NUMA-aware GPU provisioning and orchestration for MoE workloads of all sizes - *Claude Code native*

Python · 10 · 1 · Updated 3 hours ago