"topic:ptx" — Search

53 results for “topic:ptx”

Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO

C++ · 1.9k · 81 · Updated 2 months ago

assembly assembly-language avx512 benchmark coroutines cpp cpp-programming cpp17 cpp20 cuda gcc google-benchmark hpc io-uring linux-kernel llvm ptx ranges tutorial tutorials

m4rs-mt/ILGPU

ILGPU JIT Compiler for high-performance .Net GPU programs

C# · 1.7k · 139 · Updated 3 days ago

amd cil compiler cpu cuda dotnet gpgpu gpgpu-computing gpu ilgpu intel jit kernels msil nvidia opencl parallel ptx

tpoisonooo/how-to-optimize-gemm

row-major matmul optimization

C++ · 710 · 95 · Updated 3 weeks ago

arm64 armv7 cuda cuda-kernel gemm-optimization int4 ptx vulkan

coderonion/awesome-cuda-and-hpc

🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.

460 · 41 · Updated 7 months ago

awesome blas cublas cuda cudnn cutlass deepseek gemm gpu hpc llama llm mlir ptx pytorch tensorrt tensorrt-llm triton tvm vlm

SunsetQuest/CudaPAD

CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly.

C# · 127 · 18 · Updated 3 years ago

cuda cuda-programming gpu nvidia ptx ptx-utils windows

zamaudio/ptformat

Free software file format parser for Avid ProTools sessions

C++ · 87 · 18 · Updated 3 months ago

ardour interoperability protools ptf ptx session

deciding/txl

TeraXLang - Triton Extension for LLM. As fast as FlashAttention FlashMLA, etc.

C++ · 68 · 1 · Updated 3 days ago

compiler dsl mlir ptx triton

Energinet-SimTools/MTB

Energinets Model Testbench. Automate gridcompliance studies in PSCAD and Powerfactory.

Python · 57 · 16 · Updated 4 days ago

dcc generator green-transition gridcompliance high-voltage hvdc power-electronics power2x powerfactory powergrid powersystem-simulation powersystems pscad ptx renewable-energy rfg solar-energy wind-energy

ProjectPhysX/PTXprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

C++ · 57 · 7 · Updated 1 year ago

cuda gpu gpu-acceleration gpu-computing gpu-programming hpc nvidia nvidia-cuda nvidia-gpu opencl profiler ptx ptx-utils roofline-model sycl

seekbytes/ptxNinja

Binary Ninja plugin for reverse engineering PTX -- the virtual instruction set architecture of CUDA-based GPUs.

Rust · 46 · 2 · Updated 2 weeks ago

binaryninja cuda decompilation ptx seekbytes-ptxninja-41d6b9de

bikrammajhi/100-days-of-GPU

This is my 🔥 100 Days of GPU — a wild, hands-on journey through CUDA/CUTLASS kernels, Triton spells, and PTX sorcery.

HTML · 36 · 5 · Updated 5 days ago

cuda cutlass mojo nsight-compute ptx thunderkittens triton

danielcamposramos/Knowledge3D

Web knowledge is fragmented — duplicated across fonts, embeddings, metadata, and renderings. Humans see pixels, AI sees tokens, neither shares the source. Knowledge3D: a sovereign GPU-native reference implementation for W3C PM-KR, where humans and AI consume the same procedural knowledge from one source.

Python · 34 · 5 · Updated 12 hours ago