"topic:hpc" — Search

Singularity has been renamed to Apptainer as part of us moving the project to the Linux Foundation. This repo has been persisted as a snapshot right before the changes.

Go · 2.6k · 426 · Updated 3 hours ago

cloud-native, container, containers, hpc, linux, parallel, portability, portable, reproducible, reproducible-science, rootless-containers, science, singularity, singularity-container

open-mpi/ompi

Open MPI main development repository

C · 2.5k · 948 · Updated 1 hour ago

c, fortran, hacktoberfest, hpc, mpi, openmpi

NVIDIA/cccl

CUDA Core Compute Libraries

C++ · 2.2k · 352 · Updated just now

accelerated-computing, cpp, cpp-programming, cuda, cuda-cpp, cuda-kernels, cuda-library, cuda-programming, gpu, gpu-acceleration, gpu-computing, gpu-programming, hpc, modern-cpp, nvidia, nvidia-gpu, parallel-algorithm, parallel-computing, parallel-programming

mfem/mfem

Lightweight, general, scalable C++ library for finite element methods

C++ · 2.1k · 604 · Updated 1 day ago

amr, computational-science, fem, finite-elements, high-order, high-performance-computing, hpc, math-physics, parallel-computing, radiuss, scientific-computing

chapel-lang/chapel

a Productive Parallel Programming Language

Chapel · 2.0k · 445 · Updated 1 day ago

chapel, compiler, concurrency, distributed-computing, gpu, high-performance-computing, hpc, language, open-source, parallel, parallel-computing, performance, productive, programming-language, scientific-computing

moosefs/moosefs

MooseFS Distributed Storage – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System / Software-Defined Storage

C · 1.9k · 236 · Updated 3 days ago

backup, block-storage, distributed-computing, distributed-file-storage, distributed-storage, ditributed-systems, erasure-coding, file-storage, file-systems, filesystem, fuse-filesystem, high-availability, hpc, hpc-cluster, hpc-storage, posix, posix-compliant, snapshots, software-defined-storage, storage

ashvardanian/less_slow.cpp

Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO

C++ · 1.9k · 81 · Updated 1 week ago

assembly, assembly-language, avx512, benchmark, coroutines, cpp, cpp-programming, cpp17, cpp20, cuda, gcc, google-benchmark, hpc, io-uring, linux-kernel, llvm, ptx, ranges, tutorial, tutorials

AdaptiveCpp/AdaptiveCpp

Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!

C++ · 1.8k · 212 · Updated 2 hours ago

adaptivecpp, compiler, gpgpu, gpu-computing, high-performance, high-performance-computing, hipsycl, hpc, opensycl, stdpar, sycl

apptainer/apptainer

Apptainer: Application containers for Linux

Go · 1.8k · 172 · Updated just now

apptainer, containers, hpc, linux, rootless-containers, science, singularity, singularity-container

DTolm/VkFFT

Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library

C++ · 1.7k · 129 · Updated 4 days ago

c2r, convolution, cuda, dct, fft, hip, hpc, levelzero, metal, opencl, r2c, r2r, vulkan

indigo-dc/udocker

A basic user tool to execute simple docker containers in batch or interactive systems without root privileges.

Python · 1.7k · 164 · Updated 1 day ago

batch, chroot, containers, deep-hybrid-datacloud, docker, docker-containers, emulation, eosc-hub, fakechroot, grid, hpc, indigo, proot, root-privileges, runc, user

boostorg/compute

A C++ GPU Computing Library for OpenCL

C++ · 1.6k · 340 · Updated 7 hours ago

boost, c-plus-plus, compute, cpp, gpgpu, gpu, hpc, opencl, performance

su2code/SU2

SU2: An Open-Source Suite for Multiphysics Simulation and Design

C++ · 1.6k · 945 · Updated 1 day ago

c-plus-plus, cfd, flow, fluid, fluid-dynamics, hpc, opensource, optimization, physics, python, simulation

dealii/dealii

The development repository for the deal.II finite element library

C++ · 1.6k · 838 · Updated 20 hours ago

c-plus-plus, fem, finite-elements, hpc, parallel-computing, scientific-computing

openucx/ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C · 1.6k · 532 · Updated 1 day ago

aries, c, c-plus-plus, cray, drivers, gemini, hacktoberfest, hpc, infiniband, iwarp, mpi, networking, openshmem, pgas, rdma, roce, shared-memory, shmem, tcp-ip, verbs

NVIDIA/MatX

An efficient C++20 GPU numerical computing library with Python-like syntax

C++ · 1.4k · 112 · Updated 1 day ago

cuda, gpgpu, gpu, gpu-computing, hpc

trilinos/Trilinos

Primary repository for the Trilinos Project

C++ · 1.4k · 609 · Updated 4 hours ago

c-plus-plus, high-performance-computing, hpc, hpsf, sandia-national-laboratories, scientific-computing, snl-science-libs, trilinos

jfalcou/eve

Expressive Vector Engine - SIMD in C++ Goes Brrrr

C++ · 1.3k · 66 · Updated 4 days ago

aarch64, altivec, arm-sve, arm-sve2, avx, avx2, avx512, cpp, cpp-library, hpc, neon, simd, simd-library, simd-parallelism, simd-programming, sse2, ssse3

Liu-xiandong/How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

Cuda · 1.2k · 178 · Updated 2 days ago

elementwise, gpu-acceleration, high-performance-computing, hpc, reduce, sgemm, sgemv

uccl-project/uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ · 1.2k · 127 · Updated 13 hours ago

ai, allreduce, amd, broadcom, collective, cuda, gpu, hpc, kvcache, llm, moe, networking, nvidia, p2p, rdma
