2,758 results for “topic:hpc”
The Julia Programming Language
Making large AI models cheaper, faster and more accessible
A Cloud Native Batch System (Project under CNCF)
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
ArrayFire: a general purpose GPU library.
A DSL for data-driven computational pipelines
Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
A data-parallel functional programming language
BLAS-like Library Instantiation Software Framework
Singularity has been renamed to Apptainer as part of moving the project to the Linux Foundation. This repo is preserved as a snapshot taken right before the changes.
Open MPI main development repository
CUDA Core Compute Libraries
Lightweight, general, scalable C++ library for finite element methods
A Productive Parallel Programming Language
MooseFS Distributed Storage – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System / Software-Defined Storage
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, and Assembly, from numerics and SIMD to coroutines, ranges, exception handling, networking, and user-space IO
Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!
Apptainer: Application containers for Linux
Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
A basic user tool to execute simple Docker containers in batch or interactive systems without root privileges.
A C++ GPU Computing Library for OpenCL
SU2: An Open-Source Suite for Multiphysics Simulation and Design
The development repository for the deal.II finite element library
Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
An efficient C++20 GPU numerical computing library with Python-like syntax
Primary repository for the Trilinos Project
Expressive Vector Engine - SIMD in C++ Goes Brrrr
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or close to the theoretical limit.
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)