100 results for “topic:openacc”
This repository consists of GPU bootcamp material for HPC and AI
This is a set of simple programs that can be used to explore the features of a parallel platform.
Abstraction Library for Parallel Kernel Acceleration :llama:
Python extension language using accelerators
STREAM, for lots of devices written in many programming models
Exascale multiphase flow solver — 2025 Gordon Bell Prize Finalist | 200T grid points on 43K+ GPUs
LargeScale Multiphysics Scientific Simulation Environment-OneFLOW CFD
GPUFORT: S2S translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
Training materials provided by OpenACC.org.
N-Ways to GPU Programming Bootcamp
Simple OpenACC Fortran Examples
A solver for the coupled and decoupled electron and phonon Boltzmann transport equations.
Interoperability examples for OpenACC.
POT3D: High Performance Potential Field Solver
CLAW Compiler for Performance Portability
High Performance Computing Strategies for Boundary Value Problems
RTM
OpenACC* to OpenMP* API assisting migration tool
The sources for the OpenACC Programming and Best Practices Guide.
Code examples for CUDA and OpenACC
Multi-GPU code for interface-resolved simulations of multiphase turbulence (triply periodic box and channel). DNS of Navier-Stokes equations coupled with a phase-field method (ACDI) for interface description. The cuDecomp library is used for parallelization.
Profiling with NVIDIA Nsight Tools Bootcamp
OpenACC for Python
Fortran UNified Device Acceleration Library
Case studies constitute a modern interdisciplinary and valuable teaching practice which plays a critical and fundamental role in the development of new skills and the formation of new knowledge. This research studies the behavior and performance of two interdisciplinary and widely adopted scientific kernels, a Fast Fourier Transform and Matrix Multiplication. Both routines are implemented in the two currently most popular many-core programming models, CUDA and OpenACC. A Fast Fourier Transform (FFT) samples a signal over a period of time and divides it into its frequency components, computing the Discrete Fourier Transform (DFT) of a sequence. Unlike the traditional approach to computing a DFT, FFT algorithms reduce the complexity of the problem from O(n²) to O(n log₂ n). Matrix multiplication is a cornerstone routine in Mathematics, Artificial Intelligence and Machine Learning. This research also shows that the nature of the problem plays a crucial role in determining which many-core model will provide the highest benefit in performance.
Immersed Boundary method fast Test Facility based on OpenACC
The repository containing everything you need to compete in the IHPCSS 2019 programming challenge.
jacobi - a benchmark that solves the 2D Laplace equation with the Jacobi iterative method. Runs on a GPU or Xeon Phi.
Traveling salesman problem solved with different programming models