gpu-sync Lite

This repository contains code for generating a shared library that intercepts calls to the NVIDIA CUDA 6 runtime library and redirects them to your own locking code.

How to use:

1. Generate CUDA function stubs with the GPUSync/Wrapper/make_stubs.py program. See the comments in that program for instructions.
2. Find the CUDA calls that should trigger locking and insert locking code (described in Wrapper/GPU_Locks.h). We recommend:
   i. cudaLaunch(): acquire lock, run kernel, release lock.
   ii. All variants of cudaMemcpy(): acquire lock, do memcpy, release lock.
   iii. All async variants of i and ii: acquire lock, do operation.
   iv. cudaStreamSynchronize(), cudaDeviceSynchronize(): release last acquired lock.
3. Generate the library with GPUSync/Wrapper/wrap_generate.py.
4. Compile the library:
   i. Use the makefile in GPUSync/Locks/Kernel_Locks/ to use a kernel module.
   ii. Use the makefile in GPUSync/Locks/POSIX_Locks/ to use a runtime implementation.
5. Use the locks:
   i. Compile your CUDA 6 program to link the CUDA runtime library dynamically rather than statically, by passing the --cudart shared flag to nvcc.
   ii. Load the interception library: LD_PRELOAD=/path/to/libcudart_wrapper.so ./cuda_program
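
The recommended locking pattern can be sketched roughly as follows. This is an illustration only, not the code produced by wrap_generate.py: the helper names (gpu_lock_acquire, gpu_lock_release, the wrapped_* functions) are hypothetical and are not the interface declared in Wrapper/GPU_Locks.h, a single process-wide pthread mutex is assumed, and the fake_* functions are stand-ins for the real CUDA runtime calls so that the sketch compiles without CUDA installed.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Assumption: one process-wide lock guards the GPU. */
static pthread_mutex_t gpu_mutex = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread flag: does this thread currently hold the lock? */
static _Thread_local int lock_held = 0;

/* Hypothetical helper. Returns 1 if the lock was newly taken,
 * 0 if this thread already held it. */
static int gpu_lock_acquire(void) {
    if (lock_held) return 0;
    pthread_mutex_lock(&gpu_mutex);
    lock_held = 1;
    return 1;
}

/* Hypothetical helper. Releases the lock if this thread holds it. */
static void gpu_lock_release(void) {
    if (!lock_held) return;
    lock_held = 0;
    pthread_mutex_unlock(&gpu_mutex);
}

/* Stand-ins for the real runtime calls (cudaMemcpy, cudaMemcpyAsync,
 * cudaDeviceSynchronize). They copy on the host and return 0. */
static int fake_memcpy(void *dst, const void *src, size_t n) {
    memcpy(dst, src, n);
    return 0;
}
static int fake_memcpy_async(void *dst, const void *src, size_t n) {
    memcpy(dst, src, n);
    return 0;
}
static int fake_synchronize(void) { return 0; }

/* Synchronous call: acquire lock, do the operation, release lock. */
static int wrapped_memcpy(void *dst, const void *src, size_t n) {
    int took = gpu_lock_acquire();
    int rc = fake_memcpy(dst, src, n);
    if (took) gpu_lock_release();
    return rc;
}

/* Async call: acquire lock, do the operation, keep holding the lock. */
static int wrapped_memcpy_async(void *dst, const void *src, size_t n) {
    gpu_lock_acquire();
    return fake_memcpy_async(dst, src, n);
}

/* Synchronization call: wait, then release the lock taken earlier. */
static int wrapped_synchronize(void) {
    int rc = fake_synchronize();
    gpu_lock_release();
    return rc;
}
```

In this sketch an async wrapper leaves the lock held until the next synchronization call, which matches the recommendation that cudaStreamSynchronize and cudaDeviceSynchronize release the last acquired lock. A real wrapper would forward to the actual runtime functions instead of the stand-ins.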

UgurKap/gpu-sync