gpu-sync Lite
This repository contains code for generating a shared library that intercepts calls to the NVIDIA CUDA 6 runtime library and redirects them to your own locking code.
How to use:
- Generate CUDA function stubs through the GPUSync/Wrapper/make_stubs.py program. See program comments for instructions.
- Find CUDA calls that you would like to trigger locking and insert locking code (described in Wrapper/GPU_Locks.h). We recommend:
  i. cudaLaunch(): acquire lock, run kernel, release lock.
  ii. all variants of cudaMemcpy(): acquire lock, do memcpy, release lock.
  iii. all async variants of i and ii: acquire lock, do operation.
  iv. cudaStreamSynchronize, cudaDeviceSynchronize: release last acquired lock.
- Generate the library via GPUSync/Wrapper/wrap_generate.py.
- Compile the library:
  - Use the makefile in GPUSync/Locks/Kernel_Locks/ if you would like to use a kernel module.
  - Use the makefile in GPUSync/Locks/POSIX_Locks/ if you would like to use a runtime implementation.
- Use the locks:
  - Compile your CUDA 6 program to link the CUDA library at runtime, not statically. This is achieved with the --cudart shared flag to nvcc.
  - Use the interception library:
    LD_PRELOAD=/path/to/libcudart_wrapper.so ./cuda_program
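
The recommended locking pattern for synchronous calls, async calls, and synchronization calls can be sketched roughly as follows. This is an illustration only, not the code this repository generates: the mutex, the gpu_op_fn type, and the wrap_* and fake_gpu_op names are all hypothetical stand-ins, and the real locking interface is the one described in Wrapper/GPU_Locks.h. The sketch also assumes that a thread already holding the lock from an earlier async call should not try to acquire it a second time.

```c
/* Sketch only: every name below is hypothetical. The actual locking
   interface is the one described in Wrapper/GPU_Locks.h. */
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for a GPUSync lock: one process-wide mutex. */
static pthread_mutex_t gpu_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread flag: does the calling thread currently hold gpu_lock? */
static _Thread_local bool lock_held = false;

/* Stand-in for a call forwarded to the real CUDA runtime; 0 means success. */
typedef int (*gpu_op_fn)(void *arg);

/* Take the lock unless this thread already holds it (assumption: an
   earlier async call may have left it held). Returns true if the lock
   was newly acquired by this call. */
static bool acquire_gpu_lock(void) {
    if (lock_held) return false;
    pthread_mutex_lock(&gpu_lock);
    lock_held = true;
    return true;
}

/* Release the lock if this thread holds it; otherwise do nothing. */
static void release_gpu_lock(void) {
    if (!lock_held) return;
    lock_held = false;
    pthread_mutex_unlock(&gpu_lock);
}

/* Synchronous calls (cudaLaunch, cudaMemcpy variants):
   acquire lock, run the operation, release lock. */
int wrap_sync_op(gpu_op_fn real_op, void *arg) {
    bool acquired = acquire_gpu_lock();
    int status = real_op(arg);
    if (acquired) release_gpu_lock();
    return status;
}

/* Async variants: acquire lock, start the operation, and leave the
   lock held until a later synchronization call. */
int wrap_async_op(gpu_op_fn real_op, void *arg) {
    acquire_gpu_lock();
    return real_op(arg);
}

/* cudaStreamSynchronize / cudaDeviceSynchronize: wait for outstanding
   work, then release the last acquired lock. */
int wrap_synchronize(gpu_op_fn real_sync, void *arg) {
    int status = real_sync(arg);
    release_gpu_lock();
    return status;
}

/* Trivial stand-in operation: counts how many times it ran. */
int fake_gpu_op(void *arg) { ++*(int *)arg; return 0; }
```

In a real wrapper, the function pointer would be replaced by the forwarded CUDA runtime call, and the mutex by whichever lock implementation (kernel module or POSIX) the library was compiled with.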
Apache License 2.0
Created August 1, 2018
Updated August 1, 2018