GitHunt

Jake Hemstad

jrhemstad

@NVIDIA Lead for CUDA Core Compute Libraries (CCCL). CUDA at the speed-of-(de)light.

@NVIDIA
Minneapolis, MN

Languages

C++ 43%, CMake 21%, Shell 14%, Cuda 7%, Assembly 7%, Python 7%

Repos: 52
Stars: 36
Forks: 7
Top Language: C++


Top Repositories

jrhemstad/clang-include-graph (Fork)

Simple tool for analyzing C++ project include graph

Stars: 0 | Forks: 0 | Updated 1 week ago

jrhemstad/cccl (Fork)

CUDA C++ Core Libraries

C++ | Stars: 0 | Forks: 1 | Updated 1 week ago

jrhemstad/resource-stream (Fork)

GPU programming related news and material links

Stars: 1 | Forks: 0 | Updated 6 months ago

jrhemstad/cuda_scalar_result

Answering "What is the faster way to return a single scalar from a kernel to host?"

CMake | Stars: 9 | Forks: 1 | Updated 3 years ago

jrhemstad/example_cuda_benchmark

Template repository for CUDA enabled benchmarks using Google Benchmark

CMake | Stars: 9 | Forks: 2 | Updated 4 years ago

jrhemstad/discord-cluster-manager (Fork)

Write a fast kernel and run it on Discord. See how you compare against the best!

Stars: 0 | Forks: 0 | Updated 10 months ago

jrhemstad/cuda-samples (Fork)

Samples for CUDA Developers which demonstrate features in CUDA Toolkit

Stars: 1 | Forks: 0 | Updated 1 year ago

jrhemstad/two_largest

Adventure in profiling and optimization.

C++ | Stars: 6 | Forks: 1 | Updated 3 years ago

jrhemstad/accelerated-computing-hub (Fork)

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Stars: 0 | Forks: 0 | Updated 1 year ago

jrhemstad/jrhemstad

No description provided.

Stars: 0 | Forks: 0 | Updated 1 year ago

jrhemstad/llm.c (Fork)

LLM training in simple, raw C/CUDA

Stars: 0 | Forks: 0 | Updated 1 year ago

jrhemstad/github-markdown (Fork)

No description provided.

Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/test_workflow_failure

No description provided.

Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/cutlass (Fork)

CUDA Templates for Linear Algebra Subroutines

Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/creduce-example

Examples on how to use C-Reduce to create minimal compiler bug reproducers

Shell | Stars: 1 | Forks: 0 | Updated 5 years ago

jrhemstad/cuda-quantum (Fork)

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/devcontainers (Fork)

No description provided.

Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/libcudacxx (Fork)

The NVIDIA C++ Standard Library

Stars: 0 | Forks: 0 | Updated 3 years ago

jrhemstad/cub (Fork)

Cooperative primitives for CUDA C++.

Cuda | Stars: 0 | Forks: 0 | Updated 3 years ago

jrhemstad/stdexec (Fork)

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

Stars: 0 | Forks: 0 | Updated 3 years ago

jrhemstad/cuda_arch_odr

No description provided.

Shell | Stars: 4 | Forks: 0 | Updated 3 years ago

jrhemstad/.github (Fork)

No description provided.

Stars: 0 | Forks: 0 | Updated 3 years ago

jrhemstad/link_test

Testing linkage of function local statics

C++ | Stars: 1 | Forks: 1 | Updated 4 years ago

jrhemstad/nvtx_wrappers

This repository is deprecated and the code has moved to the official NVIDIA NVTX github repository: https://github.com/NVIDIA/NVTX

C++ | Stars: 2 | Forks: 0 | Updated 4 years ago

jrhemstad/thrust (Fork)

Thrust is a C++ parallel programming library which resembles the C++ Standard Library.

C++ | Stars: 0 | Forks: 0 | Updated 3 years ago

jrhemstad/gil_preload

Add NVTX ranges to Python GIL

C++ | Stars: 0 | Forks: 0 | Updated 4 years ago

jrhemstad/nvbench (Fork)

CUDA Kernel Benchmarking Library

Stars: 0 | Forks: 0 | Updated 4 years ago

jrhemstad/compiler-explorer (Fork)

Run compilers interactively from your web browser and interact with the assembly

Assembly | Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/infra (Fork)

Infrastructure to set up the public Compiler Explorer instances and compilers

Python | Stars: 0 | Forks: 0 | Updated 2 years ago

jrhemstad/cuda_random_memory

Benchmarks for sequential and random memory accesses to global memory

CMake | Stars: 2 | Forks: 1 | Updated 5 years ago
