Perkz Zheng
PerkzZheng
Currently working as an AI Technology Developer Engineer @ NVIDIA
Repos: 10 · Stars: 0 · Forks: 1 · Top Language: Python
Repositories
FlashInfer: Kernel Library for LLM Serving
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
CUDA Templates for Linear Algebra Subroutines
No description provided.
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Transformer related optimization, including BERT, GPT
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Run MNIST inference in Apache Flink
This repository contains a compilation of selected topics.