Terry Kong

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.

Python · 0 · 0 · Updated 10 months ago

terrykong/testing-github-things

No description provided.

0 · 0 · Updated 10 months ago

terrykong/NeMo-Aligner (Fork)

Scalable toolkit for efficient model alignment

Python · 0 · 0 · Updated 11 months ago

terrykong/lm-evaluation-harness (Fork)

A framework for few-shot evaluation of language models.

0 · 0 · Updated 1 year ago

terrykong/TensorRT-LLM (Fork)

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

C++ · 0 · 0 · Updated 1 year ago

terrykong/grain (Fork)

No description provided.

0 · 0 · Updated 2 years ago

terrykong/NeMo-Megatron-Launcher (Fork)

NeMo Megatron launcher and tools

Python · 0 · 0 · Updated 2 years ago

terrykong/xla (Fork)

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ · 0 · 0 · Updated 2 years ago

terrykong/testing-actions

No description provided.

0 · 0 · Updated 2 years ago

terrykong/action-workflow-run-wait (Fork)

wait for all `workflow_run` required workflows to be successful

0 · 0 · Updated 2 years ago

terrykong/lingvo (Fork)

Lingvo

Python · 0 · 0 · Updated 2 years ago

terrykong/flax (Fork)

Flax is a neural network library for JAX that is designed for flexibility.

0 · 0 · Updated 1 year ago

terrykong/tqdm (Fork)

A Fast, Extensible Progress Bar for Python and CLI

0 · 0 · Updated 2 years ago

terrykong/praxis (Fork)

No description provided.

Python · 0 · 0 · Updated 2 years ago

terrykong/t5x (Fork)

No description provided.

Python · 0 · 0 · Updated 2 years ago

terrykong/paxml (Fork)

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry leading model flop utilization rates.

0 · 0 · Updated 2 years ago

terrykong/tfrecord-browser

Read your tfrecord files from the command line

Python · 8 · 2 · Updated 6 years ago

command-line-tool · tensorflow · tfrecords

terrykong/JAX-Toolbox (Fork)

JAX-Toolbox

Shell · 0 · 0 · Updated 2 years ago

terrykong/webdataset (Fork)

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

0 · 0 · Updated 3 years ago
