GitHunt

Kun Wu

K-Wu

Making the Stack Data-Efficient, Composable & Scalable!⚓@NVIDIA Backend Compiler Engineer⚓PhD (@illinois-impact)⚓BEng (Tsinghua)

@NVIDIA
Santa Clara

Languages

Python 31% · TeX 10% · Jupyter Notebook 10% · C++ 7% · Matlab 7% · Verilog 7% · HTML 3% · Cuda 3% · JavaScript 3% · VHDL 3%

Top Repositories

Repositories (138)
K-Wu/OneKE (Fork)

[WWW 2025] A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System.

HTML · 0 · 0 · Updated 23 hours ago
K-Wu/CV-tsinghua-template

All hail, Thy Highest University (THU)

TeX4812Updated 1 day ago
latex-template, resume-template, template, templates, tsinghua
K-Wu/taxes-2018 (Fork)

Fills out forms for 2018 tax returns.

Python · 0 · 0 · Updated 2 days ago
K-Wu/FlashTrain

An Activation Offloading Framework to SSDs for Faster Large Language Model Training

Python · 6 · 1 · Updated 1 week ago
hooks, interposition, llm-training, offloading, pytorch, ssd
K-Wu/Qidian_Webnovel_DataCollection (Fork)

No description provided.

Jupyter Notebook · 0 · 0 · Updated 1 month ago
K-Wu/jjwxc-crawler (Fork)

A simple tool, built with Python and Scrapy, that scrapes non-V chapters of any novel from jjwxc.net. Given a book ID, it downloads the chapters and generates an editable Word (.docx) document.

Python · 0 · 0 · Updated 1 month ago
K-Wu/QidianCrawler

A novel crawler built on the DrissionPage library that scrapes novel content from Qidian (起点中文网). It uses the Rich library to provide rich output.

Python · 3 · 1 · Updated 3 months ago
K-Wu/cuMemcpy

[In progress] Performant memcpy within a CUDA kernel

Cuda · 0 · 0 · Updated 3 months ago
K-Wu/Megatron-DeepSpeed (Fork)

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python · 1 · 1 · Updated 8 months ago
K-Wu/intrasm_engine

Enhancing CUDA Intra-Streaming-Multiprocessor Parallelism for Large Language Models via Fine-Grained Task Graph

Jupyter Notebook · 0 · 0 · Updated 8 months ago
cuda-graph, cupy, instruction-level-parallelism, llm, peft, pytorch, sparse-matrix, transformer
K-Wu/pytorch-direct_dgl

Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB)

444Updated 8 months ago
cuda, deep-learning, dgl, gnn, gpu, pytorch, vldb
K-Wu/K-Wu.github.io (Fork)

A beautiful, simple, clean, and responsive Jekyll theme for academics

JavaScript · 3 · 3 · Updated 8 months ago
K-Wu/HET

HET: The HET Hetero-GNN Kernel Optimization and Code Generation Project

C++ · 5 · 0 · Updated 9 months ago
codegen, compilers, cuda, deep-learning, dgl, gnn, gpu, libtorch, pytorch
K-Wu/Stanford-Machine-Learning-Course (Fork)

Machine learning course programming exercises

Matlab · 1 · 0 · Updated 11 months ago
K-Wu/tensorflow-Tutorial

No description provided.

Jupyter Notebook · 1 · 0 · Updated 11 months ago
K-Wu/Digital-Circuit-Experiment

No description provided.

Verilog · 1 · 0 · Updated 11 months ago
K-Wu/mips_converter

Convert MIPS assembly to machine code

Python · 3 · 1 · Updated 11 months ago
K-Wu/Jpeg-toy

No description provided.

Matlab · 1 · 0 · Updated 11 months ago
K-Wu/esp (Archived)

ECE 527 Course Project

VHDL · 1 · 0 · Updated 11 months ago
K-Wu/mips-pipeline-final

No description provided.

Verilog · 1 · 0 · Updated 11 months ago
K-Wu/UIUCThesisClean

No description provided.

TeX · 0 · 0 · Updated 11 months ago
K-Wu/cs526_report (Archived)

No description provided.

TeX · 0 · 0 · Updated 12 months ago
K-Wu/douban_movie_review (Fork)

Crawler for Douban Top 250 movie reviews (used as a corpus for sentiment analysis)

Python · 0 · 0 · Updated 1 year ago
K-Wu/pytorch-direct

Code for Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB). The outdated write-up (https://arxiv.org/abs/2101.07956) explains engineering details, but only a portion of the functionality has been migrated to this newer PyTorch version, 1.8.0 nightly (e152ca5).

C++ · 9 · 4 · Updated 1 year ago
deep-learning, gnn, libtorch, neural-network, pytorch
K-Wu/cs526_mp2 (Archived)

No description provided.

LLVM · 0 · 0 · Updated 1 year ago
K-Wu/llm-analysis (Fork)

Latency and Memory Analysis of Transformer Models for Training and Inference

Python · 0 · 0 · Updated 1 year ago
customization, endurance, llm-inference, llm-training, pcie, performance-modeling-and-analysis, ssd
K-Wu/IGB-Datasets (Fork)

Largest real-world open-source graph dataset. Work done under the IBM-Illinois Discovery Accelerator Institute and Amazon Research Awards, in collaboration with NVIDIA Research.

Python · 0 · 0 · Updated 1 year ago
K-Wu/ThesisMisc

No description provided.

R · 0 · 0 · Updated 1 year ago
K-Wu/GpuCpuApi (Fork)

List GPU CPU

PHP · 0 · 0 · Updated 1 year ago
K-Wu/ext2-5.0

ext2 Linux File System Extracted from torvalds/linux v5.0

C · 1 · 0 · Updated 1 year ago
