GitHunt

LiYu Lu

luliyucoordinate

Pytorch/TensorFlow/CUDA/HPC/more

Hangzhou


Languages

C++ 43%, Python 29%, Cuda 24%, C 5%


Top Repositories

Repositories

45
luliyucoordinate/cute-flash-attention

Implement Flash Attention using Cute.

Cuda1028Updated 1 year ago
luliyucoordinate/Leetcode

Play Leetcode with different programming language

C++1.5k477Updated 2 years ago
c, cpp, go, java, javascript, leetcode, rust
luliyucoordinate/flash-attention-minimal (Fork)

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda100Updated 1 year ago
luliyucoordinate/myos

No description provided.

C++7517Updated 4 years ago
luliyucoordinate/daily_stock_analysis (Fork)

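LLM-driven intelligent analyzer for A-share and H-share stocks: multi-source market data + real-time news + Gemini decision dashboard + multi-channel push notifications; zero cost, entirely free to run, executes on a schedule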

00Updated 1 month ago
luliyucoordinate/FPN_pytorch

Implement FPN with pytorch

Python6639Updated 7 years ago
luliyucoordinate/StockTradebyZ (Fork)

No description provided.

00Updated 9 months ago
luliyucoordinate/HunyuanDiT (Fork)

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

00Updated 1 year ago
luliyucoordinate/mynet

No description provided.

C++208Updated 4 years ago
luliyucoordinate/Awesome-Cute (Fork)

No description provided.

00Updated 1 year ago
luliyucoordinate/native-sparse-attention (Fork)

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python00Updated 1 year ago
luliyucoordinate/eval_voc

eval voc data use python

Python1210Updated 7 years ago
luliyucoordinate/DeepSpeed (Fork)

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python00Updated 2 years ago
luliyucoordinate/YHs_Sample (Fork)

Yinghan's Code Sample

Cuda10Updated 1 year ago
luliyucoordinate/CUDA-GEMM-Optimization (Fork)

CUDA Matrix Multiplication Optimization

Cuda10Updated 1 year ago
luliyucoordinate/cutlass (Fork)

CUDA Templates for Linear Algebra Subroutines

C++00Updated 1 year ago
luliyucoordinate/e-book (Fork)

E-books on a wide range of topics

10Updated 6 years ago
luliyucoordinate/TensorRT-LLM (Fork)

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

C++00Updated 1 year ago
luliyucoordinate/tiny-triton

No description provided.

10Updated 1 year ago
luliyucoordinate/CoreFusionGEMM

No description provided.

00Updated 1 year ago
luliyucoordinate/ThunderKittens (Fork)

Tile primitives for speedy kernels

00Updated 1 year ago
luliyucoordinate/Python2048

Use python to implement 2048 game

Python58Updated 5 years ago
luliyucoordinate/cute-gemm (Fork)

No description provided.

C++00Updated 2 years ago
luliyucoordinate/tensorflow (Fork)

An Open Source Machine Learning Framework for Everyone

C++00Updated 3 years ago
luliyucoordinate/recommenders-addons (Fork)

Additional utils and helpers to extend TensorFlow when build recommendation systems, contributed and maintained by SIG Recommenders.

Cuda00Updated 3 years ago
luliyucoordinate/mynet-test

No description provided.

C++02Updated 4 years ago
luliyucoordinate/HP-CPP

No description provided.

C++00Updated 4 years ago
luliyucoordinate/YOLOv2-pytorch

Implement YOLOv2 with pytorch

Python41Updated 8 years ago
luliyucoordinate/play-linux

No description provided.

C11Updated 6 years ago
luliyucoordinate/ebook (Fork)

classic books of computer science!

11Updated 6 years ago
