Repos
31
Stars
26
Forks
0
Top Language
Python
Loading contributions...
Top Repositories
Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression
Running large language models like OPT-175B/GPT-3 on a single GPU. Focusing on high-throughput generation.
Course project for EE3001 Machine Learning
Serving multiple LoRA finetuned LLM as one
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming..
Repositories
31No description provided.
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming..
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression
NanoGPT (124M) in 5 minutes
No description provided.
Several simple extensions that add some nifty features to Python
Running large language models like OPT-175B/GPT-3 on a single GPU. Focusing on high-throughput generation.
A MAD laboratory to improve AI architecture designs 🧪
Sky-T1: Train your own O1 preview model within $450
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
No description provided.
FlashInfer: Kernel Library for LLM Serving
Blockchain dark forest selfguard handbook. Master these, master the security of your cryptocurrency.
Princeton University - COS/ECE 470 : Principles of Blockchains
Course project for EE3001 Machine Learning
Fast and memory-efficient exact attention
A generic framework for on-demand, incrementalized computation. Inspired by adapton, glimmer, and rustc's query system.
Datalog compiler embedded in Rust as a procedural macro
A massively parallel, high-level programming language
No description provided.
Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting" (ICLR 2024 Spotlight), https://openreview.net/forum?id=JePfAI8fah
A robot control program based on STM32F103ZET6.
Redesigned course project for Compiler Principle 2023 Fall
No description provided.
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Minimalist ML framework for Rust
Serving multiple LoRA finetuned LLM as one
Empowering everyone towards next generation AI and software.
A massively parallel, optimal functional runtime in Rust