Shauheen
shauheen
Senior Staff Machine Learning Engineering Manager
Languages
Repos
52
Stars
2
Forks
0
Top Language
Python
Top Repositories
A list of awesome compiler projects and papers for tensor computation and deep learning.
An Open Source Machine Learning Framework for Everyone
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
PyTorch per step fault tolerance (actively under development)
PyTorch Single Controller
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
Repositories
52
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
PyTorch per step fault tolerance (actively under development)
PyTorch Single Controller
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
A profiling and performance analysis tool for machine learning
Enabling PyTorch on Google TPU
No description provided.
AMD's graph optimization engine.
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
Development repository for the Triton language and compiler
Source code for the book "Quantum Computing for Programmers", Cambridge University Press
A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across different frameworks.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Generate snapshots and rankings of monthly committer and issue/PR activity
Karras et al. (2022) diffusion models for PyTorch
A community-driven and modular open source compiler.
A C++ standalone library for machine learning
Hummingbird compiles trained ML models into tensor computation for faster inference.
A list of awesome compiler projects and papers for tensor computation and deep learning.
.NET bindings for the PyTorch engine
An Open Source Machine Learning Framework for Everyone
"Multi-Level Intermediate Representation" Compiler Infrastructure
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
Google Cloud TPU Utilization Bar for Training Models
Babysit your preemptible TPUs
This repository contains example code to build models on TPUs
Testing framework for Deep Learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU)
Julia on TPUs
This is the hub for all the projects that are part of the .NET Foundation. MD files in the projects folder feed the content on the .NET Foundation website