Olivier Chafik
ochafik
Pragmatic programmer with some idealism left; Views expressed are my own; 🏳️‍🌈
Top Repositories
OpenSCAD Web Playground
A minimalistic C++ Jinja templating engine for LLM chat templates
A lightweight, lightning-fast, in-process vector database
A lightweight inference engine supporting self-speculative decoding (SSD).
Context7 MCP Server -- Up-to-date documentation for LLMs and AI code editors
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Repositories
A lightweight inference engine supporting self-speculative decoding (SSD).
A curated list of 450+ awesome open-source alternatives to proprietary SaaS. Deployment configs, self-hosting guides, and tool directory.
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
Homebrew Tap for llama.cpp
LM Studio Apple MLX engine
Access Google Workspace when using Gemini CLI
OpenSCAD Web Playground
Port of Facebook's LLaMA model in C/C++
Context7 MCP Server -- Up-to-date documentation for LLMs and AI code editors
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active.
ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows.
A minimalistic C++ Jinja templating engine for LLM chat templates
This is MCP server for Claude that gives it terminal control, file system search and diff file editing capabilities
🧱 MCP App for making my brick dreams a reality
Inference RWKV with multiple supported backends.
Run LLMs with MLX
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embedding.
A lightweight, lightning-fast, in-process vector database
🤖 WebMCP
🚀 The fast, Pythonic way to build MCP servers and clients
Official repo for the spec & SDK of the MCP Apps protocol -- a standard for UIs embedded in AI chatbots, served by MCP servers
The fastest JSON schema Validator. Supports JSON Schema draft-04/06/07/2019-09/2020-12 and JSON Type Definition (RFC8927)
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
My personal Claude assistant that runs in Apple containers. Lightweight, secure, and built to be understood and customized for your own needs.
MCP for (webtop) dockerized desktop environments (chrome devtools + debugger + computer use)
Playwright MCP server
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
OpenAI-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s.